[R] Reassign values based on multiple conditions

2013-03-15 Thread Cat Cowie
Hi all,

I have a simple data frame of three columns - one of numbers (really a
categorical variable), one of dates and one of data.

Imagine:

collar date data
1 01/01/2013 x
2 02/01/2013 y
3 04/01/2013 z
4 04/01/2013 a
5 07/01/2013 b


The 'collar' is a GPS collar that's been worn by an animal for a certain
amount of time, and may then have been worn by a different animal after the
batteries needed to be changed. When an animal was caught and the collar
battery needed changing, a whole new collar had to be put on, as these
animals (wild boar and red deer!) were not that easy to catch. In order to
follow the movements of each animal I now need to create a new column that
assigns the 'data' by animal rather than by collar. I have a table of
dates, e.g.

animal  collar  start_date   end_date
   1       1    01/01/2013   03/01/2013
   1       5    04/01/2013   06/01/2013
   1       3    07/01/2013   09/01/2013
   2       2    01/01/2013   03/01/2013
   2       1    04/01/2013   06/01/2013

I have so far been able to make multi-conditional tests:

animal1test  <- (date >= "01/01/13" & date <= "03/01/13")
animal1test2 <- (date >= "04/01/13" & date <= "06/01/13")
animal2test  <- (date >= "04/01/13" & date <= "06/01/13")

to use in an 'if else' formula:

if (animal1test) {
  collar[1] = animal1
} else if (animal1test2) {
  collar[5] = animal1
} else if (animal2test) {
  collar[1] = animal2
} else NA

As I'm sure you can see, this is completely inelegant, and also not working
for me! Any ideas on how to achieve this?
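
For reference, one possible approach is to merge on collar and then keep only
the rows whose date falls inside the deployment interval (a rough sketch;
'obs' and 'lookup' are hypothetical names for the two tables above, and
dd/mm/yyyy date strings are assumed):

obs$date          <- as.Date(obs$date, format = "%d/%m/%Y")
lookup$start_date <- as.Date(lookup$start_date, format = "%d/%m/%Y")
lookup$end_date   <- as.Date(lookup$end_date,   format = "%d/%m/%Y")

m <- merge(obs, lookup, by = "collar")                   # each observation paired with every deployment of its collar
m <- m[m$date >= m$start_date & m$date <= m$end_date, ]  # keep the deployment whose interval contains the date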

Thanks SO much in advance,
Cat

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] column and line graphs in R

2013-03-15 Thread Marc Girondot
When you send data, use dput() to send them. It is much easier for the
people who want to help you.
Here is an example. I am not sure if it is what you want but you can 
play with the code.

Sincerely

Marc

fungal <- structure(list(rel.abund = c(0.003, 0.029, 0.033, 0.023, 0.009,
0.042, 0.069, 0.059, 0.034, 0.049, 0.084, 0.015, 0.059, 0.032,
0.142, 0.031, 0.034, 0.01, 0.011, 0.004, 0.034, 0.182), rel.freq = c(0.083,
0.167, 0.167, 0.083, 0.083, 0.25, 0.083, 0.167, 0.083, 0.083,
0.333, 0.083, 0.083, 0.167, 0.25, 0.083, 0.083, 0.083, 0.083,
0.083, 0.333, 0.417)), .Names = c("rel.abund", "rel.freq"),
class = "data.frame", row.names = c("MOTU2",
"MOTU4", "MOTU6", "MOTU7", "MOTU9", "MOTU11", "MOTU14", "MOTU16",
"MOTU17", "MOTU18", "MOTU19", "MOTU20", "MOTU21", "MOTU22", "MOTU23",
"MOTU24", "MOTU25", "MOTU29", "MOTU30", "MOTU33", "MOTU36", "MOTU34"
))

premar <- par("mar")
par(mar=c(5,4,4,4)+0.1)

plot(fungal[,1], type="h", lwd=20, lend=2, bty="n", xlab="",
ylab="Relative abundance", xaxt="n", ylim=c(0,0.2))

par(xpd=TRUE)
segments(-2.5, 0.01, -2.5, 0.03, lwd=20, lend=2, col="black")

par(new=TRUE)
plot(fungal[,2], type="p", bty="n", pch=16, col="red", axes=FALSE,
xlab="", ylab="", main="", ylim=c(0,0.5))

axis(1, at=1:length(rownames(fungal)), labels=rownames(fungal), las=2)
axis(4)
mtext("Relative frequency", side=4, line=3)
points(25.6, 0.1, pch=16, col="red")

par(mar=premar)


On 14/03/13 15:40, Gian Maria Niccolò Benucci wrote:

Hi again,

Thank you all for your support. I would love to have a graph in which two
variables are shown together. For example a histogram and a curve would be
the perfect choice. I tried to use twoord.plot() but I am not sure I
understand how to manage the arguments lx, ly, rx, ry... Anyway
these are my data:


nat_af

       rel.abund rel.freq
MOTU2      0.003    0.083
MOTU4      0.029    0.167
MOTU6      0.033    0.167
MOTU7      0.023    0.083
MOTU9      0.009    0.083
MOTU11     0.042    0.250
MOTU14     0.069    0.083
MOTU16     0.059    0.167
MOTU17     0.034    0.083
MOTU18     0.049    0.083
MOTU19     0.084    0.333
MOTU20     0.015    0.083
MOTU21     0.059    0.083
MOTU22     0.032    0.167
MOTU23     0.142    0.250
MOTU24     0.031    0.083
MOTU25     0.034    0.083
MOTU29     0.010    0.083
MOTU30     0.011    0.083
MOTU33     0.004    0.083
MOTU36     0.034    0.333
MOTU34     0.182    0.417

First column is the relative abundance of the given MOTU and second column
is the relative frequency of the same MOTU.
Thank you very much in advance,




--
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Difficulty with UNIQUE

2013-03-15 Thread Barry King
I need to extract labels from Excel input data to use as dimnames later on.
I can successfully read the Excel data into three matrices:

capacity <- read.csv("c:\\R\\data\\capacity.csv")
price.lookup <- read.csv("c:\\R\\data\\price lookup.csv")
sales <- read.csv("c:\\R\\data\\sales.csv")

The values to be used as dimnames are duplicated in the matrices.

For example, I would like to create

dimnames(out.table)[[3]] <- c("a", "b", "c")

by not explicitly entering the first three letters of the alphabet but by
something
like

dimnames(out.table)[[3]] <- pl.names

but I cannot generate unique values with

 pl.names <- unique(with(price.lookup, list(Price_Line)))
 pl.names
[[1]]
  [1] a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
a c c c c c c c
 [44] c c c c c c c c c c c c c c c c c c c c c c c c c c c c c b b b b b b
b b b b b b b b
 [87] b b b b b b b b b b b b b b b b b b b b b b
Levels: a b c

Can someone please suggest how I can grab "a", "b", "c" from

with(price.lookup, list(Price_Line))

?

Thank you,


-- 
__
*Barry E. King, Ph.D.*
Director of Retail Operations
Qualex Consulting Services, Inc.
barry.k...@qlx.com
O: (317)940-5464
M: (317)507-0661
__

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] phyper returning zero

2013-03-15 Thread elliott harrison
Hi,
I am attempting to use phyper to test the significance of two overlapping
lists. I keep getting a zero and wondered whether that indicates
non-significance of my overlap or a p-value too small to calculate?

overlap = 524
lista = 2784
totalpop = 54675
listb = 1296

phyper(overlap, lista, totalpop, listb,lower.tail = FALSE, log.p=F)
[1] 0

If I plug in some different values I get a p-value, but since zero is actually
lower, is the overlap significant, or, more likely, have I made a mistake in
using the function?
phyper(10, 100, 2, 100,lower.tail = FALSE, log.p=F)
[1] 2.582795e-12


Thanks

Elliott




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] column and line graphs in R

2013-03-15 Thread Gian Maria Niccolò Benucci
Thank you very much to you all, I'll play with the code and post mine once I
have tested it.
Cheers,


-- 
Gian




On 14 March 2013 16:27, John Kane jrkrid...@inbox.com wrote:


  The easiest way to supply data  is to use the dput() function.  Example
 with your file named testfile:
 dput(testfile)
 Then copy the output and paste into your email.  For large data sets, you
 can just supply a representative sample.  Usually,
 dput(head(testfile, 100)) will be sufficient.

 Generally speaking two y-axis scales are to be avoided if at all possible.
 Faceting is likely to give you better results although I see that the scale
 differences are annoyingly large. It is possible to plot the two facets of
 the graph independently in order to have two independent y-axes but it
 takes more work and may or may not be needed.

 Here is a possible approach based on ggplot2. You will probably have to
 install ggplot2 and reshape2 using install.packages(). Notice I've changed
 your variable names around and turned your data into a dataframe with the
 matrix row.names as another variable.

 ##===begin code==#

 library(reshape2)
  library(ggplot2)

   dat1 <- read.table(text="
 place   abund freq
 MOTU2   0.003 0.083
 MOTU4   0.029 0.167
 MOTU6   0.033 0.167
 MOTU7   0.023 0.083
 MOTU9   0.009 0.083
 MOTU11  0.042 0.250
 MOTU14  0.069 0.083
 MOTU16  0.059 0.167
 MOTU17  0.034 0.083
 MOTU18  0.049 0.083
 MOTU19  0.084 0.333
 MOTU20  0.015 0.083
 MOTU21  0.059 0.083
 MOTU22  0.032 0.167
 MOTU23  0.142 0.250
 MOTU24  0.031 0.083
 MOTU25  0.034 0.083
 MOTU29  0.010 0.083
 MOTU30  0.011 0.083
 MOTU33  0.004 0.083
 MOTU36  0.034 0.333
 MOTU34  0.182 0.417
 ", sep="", header=TRUE, stringsAsFactors=FALSE)
 str(dat1)

   dm1 <- melt(dat1, id = "place",
   variable.name="type", value.name="freq")
   str(dm1)

 # plot first alternative
   ggplot(dm1, aes(place, freq, colour = type, group = type )) +
 geom_line(group = 1) +
 facet_grid(type ~ . )
   # or plot second alternative.
   ggplot(dm1, aes(place, freq, colour = type, group = type )) +
 geom_line(group = 1) +
 facet_grid(. ~ type )

   ##end code===#


  -Original Message-
  From: gian.benu...@gmail.com
  Sent: Thu, 14 Mar 2013 15:40:53 +0100
  To: r-help@r-project.org
  Subject: Re: [R] column and line graphs in R
 
  Hi again,
 
  Thank you all for your support. I would love to have a graph in which two
  variables are shown together. For example a histogram and a curve would be
  the perfect choice. I tried to use twoord.plot() but I am not
  sure I understand how to manage the arguments lx, ly, rx, ry...
  Anyway
  these are my data:
 
  nat_af
         rel.abund rel.freq
  MOTU2      0.003    0.083
  MOTU4      0.029    0.167
  MOTU6      0.033    0.167
  MOTU7      0.023    0.083
  MOTU9      0.009    0.083
  MOTU11     0.042    0.250
  MOTU14     0.069    0.083
  MOTU16     0.059    0.167
  MOTU17     0.034    0.083
  MOTU18     0.049    0.083
  MOTU19     0.084    0.333
  MOTU20     0.015    0.083
  MOTU21     0.059    0.083
  MOTU22     0.032    0.167
  MOTU23     0.142    0.250
  MOTU24     0.031    0.083
  MOTU25     0.034    0.083
  MOTU29     0.010    0.083
  MOTU30     0.011    0.083
  MOTU33     0.004    0.083
  MOTU36     0.034    0.333
  MOTU34     0.182    0.417
 
  First column is the relative abundance of the given MOTU and second
  column
  is the relative frequency of the same MOTU.
  Thank you very much in advance,
 
  --
  Gian
 
 
  On 14 March 2013 14:51, John Kane jrkrid...@inbox.com wrote:
 
 
 
 http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
 
  You really need to read the  posting guide and supply some sample data
  at
  the very least.
 
  Here is about as simple minded a plot as R will do as an example however
 
   dat1 <- structure(list(abond = c(17L, 3L, 6L, 11L, 5L, 8L, 13L, 16L,
  15L, 2L), freq = c(17L, 14L, 7L, 13L, 19L, 5L, 3L, 20L, 9L, 10L
   )), .Names = c("abond", "freq"), row.names = c(NA, -10L),
  class = "data.frame")
 
 
plot(dat1$abond, col = "red")
lines(dat1$freq, col = "blue")
  John Kane
  Kingston ON Canada
 
 
  -Original Message-
  From: gian.benu...@gmail.com
  Sent: Thu, 14 Mar 2013 11:05:40 +0100
  To: r-help@r-project.org
  Subject: [R] column and line graphs in R
 
  Hi all,
 
  I would love to plot my data with R. I have abundance and frequency of
  fungal
  taxonomic data that should be plotted in the same graph. In Microsoft
  Excel
  is that possible but the graphic result is, as always, very poor. Is
  

Re: [R] Difficulty with UNIQUE

2013-03-15 Thread Blaser Nello
with(price.lookup, list(Price_Line)) is a list! Use

unique(unlist(with(price.lookup, list(Price_Line))))
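
Two shorter equivalents, assuming Price_Line is a factor column (the
"Levels: a b c" line in your output suggests it is):

pl.names <- levels(price.lookup$Price_Line)                # "a" "b" "c", in level order
# or only the values that actually occur, in order of first appearance:
pl.names <- as.character(unique(price.lookup$Price_Line))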

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Barry King
Sent: Freitag, 15. März 2013 09:34
To: r-help@r-project.org
Subject: [R] Difficulty with UNIQUE

I need to extract labels from Excel input data to use as dimnames later on.
I can successfully read the Excel data into three matrices:

capacity <- read.csv("c:\\R\\data\\capacity.csv")
price.lookup <- read.csv("c:\\R\\data\\price lookup.csv")
sales <- read.csv("c:\\R\\data\\sales.csv")

The values to be used as dimnames are duplicated in the matrices.

For example, I would like to create

dimnames(out.table)[[3]] <- c("a", "b", "c")

by not explicitly entering the first three letters of the alphabet but by 
something like

dimnames(out.table)[[3]] <- pl.names

but I cannot generate unique values with

 pl.names <- unique(with(price.lookup, list(Price_Line)))
 pl.names
[[1]]
  [1] a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a c 
c c c c c c  [44] c c c c c c c c c c c c c c c c c c c c c c c c c c c c c b b 
b b b b b b b b b b b b  [87] b b b b b b b b b b b b b b b b b b b b b b
Levels: a b c

Can someone please suggest how I can grab a, b, c from

(with(price.lookup, list(Price_Line))

?

Thank you,


--
__
*Barry E. King, Ph.D.*
Director of Retail Operations
Qualex Consulting Services, Inc.
barry.k...@qlx.com
O: (317)940-5464
M: (317)507-0661
__

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] phyper returning zero

2013-03-15 Thread R. Michael Weylandt
On Fri, Mar 15, 2013 at 8:52 AM, elliott harrison
e.harri...@epistem.co.uk wrote:
 Hi,
 I am attempting to use phyper to test the significance of two overlapping 
 lists. I keep getting a zero and wondered if that was determining 
 non-significance of my overlap or a p-value too small to calculate?

 overlap = 524
 lista = 2784
 totalpop = 54675
 listb = 1296

 phyper(overlap, lista, totalpop, listb,lower.tail = FALSE, log.p=F)
 [1] 0

If you set log.p = TRUE, you see that the _log_ of the desired value is
about -800, so it's likely simply too small to fit in an IEEE double.

In short, for all and any practical purposes, your p-value is zero.

Cheers,
MW

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] phyper returning zero

2013-03-15 Thread elliott harrison
Thanks Michael, I assumed as much but we know what that did.

Thanks again.

Elliott

-Original Message-
From: R. Michael Weylandt [mailto:michael.weyla...@gmail.com] 
Sent: 15 March 2013 09:29
To: elliott harrison
Cc: r-help@r-project.org
Subject: Re: [R] phyper returning zero

On Fri, Mar 15, 2013 at 8:52 AM, elliott harrison e.harri...@epistem.co.uk 
wrote:
 Hi,
 I am attempting to use phyper to test the significance of two overlapping 
 lists. I keep getting a zero and wondered if that was determining 
 non-significance of my overlap or a p-value too small to calculate?

 overlap = 524
 lista = 2784
 totalpop = 54675
 listb = 1296

 phyper(overlap, lista, totalpop, listb,lower.tail = FALSE, log.p=F) 
 [1] 0

If you set log.p = TRUE, you see that the _log_ of the desired value is about -800, so
it's likely simply too small to fit in an IEEE double.

In short, for all and any practical purposes, your p-value is zero.

Cheers,
MW



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] phyper returning zero

2013-03-15 Thread Martin Maechler
 eh == elliott harrison e.harri...@epistem.co.uk
 on Fri, 15 Mar 2013 08:52:36 + writes:

eh Hi,
eh I am attempting to use phyper to test the significance
eh of two overlapping lists. I keep getting a zero and
eh wondered if that was determining non-significance of my
eh overlap or a p-value too small to calculate?

well what do you guess?  (:-)

eh overlap = 524
eh lista = 2784
eh totalpop = 54675
eh listb = 1296

eh phyper(overlap, lista, totalpop, listb,lower.tail = FALSE, log.p=F)
eh [1] 0

Well, just *do* use  log.p=TRUE :

   phyper(overlap, lista, totalpop, listb,lower.tail = FALSE, log.p=TRUE)
  [1] -800.0408

so, indeed   P = exp(-800)  which is smaller than the smallest
positive number in double precision,
which by the way is available in R as

  .Machine$double.xmin
 [1] 2.225074e-308
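
For instance, exp(-800) already underflows to exactly zero:

   exp(-800) == 0
  [1] TRUE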

I'm pretty sure that I cannot think of a situation where it is
important to know that the more exact probability is around  
10^(-347.45)

   phyper(overlap, lista, totalpop, listb,lower.tail = FALSE, 
   log.p=TRUE) / log(10)
 [1] -347.4533

rather than to know that it is very very very small.
Martin

 
eh If I plug in some different values I get a p-value but since zero is 
actually lower is the overlap significant, or more likely have I made a mistake 
in using the function?
eh phyper(10, 100, 2, 100,lower.tail = FALSE, log.p=F)
eh [1] 2.582795e-12


eh Thanks
eh Elliott

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Add a continuous color ramp legend to a 3d scatter plot

2013-03-15 Thread Marc Girondot

On 14/03/13 18:15, Zhuoting Wu wrote:

I have two follow-up questions:

1. If I want to reverse the heat.colors (i.e., from yellow to red 
instead of red to yellow), is there a way to do that?



nbcol <- heat.colors(128)
nbcol <- nbcol[128:1]
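
Or equivalently, in one line:

nbcol <- rev(heat.colors(128))   # same palette, reversed (yellow to red)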


2. I also created this interactive 3d scatter plot as below:

library(rgl)
plot3d(x=x, y=y, z=z, col=nbcol[zcol], box=FALSE)


I have never used such a plot, sorry.

Marc

Is there any way to add the same legend to this 3d plot?

I'm new to R and try to learn it. I'm very grateful for any help!

thanks,
Z





--
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot2, arrows and polar coordinates

2013-03-15 Thread Pascal Oettli

Dear R users,

The following issue has already been documented but, if I am not
mistaken, not yet solved.


This issue appears when trying to plot arrows with geom_segment
(package ggplot2) in polar coordinates (coord_polar). The direction
of some arrows is wrong (red rectangle). Please find an example herewith.


Does someone know how to deal with that issue?

Best Regards,
Pascal Oettli


#--
# Example adapted from the help page of geom_segment

library(ggplot2)
library(grid)

d <- data.frame(x1=-135.3, x2=-158.3, y1=37.2, y2=45.2)

p <- ggplot(seals, aes(x = long, y = lat))

p1 <-
  ggplot() +
  coord_cartesian() +
  geom_rect(data=d, mapping=aes(xmin=x1, xmax=x2, ymin=y1, ymax=y2),
fill="red", color="red", alpha=0.5) +
  geom_segment(data=seals, aes(x = long, y = lat, xend = long +
delta_long, yend = lat + delta_lat), arrow = arrow(length = unit(0.2, "cm")))


p2 <-
  ggplot() +
  coord_polar() +
  geom_rect(data=d, mapping=aes(xmin=x1, xmax=x2, ymin=y1, ymax=y2),
fill="red", color="red", alpha=0.5) +
  geom_segment(data=seals, aes(x = long, y = lat, xend = long +
delta_long, yend = lat + delta_lat), arrow = arrow(length = unit(0.2, "cm")))


grid.newpage()
pushViewport(viewport(layout = grid.layout(3, 2, heights = unit(c(0.5,
0.5, 5), "null"))))
grid.text("Example taken from '?geom_segment'", vp =
viewport(layout.pos.row = 1, layout.pos.col = 1:2))
grid.text("Cartesian coordinates", vp = viewport(layout.pos.row = 2,
layout.pos.col = 1))
grid.text("Polar coordinates", vp = viewport(layout.pos.row = 2,
layout.pos.col = 2))

print(p1, vp = viewport(layout.pos.row = 3, layout.pos.col = 1))
print(p2, vp = viewport(layout.pos.row = 3, layout.pos.col = 2))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to list the all products' information of the latest month?

2013-03-15 Thread Tammy Ma
Hi,

I have data frame like this:

Product  Price  Year_Month  PE
A          100      201012  -2
A           98      201101  -3
A           97      201102  -2.5
B          110      201101  -1
B          100      201102  -2
B           90      201103  -4


How can I achieve the following result using R:
Product  Price  Year_Month  PE
A           97      201102  -2.5
B           90      201103  -4

in other words, list the all products' information of the latest month?

Thanks for your help.

Kind regards,
Lingyi




  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] reviewer comment

2013-03-15 Thread Mohamed Lajnef
Could someone explain to me this reviewer's sentence below (in bold and underlined):

Authors should try to be more detailed in the description of analyses:
some of the details reported in the Principal components analysis
paragraph (Results) should be moved here.
Because a highly asymmetric distribution could affect Principal
Component Analysis results, symmetry of distribution should be
tested. Authors should also indicate if outliers were observed and
consequently excluded because they could affect factors.

Any help would be greatly appreciated!

Regards
ML

-- 

Mohamed Lajnef,IE INSERM U955 eq 15#
Pôle de Psychiatrie#
Hôpital CHENEVIER  #
40, rue Mesly  #
94010 CRETEIL Cedex FRANCE #
mohamed.laj...@inserm.fr   #
tel : 01 49 81 32 79   #
Sec : 01 49 81 32 90   #
fax : 01 49 81 30 99   #



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] does R read commands from scripts instantanously or seuqently during processing

2013-03-15 Thread Jannis

Dear all,

thanks, Rolf and Jeff, for your replies. The command below runs under
Suse Linux. I guess, however, the phenomenon I observed would happen
under other operating systems as well. The reason why I asked was that R
produced some error messages that did not really point me in the
direction of the edited script file. These errors were usually something
like:


Error: unexpected symbol in cess finished.

The line in the script which caused this error is:

print(paste(as.character(Sys.time()), ': Process finished.', sep=''))

This line contains valid R code and would normally not produce an error.
Some testing showed that the error above only happens when I edit the
code of the script while the script is running. So R probably reads a
script submitted that way sequentially, directly while executing the
individual commands. No idea though what happens if I start the
script via source() inside R itself.



Thanks again for your suggestions
Jannis


On 14.03.2013 22:47, Rolf Turner wrote:

On 03/15/2013 05:13 AM, Jannis wrote:

Dear R community,


when I source a script into R via:


R --slave < scriptname.R


is the whole script file read at once during startup or is each
individual line of code read sequentially during the execution (i.e.
directly before R processes the respective command)? In other words,
can I safely edit the scriptname.R file even when an active R process
still runs the command above?



Experiment.  Build a toy script with a loop that never terminates. Set
it going.  Edit the script and change the code so that the loop terminates.
See what happens.

[It seems to me that nothing happens, so that you *can* safely edit
the script while
the process runs.  But further experimentation would be advisable.]

 cheers,

 Rolf Turner


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating a hyperlink in a csv file

2013-03-15 Thread Brian Smith
Hi,

I was wondering if it is possible to create a hyperlink in a csv file using
R code and some package. For example, in the following code:

links <- cbind(rep('Click for Google', 3), "http://www.google.com")
write.table(links, 'test.csv', sep=',', row.names=F, col.names=F)


the web address should be linked to 'Click for Google'.

many thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to list the all products' information of the latest month?

2013-03-15 Thread jim holtman
Try this:

> x <- read.table(text = "Product Price Year_Month PE
+ A 100 201012 -2
+ A 98 201101 -3
+ A 97 201102 -2.5
+ B 110 201101 -1
+ B 100 201102 -2
+ B 90 201103 -4", header = TRUE, as.is = TRUE)
> do.call(rbind
+ , lapply(split(x, x$Product), tail, 1)
+ )
  Product Price Year_Month   PE
A   A97 201102 -2.5
B   B90 201103 -4.0



On Fri, Mar 15, 2013 at 5:56 AM, Tammy Ma metal_lical...@live.com wrote:

 Hi,

 I have data frame like this:

 Product PriceYear_Month  PE
 A 100201012 -2
 A 98   201101-3
 A 97   201102-2.5
 B 110 201101-1
 B 100 201102-2
 B  90  201103-4


 How can I achieve the following result using R:
 Product PriceYear_Month  PE
 A 97   201102-2.5
 B  90  201103-4

 in other words, list the all products' information of the latest month?

 Thanks for your help.

 Kind regards,
 Lingyi





 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Writing a hyperlink to a csv file

2013-03-15 Thread Brian Smith
Hi,

I was wondering if it is possible to create a hyperlink in a csv file using
R code and some package. For example, in the following code:

links <- cbind(rep('Click for Google', 3), "google search address goes here")
## R Mailing list blocks if I put the actual web address here
write.table(links, 'test.csv',
sep=',', row.names=F, col.names=F)


the web address should be linked to 'Click for Google'.

many thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to list the all products' information of the latest month?

2013-03-15 Thread Berend Hasselman

On 15-03-2013, at 10:56, Tammy Ma metal_lical...@live.com wrote:

 Hi,
 
 I have data frame like this:
 
 Product  Price  Year_Month  PE
 A          100      201012  -2
 A           98      201101  -3
 A           97      201102  -2.5
 B          110      201101  -1
 B          100      201102  -2
 B           90      201103  -4
 
 
 How can I achieve the following result using R:
 Product  Price  Year_Month  PE
 A           97      201102  -2.5
 B           90      201103  -4
 

Another option is to use aggregate like this

aggregate(x, by=list(x$Product), FUN=function(z) tail(z,1))[,-1]

or

aggregate(. ~ Product, data=x, FUN=function(z) tail(z,1))


Berend

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] does R read commands from scripts instantanously or seuqently during processing

2013-03-15 Thread Prof Brian Ripley

On 15/03/2013 10:40, Jannis wrote:

Dear all,

thanks, Rolf and Jeff, for your replies. The command below runs under
Suse Linux. I guess, however, the phenomenon I observed would happen
under other operating systems as well. The reason why I asked was that R
produced some error messages that did not really point me in the
direction of the edited script file. These errors were usually something
like:

Error: unexpected symbol in cess finished.

The line in the script which caused this error is:

print(paste(as.character(Sys.time()), ': Process finished.', sep=''))

This line contains valid R code and would normally not produce an error.
Some testing showed that the error above only happens when I edit the
code of the script while the script is running. So R probably reads a
script submitted that way sequentially, directly while executing the


If reading from stdin, it does (like any other interpreter): however 
stdin is buffered if re-directed, so the input script is read in blocks 
from a file (the size of the block depending on the OS).



individual commands. No idea though what happens if i would start the
script via source inside R itself.


R is Open Source, and you can read the code of source().  It really 
isn't hard to see that it parses the whole file, then executes the 
parsed expressions one at a time.
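
A quick sketch to see this (the file name demo.R is just an example):

## source() parses the whole of demo.R before evaluating anything, so editing
## and saving demo.R during the sleep does not change this run
writeLines(c("Sys.sleep(30)", "cat('finished\\n')"), "demo.R")
source("demo.R")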





Thanks again for your suggestions
Jannis


On 14.03.2013 22:47, Rolf Turner wrote:

On 03/15/2013 05:13 AM, Jannis wrote:

Dear R community,


when I source a script into R via:


R --slave < scriptname.R


is the whole script file read at once during startup or is each
individual line of code read sequentially during the execution (i.e.
directly before R processes the respective command)? In other words,
can I safely edit the scriptname.R file even when an active R process
still runs the command above?



Experiment.  Build a toy script with a loop that never terminates. Set
it going.  Edit the script and change the code so that the loop
terminates.
See what happens.

[It seems to me that nothing happens, so that you *can* safely edit
the script while
the process runs.  But further experimentation would be advisable.]

 cheers,

 Rolf Turner


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] column and line graphs in R

2013-03-15 Thread Jim Lemon

On 03/15/2013 01:40 AM, Gian Maria Niccolò Benucci wrote:

Hi again,

Thank you all for your support. I would love to have a graph in which two
variables are contemporary showed. For example a histogram and a curve
should be the perfect choice. I tried to use twoord.plot() but I am not
sure I understand how to manage the arguments lx, ly, rx, ry... Anyway
these are my data:


nat_af

       rel.abund rel.freq
MOTU2      0.003    0.083
MOTU4      0.029    0.167
MOTU6      0.033    0.167
MOTU7      0.023    0.083
MOTU9      0.009    0.083
MOTU11     0.042    0.250
MOTU14     0.069    0.083
MOTU16     0.059    0.167
MOTU17     0.034    0.083
MOTU18     0.049    0.083
MOTU19     0.084    0.333
MOTU20     0.015    0.083
MOTU21     0.059    0.083
MOTU22     0.032    0.167
MOTU23     0.142    0.250
MOTU24     0.031    0.083
MOTU25     0.034    0.083
MOTU29     0.010    0.083
MOTU30     0.011    0.083
MOTU33     0.004    0.083
MOTU36     0.034    0.333
MOTU34     0.182    0.417

First column is the relative abundance of the given MOTU and second column
is the relative frequency of the same MOTU.


Hi Gian,
You can do this in twoord.plot like this (the data is named nat_af and the
first column is labeled "label"):

library(plotrix)  # twoord.plot() and staxlab() come from plotrix

twoord.plot(1:22-0.2, nat_af$rel.abund, 1:22+0.2, nat_af$rel.freq,
 type=c("bar","bar"), lylim=c(0,0.19), rylim=c(0,0.43), halfwidth=0.2,
 main="Abundance and frequency", ylab="Abundance", rylab="Frequency",
 xticklab=rep("",22))
staxlab(1, at=1:22, labels=nat_af$label, cex=0.8, srt=45)

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Data manipulation

2013-03-15 Thread IOANNA
Hello all, 

 

I would appreciate your thoughts on a seemingly simple problem. I have a
database, where each row represents a single record. I want to aggregate this
database, so I use the aggregate command:

 

D <- read.csv("C:\\Users\\test.csv")

 

attach(D)

 

by1 <- factor(Class)

by2 <- factor(X)

W <- aggregate(x=Count, by=list(by1, by2), FUN=sum)

 

The results I get following the form:

 

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
5       3     0.2 4
6       3     0.3 4

 

 

However, what I really want is an aggregation which includes the zero
values, i.e.:

 

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
        2     0.2 0
5       3     0.2 4
        1     0.3 0
        2     0.3 0
6       3     0.3 4

 

 

How can I achieve what I want?

 

Best regards, 

Ioanna

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data manipulation

2013-03-15 Thread John Kane
What zero values?  And are they actually zeros or are they NAs, that is,
missing values?

The code looks okay but without some sample data it is difficult to know 
exactly what you are doing. 

The easiest way to supply data  is to use the dput() function.  Example with 
your file named testfile: 
dput(testfile) 
Then copy the output and paste into your email.  For large data sets, you can 
just supply a representative sample.  Usually, 
dput(head(testfile, 100)) will be sufficient.   

 
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Please supply some sample data. 
 

John Kane
Kingston ON Canada


 -Original Message-
 From: ii54...@msn.com
 Sent: Fri, 15 Mar 2013 12:40:54 +
 To: r-help@r-project.org
 Subject: [R] Data manipulation
 
 Hello all,
 
 
 
 I would appreciate your thoughts on a seemingly simple problem. I have a
 database, where each row represent a single record. I want to aggregate
 this
 database so I use the aggregate command :
 
 
 
  D <- read.csv("C:\\Users\\test.csv")
 
 
 
 attach(D)
 
 
 
  by1 <- factor(Class)
  
  by2 <- factor(X)
  
  W <- aggregate(x=Count, by=list(by1, by2), FUN=sum)
 
 
 
 The results I get following the form:
 
 
 
 W
 
   Group.1 Group.2 x
 
 1   1 0.1 4
 
 2   2 0.1 7
 
 3   3 0.1 1
 
 4   1 0.2 3
 
 5   3 0.2 4
 
 6   3 0.3 4
 
 
 
 
 
 However, what I really want is an aggregation which includes the zero
 values, i.e.:
 
 
 
 W
 
   Group.1 Group.2 x
 
 1   1 0.1 4
 
 2   2 0.1 7
 
 3   3 0.1 1
 
 4   1 0.2 3
 
 2 0.2 0
 
 5   3 0.2 4
 
 10.3 0
 
 20.3 0
 
 6   3 0.3 4
 
 
 
 
 
 How can I achieve what I want?
 
 
 
 Best regards,
 
 Ioanna
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating a hyperlink in a csv file

2013-03-15 Thread R. Michael Weylandt
On Fri, Mar 15, 2013 at 10:52 AM, Brian Smith bsmith030...@gmail.com wrote:
 Hi,

 I was wondering if it is possible to create a hyperlink in a csv file using
 R code and some package. For example, in the following code:


A csv file is a plain text file and by definition doesn't have
hyperlinks. If you want a hyperlink, you'll need to export to a
different format or use a reader which will interpret a URL as a
hyperlink automatically.
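
One possible workaround (a sketch, assuming the file will be opened in a
spreadsheet program such as Excel or LibreOffice, which usually evaluate a
cell containing an =HYPERLINK() formula):

url  <- "http://www.google.com"
cell <- sprintf('"=HYPERLINK(""%s"",""Click for Google"")"', url)  # csv-escaped formula cell
writeLines(rep(cell, 3), "test.csv")   # three rows, each rendered as a clickable link in Excel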

MW

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to list the all products' information of the latest month?

2013-03-15 Thread arun
dat1 <- read.table(text="
Product Price Year_Month PE
A 100 201012 -2
A 98 201101 -3
A 97 201102 -2.5
B 110 201101 -1
B 100 201102 -2
B 90 201103 -4
", sep="", header=TRUE, stringsAsFactors=FALSE)
dat1[as.logical(with(dat1, ave(Year_Month, Product, FUN=function(x) x==max(x)))),]
#  Product Price Year_Month   PE
#3       A    97     201102 -2.5
#6       B    90     201103 -4.0
A.K.




- Original Message -
From: Tammy Ma metal_lical...@live.com
To: r-help@r-project.org r-help@r-project.org
Cc: 
Sent: Friday, March 15, 2013 5:56 AM
Subject: [R] How to list the all products' information of the latest month?

Hi,

I have data frame like this:

Product     Price    Year_Month  PE
A                 100        201012         -2
A                 98           201101        -3
A                 97           201102        -2.5
B                 110         201101        -1
B                 100         201102        -2
B                  90          201103        -4


How can I achieve the following result using R:
Product     Price    Year_Month  PE
A                 97           201102        -2.5
B                  90          201103        -4

in other words, list the all products' information of the latest month?

Thanks for your help.

Kind regards,
Lingyi




                          
    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reviewer comment

2013-03-15 Thread John Kane
No idea of what sentence: R-help strips any html and only provides a text
message, so all formatting has been lost.  I think the question is not really an
R-help question, but if you resubmit the post you need to show the sentence in
question in another way.

John Kane
Kingston ON Canada


 -Original Message-
 From: mohamed.laj...@inserm.fr
 Sent: Fri, 15 Mar 2013 11:26:45 +0100
 To: r-help@r-project.org
 Subject: [R] reviewer comment
 
 Could someone explain me this sentence reviewer below in blod underlined,
 
 Authors should try to be more detailed in the description of analyses:
 some of the details reported in the Principal components analysis
 paragraph (Results) should be moved here.
 Because a highly_/*asymmetric distribution could affect Principal
 Component Analysis results,  symmetry of distribution should be
 tested. Authors should also indicate if outliers were observed and
 consequently excluded because they could affect factors*/_
 
 Any help would be greatly appreciated!
 
 Regards
 ML
 
 --
 
 Mohamed Lajnef,IE INSERM U955 eq 15#
 Pôle de Psychiatrie  #
 Hôpital CHENEVIER  #
 40, rue Mesly  #
 94010 CRETEIL Cedex FRANCE #
 mohamed.laj...@inserm.fr   #
 tel : 01 49 81 32 79 #
 Sec : 01 49 81 32 90   #
 fax : 01 49 81 30 99   #
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Looking for a good tutorial on ff package

2013-03-15 Thread Fritz Zuhl
Hi,
I am looking for a good tutorial on the ff package. Any suggestions?

Also, any other package would anyone recommend for dealing with data that 
extends beyond the RAM would be greatly appreciated.

Thanks,
Fritz Zuhl

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Writing a hyperlink to a csv file

2013-03-15 Thread John Kane
Well, you can write it there but it won't do anything until it is read into some
software that can interpret it as a URL.  A csv file is just plain text.

John Kane
Kingston ON Canada


 -Original Message-
 From: bsmith030...@gmail.com
 Sent: Fri, 15 Mar 2013 07:53:02 -0400
 To: r-help@r-project.org
 Subject: [R] Writing a hyperlink to a csv file
 
 Hi,
 
 I was wondering if it is possible to create a hyperlink in a csv file
 using
 R code and some package. For example, in the following code:
 
 links - cbind(rep('Click for Google',3),google search address goes
 here)
 ## R Mailing list blocks if I put the actual web address here
 write.table(links,'test.csv',
 sep=',',row.names=F,col.names=F)
 
 
 the web address should be linked to 'Click for Google'.
 
 many thanks!
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] metafor - multivariate analysis

2013-03-15 Thread Owen, Branwen



Dear metafor users, I'm conducting a meta-analysis of the prevalence of a particular
behaviour based on someone else's code. I've been labouring under the
impression that this:

summary(rma.1 <- rma(yi, vi, mods=cbind(approxmeanage, interviewmethodcode), data=mal, method="DL", knha=F, weighted=F, intercept=T))

is doing the multivariate analysis that I want, but I have read that multivariate
analysis can't be done in metafor.

this is the output:

Mixed-Effects Model (k = 22; tau^2 estimator: DL)

logLik Deviance AIC BIC
18.7726 -37.5452 -27.5452 -22.0899

tau^2 (estimate of residual amount of heterogeneity): 0.0106
tau (sqrt of the estimate of residual heterogeneity): 0.1031

Test for Residual Heterogeneity:
QE(df = 18) = 1273.9411, p-val < .0001

Test of Moderators (coefficient(s) 2,3,4):
QM(df = 3) = 11.0096, p-val = 0.0117

Model Results:

estimate se zval pval ci.lb ci.ub
intrcpt 0.4014 0.1705 2.3537 0.0186 0.0671 0.7356 *
continent -0.0206 0.0184 -1.1200 0.2627 -0.0568 0.0155
approxmeanage 0.0076 0.0091 0.8354 0.4035 -0.0102 0.0254
interviewmethodcode -0.0892 0.0273 -3.2702 0.0011 -0.1426 -0.0357 **

---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

My questions are:
1. what is this line of code actually doing?
2. if it isn't a multivariate analysis, will I have to use mvmeta instead?

thanks very much for any help
Branwen
.http://r.789695.n4.nabble.com/metafor-multivariate-analysis-td4661233.html


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data manipulation

2013-03-15 Thread Blaser Nello
Is this what you want to do?

D2 <- expand.grid(Class=unique(D$Class), X=unique(D$X))
D2 <- merge(D2, D, all=TRUE)
D2$Count[is.na(D2$Count)] <- 0

W <- aggregate(D2$Count, list(D2$Class, D2$X), sum)
W
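
An alternative sketch that keeps the empty Class/X combinations as zeros
directly (assuming Count is numeric):

W2 <- as.data.frame(xtabs(Count ~ Class + X, data = D))   # empty cells become 0
W2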

Best, 
Nello


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of IOANNA
Sent: Freitag, 15. März 2013 13:41
To: r-help@r-project.org
Subject: [R] Data manipulation

Hello all, 

 

I would appreciate your thoughts on a seemingly simple problem. I have a 
database, where each row represent a single record. I want to aggregate this 
database so I use the aggregate command :

 

D-read.csv(C:\\Users\\test.csv)

 

attach(D)

 

by1-factor(Class)

by2-factor(X)

W-aggregate(x=Count,by=list(by1,by2),FUN=sum)

 

The results I get following the form:

 

W

  Group.1 Group.2 x

1   1 0.1 4

2   2 0.1 7

3   3 0.1 1

4   1 0.2 3

5   3 0.2 4

6   3 0.3 4

 

 

However, what I really want is an aggregation which includes the zero values, 
i.e.:

 

W

  Group.1 Group.2 x

1   1 0.1 4

2   2 0.1 7

3   3 0.1 1

4   1 0.2 3

2 0.2 0

5   3 0.2 4

10.3 0

20.3 0

6   3 0.3 4

 

 

How can I achieve what I want?

 

Best regards, 

Ioanna

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Poisson and negbin gamm in mgcv - overdispersion and theta

2013-03-15 Thread Hosia Aino
Dear R users,

I am trying to use gamm from package mgcv to model results from a mesocosm 
experiment.  My model is of type

M1 <- gamm(Resp ~ s(Day, k=8) + s(Day, by=C, k=8) + Flow + offset(LogVol),
data=MyResp,
correlation = corAR1(form= ~ Day|Mesocosm),
family=poisson(link="log"))

where the response variable is counts, offset by the log of sample volume.

Unfortunately, the residuals from the model show heteroscedasticity. While 
trying to follow up on this, I have run into following problems:

1) How to estimate the overdispersion parameter from a (Poisson) gamm?
I have not been able to extract residual degrees of freedom from M1.

2) How to manually estimate theta for a negative binomial gamm?
I would like to see if applying a negative binomial distribution with log link 
(model below) would solve the problem. However, negbin in gamm requires a known 
theta...

M2 <- gamm(Resp ~ s(Day, k=8) + s(Day, by=C, k=8) + Flow + offset(LogVol),
data=MyResp,
correlation = corAR1(form= ~ Day|Mesocosm),
family= negbin(THETA, link="log"))

3) And finally, can I somehow compare the models M1 and M2? Trying anova(M1,M2) 
gives the message: Error in eval(expr, envir, enclos) : object 'fixed' not 
found (and I am anyway not sure if this is a valid approach between Poisson 
and negbin gamms).

I am most grateful for any help!

Aino

Aino Hosia
Postdoc
Havforskningsinstituttet/Institute of Marine Research
PO Box 1870 Nordnes, N-5817 Bergen, Norway
(Nordnesgaten 50)
Tel: +47 55 23 53 49
E-mail: aino.ho...@imr.nomailto:aino.ho...@imr.no
www.imr.nohttp://www.imr.no/






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data manipulation

2013-03-15 Thread John Kane
Hi IOANNA,
I got the data but it is missing a value in Count (length 22 vs length 23 in
the other two variables) so I stuck in an extra 1. I hope this is correct.

There also was an attachment called winmail.dat that appears to be some kind
of Microsoft Mail note that is pure gibberish to me--I'm on a Linux box.

For some reason, in neither posting does your example of the output you want
come through.  Are you posting in html?  R-help strips any html, so is there a
chance it stripped out a table?

If I do this

table(Class, X)
     X
Class 0.1 0.2 0.3
    1   4   3   0
    2   7   0   0
    3   1   4   4

I see that you have two combinations of Class and X with no entries. Is this
what you wanted to show in W?  If so, it is not immediately apparent how to go
about this.

John Kane
Kingston ON Canada


 -Original Message-
 From: ii54...@msn.com
 Sent: Fri, 15 Mar 2013 13:11:48 +
 To: jrkrid...@inbox.com, r-help@r-project.org
 Subject: RE: [R] Data manipulation
 
 
 Hello John,
 
 
 I thought I attached the file. So here we go:
 Class=c(1,1,1,1,  1,1,1,2,2,2,2,2,2,2,3,3,
 3,3,3,3,  3,3,3)
 X=c(0.1,0.1,0.1,  0.1,0.2,0.2,0.2,0.1,0.1,
 0.1,0.1,0.1,0.1,0.1,0.1,0.2,0.2,0.2,0.2,0.3,0.3,0.3,  0.3)
 Count=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
 
 by1-factor(Class)
 by2-factor(X)
 W-aggregate(x=Count,by=list(by1,by2),FUN=sum)
 
 
 
 However, what I want is a table that also include lines for the Group.1
 and
 Group.2 values for which there are no records. In other words something
 like
 this:
 
 
 
 Thanks again. I hope its clearer now.
 Ioanna
 
 
 -Original Message-
 From: John Kane [mailto:jrkrid...@inbox.com]
 Sent: 15 March 2013 12:51
 To: IOANNA; r-help@r-project.org
 Subject: RE: [R] Data manipulation
 
 What zero values?  And are they acutall zeros or are the NA's, that is,
 missing values?
 
 The code looks okay but without some sample data it is difficult to know
 exactly what you are doing.
 
 The easiest way to supply data  is to use the dput() function.  Example
 with
 your file named testfile:
 dput(testfile)
 Then copy the output and paste into your email.  For large data sets, you
 can just supply a representative sample.  Usually,
 dput(head(testfile, 100)) will be sufficient.
 
 
 http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducibl
 e-example
 
 Please supply some sample data.
 
 
 John Kane
 Kingston ON Canada
 
 
 -Original Message-
 From: ii54...@msn.com
 Sent: Fri, 15 Mar 2013 12:40:54 +
 To: r-help@r-project.org
 Subject: [R] Data manipulation
 
 Hello all,
 
 
 
 I would appreciate your thoughts on a seemingly simple problem. I have
 a database, where each row represent a single record. I want to
 aggregate this database so I use the aggregate command :
 
 
 
 D-read.csv(C:\\Users\\test.csv)
 
 
 
 attach(D)
 
 
 
 by1-factor(Class)
 
 by2-factor(X)
 
 W-aggregate(x=Count,by=list(by1,by2),FUN=sum)
 
 
 
 The results I get following the form:
 
 
 
 W
 
   Group.1 Group.2 x
 
 1   1 0.1 4
 
 2   2 0.1 7
 
 3   3 0.1 1
 
 4   1 0.2 3
 
 5   3 0.2 4
 
 6   3 0.3 4
 
 
 
 
 
 However, what I really want is an aggregation which includes the zero
 values, i.e.:
 
 
 
 W
 
   Group.1 Group.2 x
 
 1   1 0.1 4
 
 2   2 0.1 7
 
 3   3 0.1 1
 
 4   1 0.2 3
 
 2 0.2 0
 
 5   3 0.2 4
 
 10.3 0
 
 20.3 0
 
 6   3 0.3 4
 
 
 
 
 
 How can I achieve what I want?
 
 
 
 Best regards,
 
 Ioanna
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reviewer comment

2013-03-15 Thread Mohamed Lajnef
Thanks John for your reply.

the  reviewer comment:

asymmetric distribution could affect Principal
Component Analysis results,  symmetry of distribution should be
tested. Authors should also indicate if outliers were observed and
consequently excluded because they could affect factors

My question: what does it mean that an asymmetric distribution could affect PCA? And
that outliers could affect the factors?

sorry for this not R-help question.

Best regards

M



On 15/03/13 14:05, John Kane wrote:
 No idea of what sentence.  R-help strips any html and only provides a text 
 message so all formatting has been lost.  I think the question is not really 
 an R-help question but if you resubmit the post you need to show the sentence 
 in question in another way.

 John Kane
 Kingston ON Canada


 -Original Message-
 From: mohamed.laj...@inserm.fr
 Sent: Fri, 15 Mar 2013 11:26:45 +0100
 To: r-help@r-project.org
 Subject: [R] reviewer comment

 Could someone explain me this sentence reviewer below in blod underlined,

 Authors should try to be more detailed in the description of analyses:
 some of the details reported in the Principal components analysis
 paragraph (Results) should be moved here.
 Because a highly_/*asymmetric distribution could affect Principal
 Component Analysis results,  symmetry of distribution should be
 tested. Authors should also indicate if outliers were observed and
 consequently excluded because they could affect factors*/_

 Any help would be greatly appreciated!

 Regards
 ML

 --
 
 Mohamed Lajnef,IE INSERM U955 eq 15#
  Pôle de Psychiatrie #
  Hôpital CHENEVIER  #
 40, rue Mesly  #
 94010 CRETEIL Cedex FRANCE #
 mohamed.laj...@inserm.fr   #
 tel : 01 49 81 32 79#
 Sec : 01 49 81 32 90   #
 fax : 01 49 81 30 99   #
 


  [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 





-- 

Mohamed Lajnef,IE INSERM U955 eq 15#
Pôle de Psychiatrie   #
Hôpital CHENEVIER  #
40, rue Mesly  #
94010 CRETEIL Cedex FRANCE #
mohamed.laj...@inserm.fr   #
tel : 01 49 81 32 79   #
Sec : 01 49 81 32 90   #
fax : 01 49 81 30 99   #



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data manipulation

2013-03-15 Thread John Kane
Nice. That does look like it. IOANNA?

John Kane
Kingston ON Canada


 -Original Message-
 From: nbla...@ispm.unibe.ch
 Sent: Fri, 15 Mar 2013 14:27:03 +0100
 To: ii54...@msn.com, r-help@r-project.org
 Subject: Re: [R] Data manipulation
 
 Is this what you want to do?
 
 D2 - expand.grid(Class=unique(D$Class), X=unique(D$X))
 D2 - merge(D2, D, all=TRUE)
 D2$Count[is.na(D2$Count)] - 0
 
 W - aggregate(D2$Count, list(D2$Class, D2$X), sum)
 W
 
 Best,
 Nello
 
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of IOANNA
 Sent: Freitag, 15. März 2013 13:41
 To: r-help@r-project.org
 Subject: [R] Data manipulation
 
 Hello all,
 
 
 
 I would appreciate your thoughts on a seemingly simple problem. I have a
 database, where each row represent a single record. I want to aggregate
 this database so I use the aggregate command :
 
 
 
 D-read.csv(C:\\Users\\test.csv)
 
 
 
 attach(D)
 
 
 
 by1-factor(Class)
 
 by2-factor(X)
 
 W-aggregate(x=Count,by=list(by1,by2),FUN=sum)
 
 
 
 The results I get following the form:
 
 
 
 W
 
   Group.1 Group.2 x
 
 1   1 0.1 4
 
 2   2 0.1 7
 
 3   3 0.1 1
 
 4   1 0.2 3
 
 5   3 0.2 4
 
 6   3 0.3 4
 
 
 
 
 
 However, what I really want is an aggregation which includes the zero
 values, i.e.:
 
 
 
 W
   Group.1 Group.2 x
 1       1     0.1 4
 2       2     0.1 7
 3       3     0.1 1
 4       1     0.2 3
         2     0.2 0
 5       3     0.2 4
         1     0.3 0
         2     0.3 0
 6       3     0.3 4
 
 
 
 
 
 How can I achieve what I want?
 
 
 
 Best regards,
 
 Ioanna
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] question about nls

2013-03-15 Thread Prof J C Nash (U30A)
Actually, it likely won't matter where you start. The Gauss-Newton 
direction is nearly always close to 90 degrees from the gradient, as 
seen by turning trace=TRUE in the package nlmrt function nlxb(), which 
does a safeguarded Marquardt calculation. This can be used in place of 
nls(), except you need to put your data in a data frame. It finds a 
solution pretty straightforwardly, though with quite a few iterations 
and function evaluations.


Of course, one may not really want to do any statistics with 4 
observations and 3 parameters, but the problem illustrates the GN vs. 
Marquardt directions.


JN


 sol <- nlxb(y ~ exp(a + b*x)+d, start=list(a=0,b=0,d=1), data=mydata, 
trace=TRUE)

formula: y ~ exp(a + b * x) + d
lower:[1] -Inf -Inf -Inf
upper:[1] Inf Inf Inf
...snip...
Data variable  y :[1]  0.8  6.5 20.5 45.9
Data variable  x :[1]  60  80 100 120
Start:lamda: 1e-04  SS= 2291.15  at  a = 0  b = 0  d = 1  1 / 0
gradient projection =  -2191.093  g-delta-angle= 90.47372
Stepsize= 1
lamda: 0.001  SS= 4.408283e+55  at  a = -25.29517  b = 0.74465  d = 
-24.29517  2 / 1

gradient projection =  -2168.709  g-delta-angle= 90.48307
Stepsize= 1
lamda: 0.01  SS= 3.986892e+54  at  a = -24.55223  b = 0.7284461  d = 
-23.55223  3 / 1

gradient projection =  -1991.804  g-delta-angle= 90.58199
Stepsize= 1
lamda: 0.1  SS= 2.439544e+46  at  a = -18.71606  b = 0.6010118  d = 
-17.71606  4 / 1

gradient projection =  -1476.935  g-delta-angle= 92.79733
Stepsize= 1
lamda: 1  SS= 4.114152e+23  at  a = -2.883776  b = 0.2505892  d = 
-1.883776  5 / 1

gradient projection =  -954.5234  g-delta-angle= 91.78881
Stepsize= 1
lamda: 10  SS= 39033042903  at  a = 2.918809  b = 0.07709855  d = 
3.918809  6 / 1

gradient projection =  -264.9953  g-delta-angle= 91.41647
Stepsize= 1
lamda: 4  SS= 571.451  at  a = 1.023367  b = 0.01762421  d = 2.023367 
 7 / 1

gradient projection =  -60.46016  g-delta-angle= 90.96421
Stepsize= 1
lamda: 1.6  SS= 462.3257  at  a = 1.080764  b = 0.0184132  d = 
1.981399  8 / 2

gradient projection =  -56.91866  g-delta-angle= 90.08103
Stepsize= 1
lamda: 0.64  SS= 359.6233  at  a = 1.135265  b = 0.01942354  d = 
0.9995471  9 / 3

gradient projection =  -65.90027  g-delta-angle= 90.04527
Stepsize= 1

... snip ...

lamda: 0.2748779  SS= 0.5742948  at  a = -0.1491842  b = 0.03419761  d = 
-6.196575  31 / 20

gradient projection =  -6.998402e-25  g-delta-angle= 90.07554
Stepsize= 1
lamda: 2.748779  SS= 0.5742948  at  a = -0.1491842  b = 0.03419761  d = 
-6.196575  32 / 20

gradient projection =  -2.76834e-25  g-delta-angle= 90.16973
Stepsize= 1
lamda: 27.48779  SS= 0.5742948  at  a = -0.1491842  b = 0.03419761  d = 
-6.196575  33 / 20

gradient projection =  -4.632864e-26  g-delta-angle= 90.08759
Stepsize= 1
No parameter change



On 13-03-15 07:00 AM, r-help-requ...@r-project.org wrote:

Message: 36
Date: Thu, 14 Mar 2013 11:04:27 -0400
From: Gabor Grothendieckggrothendi...@gmail.com
To: menglaomen...@163.com
Cc: R helpr-help@r-project.org
Subject: Re: [R] question about nls
Message-ID:
cap01urmodfn87qqvtwmatuid0fx0d7lqmfqh4chofm5b2c9...@mail.gmail.com
Content-Type: text/plain; charset=ISO-8859-1

On Thu, Mar 14, 2013 at 5:07 AM, menglaomen...@163.com  wrote:

Hi,all:
I met a problem of nls.

My data:
xy
60 0.8
80 6.5
100 20.5
120 45.9

I want to fit exp curve of data.

My code:

nls(y ~ exp(a + b*x)+d,start=list(a=0,b=0,d=1))

Error in nlsModel(formula, mf, start, wts) :
   singular gradient matrix at initial parameter estimates

I can't find out the reason for the error.
Any suggesions are welcome.


The gradient is singular at your starting value so you will have to
use a better starting value.  If d = 0 then its linear in log(y) so
you can compute a starting value using lm like this:

lm1 <- lm(log(y) ~ x, DF)
st <- setNames(c(coef(lm1), 0), c("a", "b", "d"))

Also note that you are trying to fit a model with 3 parameters to only
4 data points.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reviewer comment

2013-03-15 Thread John Kane

   I think this is more a question for something like Cross Validated but you
   may well get a hint or two here.  Unfortunately while I vaguely see what the
   reviewer is getting at I certainly don't know enough to help.



   John Kane
   Kingston ON Canada

   -Original Message-
   From: mohamed.laj...@inserm.fr
   Sent: Fri, 15 Mar 2013 14:38:10 +0100
   To: jrkrid...@inbox.com
   Subject: Re: [R] reviewer comment

   Thanks John for your reply.
   The reviewer's comment:
asymmetric distribution could affect Principal
Component Analysis results,  symmetry of distribution should be
tested. Authors should also indicate if outliers were observed and
consequently excluded because they could affect factors

My question: what does it mean that an asymmetric distribution could affect PCA? And
also that outliers could affect factors?

sorry for this not R-help question.

Best regards 

M

   On 15/03/13 14:05, John Kane wrote:

No idea of what sentence.  R-help strips any html and only provides a text
message so all formatting has been lost.  I think the question is not really an
R-help question but if you resubmit the post you need to show the sentence in
question in another way.

John Kane
Kingston ON Canada


-Original Message-
From: mohamed.laj...@inserm.fr
Sent: Fri, 15 Mar 2013 11:26:45 +0100
To: r-help@r-project.org
Subject: [R] reviewer comment

Could someone explain me this sentence reviewer below in blod underlined,

Authors should try to be more detailed in the description of analyses:
some of the details reported in the Principal components analysis
paragraph (Results) should be moved here.
Because a highly_/*asymmetric distribution could affect Principal
Component Analysis results,  symmetry of distribution should be
tested. Authors should also indicate if outliers were observed and
consequently excluded because they could affect factors*/_

Any help would be greatly appreciated!

Regards
ML

--

Mohamed Lajnef,IE INSERM U955 eq 15#
Pôle de Psychiatrie#
Hôpital CHENEVIER  #
40, rue Mesly  #
94010 CRETEIL Cedex FRANCE #
mohamed.laj...@inserm.fr   #
tel : 01 49 81 32 79   #
Sec : 01 49 81 32 90   #
fax : 01 49 81 30 99   #



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--

Mohamed Lajnef,IE INSERM U955 eq 15#
Pôle de Psychiatrie#
Hôpital CHENEVIER  #
40, rue Mesly  #
94010 CRETEIL Cedex FRANCE #
mohamed.laj...@inserm.fr   #
tel : 01 49 81 32 79   #
Sec : 01 49 81 32 90   #
fax : 01 49 81 30 99   #


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data manipulation

2013-03-15 Thread IOANNA
Thanks a lot! 

-Original Message-
From: John Kane [mailto:jrkrid...@inbox.com] 
Sent: 15 March 2013 13:41
To: Blaser Nello; IOANNA; r-help@r-project.org
Subject: Re: [R] Data manipulation

Nice. That does look like it. IOANNA?

John Kane
Kingston ON Canada


 -Original Message-
 From: nbla...@ispm.unibe.ch
 Sent: Fri, 15 Mar 2013 14:27:03 +0100
 To: ii54...@msn.com, r-help@r-project.org
 Subject: Re: [R] Data manipulation
 
 Is this what you want to do?
 
 D2 - expand.grid(Class=unique(D$Class), X=unique(D$X))
 D2 - merge(D2, D, all=TRUE)
 D2$Count[is.na(D2$Count)] - 0
 
 W - aggregate(D2$Count, list(D2$Class, D2$X), sum) W
 
 Best,
 Nello
 
 
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org]
 On Behalf Of IOANNA
 Sent: Friday, 15 March 2013 13:41
 To: r-help@r-project.org
 Subject: [R] Data manipulation
 
 Hello all,
 
 
 
 I would appreciate your thoughts on a seemingly simple problem. I have 
 a database, where each row represent a single record. I want to 
 aggregate this database so I use the aggregate command :
 
 
 
 D-read.csv(C:\\Users\\test.csv)
 
 
 
 attach(D)
 
 
 
 by1-factor(Class)
 
 by2-factor(X)
 
 W-aggregate(x=Count,by=list(by1,by2),FUN=sum)
 
 
 
 The results I get following the form:
 
 
 
 W
 
   Group.1 Group.2 x
 
 1   1 0.1 4
 
 2   2 0.1 7
 
 3   3 0.1 1
 
 4   1 0.2 3
 
 5   3 0.2 4
 
 6   3 0.3 4
 
 
 
 
 
 However, what I really want is an aggregation which includes the zero 
 values, i.e.:
 
 
 
 W
 
   Group.1 Group.2 x
 
 1   1 0.1 4
 
 2   2 0.1 7
 
 3   3 0.1 1
 
 4   1 0.2 3
 
 2 0.2 0
 
 5   3 0.2 4
 
 10.3 0
 
 20.3 0
 
 6   3 0.3 4
 
 
 
 
 
 How can I achieve what I want?
 
 
 
 Best regards,
 
 Ioanna
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to make the labels of pie chart are not overlapping?

2013-03-15 Thread Tammy Ma
I have the following dataframe:

Product  predicted_MarketShare  Predicted_MS_Percentage
A        2.827450e-02           2.8
B        4.716403e-06           0.0
C        1.741686e-01           17.4
D        1.716303e-04           0.0
...

There are many products, and most of the predicted market shares are 
around 0%.
When I make a pie chart, the labels of the products with roughly 0% market share 
overlap.
How do I keep the labels from overlapping?

Kind regards.
Tammy
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] confidence interval for survfit

2013-03-15 Thread Terry Therneau

The first thing you are missing is the documentation -- try ?survfit.object.
 fit - survfit(Surv(time,status)~1,data)
fit$std.err will contain the standard error of the cumulative hazard or 
-log(survival)

The standard error of the survival curve is approximately S(t) * std(hazard), by the delta 
method.  This is what is printed by the summary function, because it is what user's 
expect, but it has very poor performance for computing confidence intervals.  A much 
better one is exp(-1* confidence interval for the cumulative hazard), which is the 
default.  In fact there are lots of better ones whose relative ranking depends on the 
details of your simulation study.  About the only really consistent result is that 
anything thoughtful beats S(t) +- 1.96 se(S), easily.  The default in R is the one that 
was best in the most recent paper I had read at the time I set the default.  If I were to 
rank them today using an average over all the comparison papers it would be second or 
third, but the good methods are so close that in practical terms it hardly matters.
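
A quick numeric check (my addition, not part of the original reply), using the
numbers from the post: the std.err printed by summary() is on the survival
scale, so divide it by S(t) to get the standard error of the cumulative hazard
before building the default "log"-type interval.

## Hedged sketch from the posted numbers; summary() prints se(S), while
## fit$std.err holds the se of the cumulative hazard, roughly se(S)/S.
S    <- 0.761                      # survival estimate at t = 10
se.S <- 0.0282                     # std.err as printed by summary()
se.H <- se.S / S                   # ~0.037, std. error of -log(S)
exp(log(S) + qnorm(0.975) * se.H)  # upper, ~0.818
exp(log(S) - qnorm(0.975) * se.H)  # lower, ~0.707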


Terry Therneau

On 03/15/2013 06:00 AM, r-help-requ...@r-project.org wrote:

Hi, I am wondering how the confidence interval for the Kaplan-Meier estimator is 
calculated by survfit(). For example:


  summary(survfit(Surv(time,status)~1,data),times=10)

Call: survfit(formula = Surv(rtime10, rstat10) ~ 1, data = mgi)

 time n.risk n.event survival std.err lower 95% CI upper 95% CI
   10    168      55    0.761  0.0282        0.707        0.818


I am trying to reproduce the upper and lower CI using the standard error. As far 
as I understand, the default method for survfit() to calculate the confidence interval 
is on the log survival scale, so:


upper CI = exp(log(0.761)+qnorm(0.975)*0.0282) = 0.804
lower CI = exp(log(0.761)-qnorm(0.975)*0.0282) = 0.720


they are not the same as the output from survfit().

Am I missing something?

Thanks

John


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] question about nls

2013-03-15 Thread Gabor Grothendieck
On Fri, Mar 15, 2013 at 9:45 AM, Prof J C Nash (U30A) nas...@uottawa.ca wrote:
 Actually, it likely won't matter where you start. The Gauss-Newton direction
 is nearly always close to 90 degrees from the gradient, as seen by turning
 trace=TRUE in the package nlmrt function nlxb(), which does a safeguarded
 Marquardt calculation. This can be used in place of nls(), except you need
 to put your data in a data frame. It finds a solution pretty
 straightforwardly, though with quite a few iterations and function
 evaluations.


Interesting observation but it does converge in 5 iterations with the
improved starting value whereas it fails due to a singular gradient
with the original starting value.

 Lines <- "
+ x y
+ 60 0.8
+ 80 6.5
+ 100 20.5
+ 120 45.9
+ "
 DF <- read.table(text = Lines, header = TRUE)

 # original starting value - singular gradient
 nls(y ~ exp(a + b*x)+d,DF,start=list(a=0,b=0,d=1))
Error in nlsModel(formula, mf, start, wts) :
  singular gradient matrix at initial parameter estimates

 # better starting value - converges in 5 iterations
 lm1 <- lm(log(y) ~ x, DF)
 st <- setNames(c(coef(lm1), 0), c("a", "b", "d"))
 nls(y ~ exp(a + b*x)+d, DF, start=st)
Nonlinear regression model
  model:  y ~ exp(a + b * x) + d
   data:  DF
  a   b   d
-0.1492  0.0342 -6.1966
 residual sum-of-squares: 0.5743

Number of iterations to convergence: 5
Achieved convergence tolerance: 6.458e-07


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Grep with wildcards across multiple columns

2013-03-15 Thread Bush, Daniel P. DPI
I think the way I set up my sample data without any explanation confused things 
slightly. These data might make things clearer:

# Create fake data
df <- data.frame(code   = c(rep(1001, 8), rep(1002, 8)),
                 year   = rep(c(rep(2011, 4), rep(2012, 4)), 2),
                 fund   = rep(c("10E", "27E", "27E", "29E"), 4),
                 func   = rep(c(11, 122000, 214000, 158000), 4),
                 obj    = rep(c(100, 100, 210, 220), 4),
                 amount = round(rnorm(16, 5, 1)))

These are financial data with a hierarchical account structure where a zero 
represents a summary account that rolls up all the accounts at subsequent 
digits (e.g. 10 rolls up 11, 122000, 158000, etc.). I was trying to do 
two things with the search parameters: turn zeroes into question marks, and 
duplicate the functionality of a SQL query using those question marks as 
wildcards:

# Set parameters
par.fund <- "20E"; par.func <- "10"; par.obj <- "000"
par.fund <- glob2rx(gsub("0", "?", par.fund))
par.func <- glob2rx(gsub("0", "?", par.func))
par.obj  <- glob2rx(gsub("0", "?", par.obj))

Fortunately, Bill's suggestion to use the intersect function worked just 
fine--since intersect accepts only two arguments, I had to nest a pair of 
statements:

# Solution: Use a pair of nested intersects
dt2 <- dt[intersect(intersect(grep(par.fund, fund), grep(par.func, func)),
                    grep(par.obj, obj)),
          sum(amount), by=c('code', 'year')]
df2 <- ddply(df[intersect(intersect(grep(par.fund, df$fund),
                                    grep(par.func, df$func)),
                          grep(par.obj, df$obj)), ],
             .(code, year), summarize, amount = sum(amount))

Thanks for your ideas!

DB

Daniel Bush | School Finance Consultant 
School Financial Services | Wis. Dept. of Public Instruction 
daniel.bush -at- dpi.wi.gov | 608-267-9212

-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Thursday, March 14, 2013 5:49 PM
To: Bush, Daniel P. DPI; 'r-help@r-project.org'
Subject: RE: Grep with wildcards across multiple columns

grep(pattern, textVector) returns of the integer indices of the elements of 
textVector that match the pattern.  E.g.,
    grep("T", c("One","Two","Three","Four"))
  [1] 2 3

 The '&' operator naturally operates on logical vectors of the same length (If 
 you give it numbers it silently converts 0 to FALSE and other numbers to TRUE.)

The two don't fit together.  You could use grepl(), which returns a logical 
vector the length of textVector, as in
    grepl(p1,v1) & grepl(p2,v2)
to figure which entries in the table have v1 matching p1 and v2 matching p2.

Or, you could use
  intersect(grep(p1,v1), grep(p2,v2))
if you want to stick with integer indices.
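 
 A hedged sketch (added here, not in the original exchange) of the grepl()
 route applied to the thread's example; it assumes the df data frame and the
 par.fund/par.func/par.obj patterns defined earlier are available:
 
   ## logical indexing with grepl(), then sum amount by code and year
   keep <- grepl(par.fund, df$fund) & grepl(par.func, df$func) &
           grepl(par.obj, df$obj)
   aggregate(amount ~ code + year, data = df[keep, , drop = FALSE], FUN = sum)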

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Bush, Daniel P. DPI
 Sent: Thursday, March 14, 2013 2:43 PM
 To: 'r-help@r-project.org'
 Subject: [R] Grep with wildcards across multiple columns
 
 I have a fairly large data set with six variables set up like the following 
 dummy:
 
 # Create fake data
 df <- data.frame(code   = c(rep(1001, 8), rep(1002, 8)),
                  year   = rep(c(rep(2011, 4), rep(2012, 4)), 2),
                  fund   = rep(c("10E", "10E", "10E", "27E"), 4),
                  func   = rep(c(11, 122000, 214000, 158000), 4),
                  obj    = rep(100, 16),
                  amount = round(rnorm(16, 5, 1)))
 
 What I would like to do is sum the amount variable by code and year, 
 filtering rows using different wildcard searches in each of three 
 columns: 1?E in fund, 1?? in func, and ??? in obj. I'm OK turning 
 these into regular expressions:
 
 # Set parameters
 par.fund <- "10E"; par.func <- "10"; par.obj <- "000"
 par.fund <- glob2rx(gsub("0", "?", par.fund))
 par.func <- glob2rx(gsub("0", "?", par.func))
 par.obj  <- glob2rx(gsub("0", "?", par.obj))
 
 The problem occurs when I try to apply multiple greps across columns. 
 I'd prefer to use data.table since it's so much faster than plyr and I 
 have 159 different sets of parameters to run through, but I get the same 
 error setting it up either way:
 
 # Doesn't work
 library(data.table)
 dt - data.table(df)
 eval(parse(text=paste(
   dt2 - dt[, grep(', par.fund, ', fund)  ,
   grep(', par.func, ', func)  grep(', par.obj, ', obj),
   , sum(amount), by=c('code', 'year')] , sep=))) # Warning 
 message:
 #   In grep(^1.E$, fund)  grep(^1.$, func) :
 #   longer object length is not a multiple of shorter object length
 
 # Also doesn't work
 library(plyr)
 eval(parse(text=paste(
   df2 - ddply(df[grep(', par.fund, ', df$fund)  ,
   grep(', par.func, ', df$func)  grep(', par.obj, ', df$obj), ],
   , .(code, year), summarize, amount = sum(amount)) , sep=))) # 
 Warning message:
 #   In grep(^1.E$, df$fund)  grep(^1.$, df$func) :
 #   longer object length is not a multiple of shorter object length
 
 Clearly, the 

Re: [R] Reassign values based on multiple conditions

2013-03-15 Thread John Kane
I don't see how the data in the three column table you present is enough to 
produce the four column test.  Should the first table actually show repeated 
collar usage so that you can use the next incidence of the collar as the end 
date e.g
1 01/01/2013 
1 02/04/2013 

and so on?

Some actual data might be useful.  
 The easiest way to supply data  is to use the dput() function.  Example with 
your file named testfile: 
dput(testfile) 
Then copy the output and paste into your email.  For large data sets, you can 
just supply a representative sample.  Usually,  dput(head(testfile, 100)) will 
be sufficient.  

Sorry I'm not more helpful


John Kane
Kingston ON Canada


 -Original Message-
 From: cat.e.co...@gmail.com
 Sent: Fri, 15 Mar 2013 12:46:13 +0800
 To: r-help@r-project.org
 Subject: [R] Reassign values based on multiple conditions
 
 Hi all,
 
 I have a simple data frame of three columns - one of numbers (really a
 categorical variable), one of dates and one of data.
 
 Imagine:
 
 collar date data
 1 01/01/2013 x
 2 02/01/2013 y
 3 04/01/2013 z
 4 04/01/2013 a
 5 07/01/2013 b
 
 
 The 'collar' is a GPS collar that's been worn by an animal for a certain
 amount of time, and then may have been worn by a different animal after
 changes when the batteries needed to be changed. When an animal was
 caught
 and the collar battery needed to be changed, a whole new collar had to be
 put on, as these animals (wild boar and red deer!) were not that easy to
 catch. In order to follow the movements of each animal I now need to
 create
 a new column that assigns the 'data' by animal rather than by collar. I
 have a table of dates, e.g
 
 animal collar   start_dateend_date
  1  1  01/01/2013   03/01/2013
  1  5  04/01/2013   06/01/2013
  1  3  07/01/2013   09/01/2013
  2  2  01/01/2013   03/01/2013
  2  1  04/01/2013   06/01/2013
 
 I have so far been able to make multi-conditional tests:
 
  animal1test <- (date >= "01/01/13" & date <= "03/01/13")
  animal1test2 <- (date >= "04/01/13" & date <= "06/01/13")
  animal2test <- (date >= "04/01/13" & date <= "06/01/13")
 
 to use in an 'if else' formula:
 
  if(animal1test){
   collar[1]=animal1
   } else if(animal1test2){
 collar[5]=animal1
   }else if(animal2test)
 collar[1]=animal2
 }else NA
 
 As I'm sure you can see, this is completely inelegant, and also not
 working
 for me! Any ideas on how to a achieve this?
 
 Thanks SO much in advance,
 Cat
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reviewer comment

2013-03-15 Thread S Ellison
 

 My question: what does it mean asymmetry distribution could 
 affect PCA  ? and also outliers could affect factors?

It means what it says. PCA will be affected by asymmetry, and outliers will 
affect the principal components (sometimes loosely called 'factors'). In 
particular, an extreme outlying data point can cause at least one PC to be 
essentially parallel to the vector between the outlier and the mean of the rest 
of the data. If you want a picture of factors describing the bulk of the data 
set, you need to chuck out the extreme points or use robust PCA.

Asymmetry I'd worry less about, at least for exploratory graphical 
presentation; if I had a nice spherical data set I'd probably not be very 
interested in the PCA because it'd not have much discriminatory power for 
groups. But inference based on things like mahalanobis distance often  relies 
on some sense of multivariate normality or the like, and if the model used for 
inference isn't built on a symmetric data set the inferences can be badly 
wrong. Think Turkish flag; the star is 'obviously' not part of the crescent, 
but in mahalanobis distance it's not much further from the (empty) centre of 
the crescent than most of the crescent is. 
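
A small hedged illustration (added, not from the reply) of the outlier point:
a single extreme observation drags the first principal component towards
itself, which is easy to see with simulated data.

## Hedged sketch: one extreme point dominates PC1.
set.seed(1)
X     <- cbind(rnorm(50), rnorm(50))   # roughly spherical cloud
X_out <- rbind(X, c(20, 20))           # add one extreme outlier
prcomp(X)$rotation[, 1]                # PC1 direction for the bulk of the data
prcomp(X_out)$rotation[, 1]            # PC1 now points roughly along (1, 1)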


***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Combinations

2013-03-15 Thread Amir

Hi every one,

I have two sets T1={c1,c2,..,cn} and T2={k1,k2,...,kn}.
How can I find the sets as follow:

(c1,k1), (c1,k2) ...(c1,kn)  (c2,k1) (c2,k2)  (c2,kn) ... (cn,kn)

Thanks.
Amir

--
__
 Amir Darehshoorzadeh |  Computer Engineering Department
 PostDoc Fellow   |  University of Ottawa, PARADISE LAb
 Email: adare...@uottawa.ca   |  800 King Edward Ave
 Tel: -   |  ON K1N 6N5, Ottawa - CANADA
 http://personals.ac.upc.edu/amir

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data manipulation

2013-03-15 Thread IOANNA

Hello John, 


I thought I attached the file. So here we go: 
Class=c(1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,
3,3,3,3,3,3,3)
X=c(0.1,0.1,0.1,0.1,0.2,0.2,0.2,0.1,0.1,
0.1,0.1,0.1,0.1,0.1,0.1,0.2,0.2,0.2,0.2,0.3,0.3,0.3,0.3)
Count=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)

by1-factor(Class)
by2-factor(X)
W-aggregate(x=Count,by=list(by1,by2),FUN=sum)

 

However, what I want is a table that also includes lines for the Group.1 and
Group.2 values for which there are no records. In other words, something like
this:

 

Thanks again. I hope its clearer now. 
Ioanna


-Original Message-
From: John Kane [mailto:jrkrid...@inbox.com] 
Sent: 15 March 2013 12:51
To: IOANNA; r-help@r-project.org
Subject: RE: [R] Data manipulation

What zero values?  And are they actually zeros or are they NA's, that is,
missing values?

The code looks okay but without some sample data it is difficult to know
exactly what you are doing. 

The easiest way to supply data  is to use the dput() function.  Example with
your file named testfile: 
dput(testfile)
Then copy the output and paste into your email.  For large data sets, you
can just supply a representative sample.  Usually, 
dput(head(testfile, 100)) will be sufficient.   

 
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducibl
e-example

Please supply some sample data. 
 

John Kane
Kingston ON Canada


 -Original Message-
 From: ii54...@msn.com
 Sent: Fri, 15 Mar 2013 12:40:54 +
 To: r-help@r-project.org
 Subject: [R] Data manipulation
 
 Hello all,
 
 
 
 I would appreciate your thoughts on a seemingly simple problem. I have 
 a database, where each row represent a single record. I want to 
 aggregate this database so I use the aggregate command :
 
 
 
 D-read.csv(C:\\Users\\test.csv)
 
 
 
 attach(D)
 
 
 
 by1-factor(Class)
 
 by2-factor(X)
 
 W-aggregate(x=Count,by=list(by1,by2),FUN=sum)
 
 
 
 The results I get following the form:
 
 
 
 W
 
   Group.1 Group.2 x
 
 1   1 0.1 4
 
 2   2 0.1 7
 
 3   3 0.1 1
 
 4   1 0.2 3
 
 5   3 0.2 4
 
 6   3 0.3 4
 
 
 
 
 
 However, what I really want is an aggregation which includes the zero 
 values, i.e.:
 
 
 
 W
 
   Group.1 Group.2 x
 
 1   1 0.1 4
 
 2   2 0.1 7
 
 3   3 0.1 1
 
 4   1 0.2 3
 
 2 0.2 0
 
 5   3 0.2 4
 
 10.3 0
 
 20.3 0
 
 6   3 0.3 4
 
 
 
 
 
 How can I achieve what I want?
 
 
 
 Best regards,
 
 Ioanna
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data manipulation

2013-03-15 Thread David L Carlson
Wouldn't this do the same thing?

xtabs(Count~Class+X, D)

--
David L Carlson
Associate Professor of Anthropology
Texas AM University
College Station, TX 77843-4352


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of IOANNA
 Sent: Friday, March 15, 2013 8:51 AM
 To: 'John Kane'; 'Blaser Nello'; r-help@r-project.org
 Subject: Re: [R] Data manipulation
 
 Thanks a lot!
 
 -Original Message-
 From: John Kane [mailto:jrkrid...@inbox.com]
 Sent: 15 March 2013 13:41
 To: Blaser Nello; IOANNA; r-help@r-project.org
 Subject: Re: [R] Data manipulation
 
 Nice. That does look like it. IOANNA?
 
 John Kane
 Kingston ON Canada
 
 
  -Original Message-
  From: nbla...@ispm.unibe.ch
  Sent: Fri, 15 Mar 2013 14:27:03 +0100
  To: ii54...@msn.com, r-help@r-project.org
  Subject: Re: [R] Data manipulation
 
  Is this what you want to do?
 
  D2 - expand.grid(Class=unique(D$Class), X=unique(D$X))
  D2 - merge(D2, D, all=TRUE)
  D2$Count[is.na(D2$Count)] - 0
 
  W - aggregate(D2$Count, list(D2$Class, D2$X), sum) W
 
  Best,
  Nello
 
 
  -Original Message-
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org]
  On Behalf Of IOANNA
  Sent: Friday, 15 March 2013 13:41
  To: r-help@r-project.org
  Subject: [R] Data manipulation
 
  Hello all,
 
 
 
  I would appreciate your thoughts on a seemingly simple problem. I
 have
  a database, where each row represent a single record. I want to
  aggregate this database so I use the aggregate command :
 
 
 
  D-read.csv(C:\\Users\\test.csv)
 
 
 
  attach(D)
 
 
 
  by1-factor(Class)
 
  by2-factor(X)
 
  W-aggregate(x=Count,by=list(by1,by2),FUN=sum)
 
 
 
  The results I get following the form:
 
 
 
  W
 
Group.1 Group.2 x
 
  1   1 0.1 4
 
  2   2 0.1 7
 
  3   3 0.1 1
 
  4   1 0.2 3
 
  5   3 0.2 4
 
  6   3 0.3 4
 
 
 
 
 
  However, what I really want is an aggregation which includes the zero
  values, i.e.:
 
 
 
  W
 
Group.1 Group.2 x
 
  1   1 0.1 4
 
  2   2 0.1 7
 
  3   3 0.1 1
 
  4   1 0.2 3
 
  2 0.2 0
 
  5   3 0.2 4
 
  10.3 0
 
  20.3 0
 
  6   3 0.3 4
 
 
 
 
 
  How can I achieve what I want?
 
 
 
  Best regards,
 
  Ioanna
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Grep with wildcards across multiple columns

2013-03-15 Thread arun


Hi,

You could try this for multiple intersect:

 dt[Reduce(function(...) intersect(...), 
list(grep(par.fund,fund),grep(par.func,func),grep(par.obj,obj))),sum(amount),by=c('code','year')]
#   code year V1
#1: 1001 2011 123528
#2: 1001 2012  97362
#3: 1002 2011 103811
#4: 1002 2012  97179
 dt[intersect(intersect(grep(par.fund, fund), grep(par.func, func)),
 grep(par.obj, obj)),
   sum(amount), by=c('code', 'year')]
 #  code year V1
#1: 1001 2011 123528
#2: 1001 2012  97362
#3: 1002 2011 103811
#4: 1002 2012  97179
A.K.



- Original Message -
From: Bush,  Daniel P.   DPI daniel.b...@dpi.wi.gov
To: 'r-help@r-project.org' r-help@r-project.org
Cc: 'William Dunlap' wdun...@tibco.com; 'smartpink...@yahoo.com' 
smartpink...@yahoo.com; 'djmu...@gmail.com' djmu...@gmail.com
Sent: Friday, March 15, 2013 10:06 AM
Subject: RE: Grep with wildcards across multiple columns

I think the way I set up my sample data without any explanation confused things 
slightly. These data might make things clearer:

# Create fake data
df - data.frame(code   = c(rep(1001, 8), rep(1002, 8)),
                 year   = rep(c(rep(2011, 4), rep(2012, 4)), 2),
                 fund   = rep(c(10E, 27E, 27E, 29E), 4),
                 func   = rep(c(11, 122000, 214000, 158000), 4),
                 obj    = rep(c(100, 100, 210, 220), 4),
                 amount = round(rnorm(16, 5, 1)))

These are financial data with a hierarchical account structure where a zero 
represents a summary account that rolls up all the accounts at subsequent 
digits (e.g. 10 rolls up 11, 122000, 158000, etc.). I was trying to do 
two things with the search parameters: turn zeroes into question marks, and 
duplicate the functionality of a SQL query using those question marks as 
wildcards:

# Set parameters
par.fund - 20E; par.func - 10; par.obj - 000
par.fund - glob2rx(gsub(0, ?, par.fund))
par.func - glob2rx(gsub(0, ?, par.func))
par.obj - glob2rx(gsub(0, ?, par.obj))

Fortunately, Bill's suggestion to use the intersect function worked just 
fine--since intersect accepts only two arguments, I had to nest a pair of 
statements:

# Solution: Use a pair of nested intersects
dt2 - dt[intersect(intersect(grep(par.fund, fund), grep(par.func, func)),
                    grep(par.obj, obj)),
          sum(amount), by=c('code', 'year')]
df2 - ddply(df[intersect(intersect(grep(par.fund, df$fund),
                                    grep(par.func, df$func)),
                          grep(par.obj, df$obj)), ],
             .(code, year), summarize, amount = sum(amount))

Thanks for your ideas!

DB

Daniel Bush | School Finance Consultant 
School Financial Services | Wis. Dept. of Public Instruction 
daniel.bush -at- dpi.wi.gov | 608-267-9212

-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Thursday, March 14, 2013 5:49 PM
To: Bush, Daniel P. DPI; 'r-help@r-project.org'
Subject: RE: Grep with wildcards across multiple columns

grep(pattern, textVector) returns of the integer indices of the elements of 
textVector that match the pattern.  E.g.,
   grep(T, c(One,Two,Three,Four))
  [1] 2 3

The '' operator naturally operates on logical vectors of the same length (If 
you give it numbers it silently converts 0 to FALSE and  other numbers to TRUE.)

The two don't fit together.  You could use grepl(), which returns a logical 
vector the length of textVector, as in
   grepl(p1,v1)  grepl(p2,v2)
to figure which entries in the table have v1 matching p1 and v2 matching p2.

Or, you could use
  intersect(grep(p1,v1), grep(p2,v2))
if you want to stick with integer indices.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Bush, Daniel P. DPI
 Sent: Thursday, March 14, 2013 2:43 PM
 To: 'r-help@r-project.org'
 Subject: [R] Grep with wildcards across multiple columns
 
 I have a fairly large data set with six variables set up like the following 
 dummy:
 
 # Create fake data
 df - data.frame(code   = c(rep(1001, 8), rep(1002, 8)),
                  year   = rep(c(rep(2011, 4), rep(2012, 4)), 2),
                  fund   = rep(c(10E, 10E, 10E, 27E), 4),
                  func   = rep(c(11, 122000, 214000, 158000), 4),
                  obj    = rep(100, 16),
                  amount = round(rnorm(16, 5, 1)))
 
 What I would like to do is sum the amount variable by code and year, 
 filtering rows using different wildcard searches in each of three 
 columns: 1?E in fund, 1?? in func, and ??? in obj. I'm OK turning 
 these into regular expressions:
 
 # Set parameters
 par.fund - 10E; par.func - 10; par.obj - 000
 par.fund - glob2rx(gsub(0, ?, par.fund)) par.func - 
 glob2rx(gsub(0, ?, par.func)) par.obj - glob2rx(gsub(0, ?, 
 par.obj))
 
 The problem occurs when I try to apply multiple greps across columns. 
 I'd prefer to use data.table since it's 

Re: [R] question about nls

2013-03-15 Thread Prof J C Nash (U30A)
As Gabor indicates, using a start based on a good approximation is 
usually helpful, and nls() will generally find solutions to problems 
where there are such starts, hence the SelfStart methods. The Marquardt 
approaches are more of a pit-bull approach to the original 
specification. They grind away at the problem without much finesse, but 
generally get there eventually. If one is solving lots of problems of a 
similar type, good starts are the way to go. One-off (or being lazy), I 
like Marquardt.


It would be interesting to know what proportion of random starting 
points in some reasonable bounding box get the singular gradient 
message or other early termination with nls() vs. a Marquardt approach, 
especially as this is a tiny problem. This is just one example of the 
issue R developers face in balancing performance and robustness. The GN 
method in nls() is almost always a good deal more efficient than 
Marquardt approaches when it works, but suffers from a fairly high 
failure rate.
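
A rough hedged sketch (mine, not JN's) of that experiment, assuming the small
DF data frame from earlier in the thread; the bounding box for the random
starts is arbitrary and it only counts how often nls() stops early:

## proportion of random starts at which nls() stops with an error
set.seed(123)
failed <- replicate(200, {
  st <- list(a = runif(1, -5, 5), b = runif(1, -0.1, 0.1), d = runif(1, -10, 10))
  inherits(try(nls(y ~ exp(a + b * x) + d, data = DF, start = st),
               silent = TRUE), "try-error")
})
mean(failed)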



JN


On 13-03-15 10:01 AM, Gabor Grothendieck wrote:

On Fri, Mar 15, 2013 at 9:45 AM, Prof J C Nash (U30A) nas...@uottawa.ca wrote:

Actually, it likely won't matter where you start. The Gauss-Newton direction
is nearly always close to 90 degrees from the gradient, as seen by turning
trace=TRUE in the package nlmrt function nlxb(), which does a safeguarded
Marquardt calculation. This can be used in place of nls(), except you need
to put your data in a data frame. It finds a solution pretty
straightforwardly, though with quite a few iterations and function
evaluations.



Interesting observation but it does converge in 5 iterations with the
improved starting value whereas it fails due to a singular gradient
with the original starting value.


Lines - 

+ xy
+ 60 0.8
+ 80 6.5
+ 100 20.5
+ 120 45.9
+ 

DF - read.table(text = Lines, header = TRUE)

# original starting value - singular gradient
nls(y ~ exp(a + b*x)+d,DF,start=list(a=0,b=0,d=1))

Error in nlsModel(formula, mf, start, wts) :
   singular gradient matrix at initial parameter estimates


# better starting value - converges in 5 iterations
lm1 - lm(log(y) ~ x, DF)
st - setNames(c(coef(lm1), 0), c(a, b, d))
nls(y ~ exp(a + b*x)+d, DF, start=st)

Nonlinear regression model
   model:  y ~ exp(a + b * x) + d
data:  DF
   a   b   d
-0.1492  0.0342 -6.1966
  residual sum-of-squares: 0.5743

Number of iterations to convergence: 5
Achieved convergence tolerance: 6.458e-07




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combinations

2013-03-15 Thread arun
HI,
Try this:
T1 <- paste0("c", 1:5)
T2 <- paste0("k", 1:5)

as.vector(outer(T1, T2, paste, sep=","))
# [1] c1,k1 c2,k1 c3,k1 c4,k1 c5,k1 c1,k2 c2,k2 c3,k2 c4,k2
#[10] c5,k2 c1,k3 c2,k3 c3,k3 c4,k3 c5,k3 c1,k4 c2,k4 c3,k4
#[19] c4,k4 c5,k4 c1,k5 c2,k5 c3,k5 c4,k5 c5,k5
#or
 paste("(", as.vector(outer(T1, T2, paste, sep=",")), ")", sep="")
 #[1] (c1,k1) (c2,k1) (c3,k1) (c4,k1) (c5,k1) (c1,k2) (c2,k2)
 #[8] (c3,k2) (c4,k2) (c5,k2) (c1,k3) (c2,k3) (c3,k3) (c4,k3)
#[15] (c5,k3) (c1,k4) (c2,k4) (c3,k4) (c4,k4) (c5,k4) (c1,k5)
#[22] (c2,k5) (c3,k5) (c4,k5) (c5,k5)
A.K.





- Original Message -
From: Amir adare...@uottawa.ca
To: r-help@r-project.org
Cc: 
Sent: Friday, March 15, 2013 9:22 AM
Subject: [R] Combinations

Hi every one,

I have two sets T1={c1,c2,..,cn} and T2={k1,k2,...,kn}.
How can I find the sets as follow:

(c1,k1), (c1,k2) ...(c1,kn)  (c2,k1) (c2,k2)  (c2,kn) ... (cn,kn)

Thanks.
Amir

-- __
Amir Darehshoorzadeh             |  Computer Engineering Department
PostDoc Fellow                   |  University of Ottawa, PARADISE LAb
Email: adare...@uottawa.ca       |  800 King Edward Ave
Tel: -                           |  ON K1N 6N5, Ottawa - CANADA
http://personals.ac.upc.edu/amir

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combinations

2013-03-15 Thread Neal H. Walfield
At Fri, 15 Mar 2013 09:22:15 -0400,
Amir wrote:
 I have two sets T1={c1,c2,..,cn} and T2={k1,k2,...,kn}.
 How can I find the sets as follow:
 
 (c1,k1), (c1,k2) ...(c1,kn)  (c2,k1) (c2,k2)  (c2,kn) ... (cn,kn)

I think you are looking for expand.grid:

expand.grid(1:3, 10:13)
   Var1 Var2
1 1   10
2 2   10
3 3   10
4 1   11
...

Neal

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combinations

2013-03-15 Thread Rainer Schuermann
Is 

?expand.grid

what you are looking for?

Rgds,
Rainer



On Friday 15 March 2013 09:22:15 Amir wrote:
 Hi every one,
 
 I have two sets T1={c1,c2,..,cn} and T2={k1,k2,...,kn}.
 How can I find the sets as follow:
 
 (c1,k1), (c1,k2) ...(c1,kn)  (c2,k1) (c2,k2)  (c2,kn) ... (cn,kn)
 
 Thanks.
 Amir
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data manipulation

2013-03-15 Thread David L Carlson
I was too quick on the Send button. Xtabs produces a table. If you want a 
data.frame, it would be data.frame(xtabs(Count~Class+X, D)):

# Match John's summary table and generate Counts
 set.seed(42)
 Count <- sample(1:50, 23)
 Class <- c(rep(1, 4), rep(2, 7), 3, rep(1, 3), rep(3, 4), rep(3, 4))
 X <- c(rep(.1, 12), rep(.2, 7), rep(.3, 4))
 D <- data.frame(Class=factor(Class), X=factor(X), Count)
 table(D$Class, D$X)
   
0.1 0.2 0.3
  1   4   3   0
  2   7   0   0
  3   1   4   4

# Create the table/data.frame
 D.table <- xtabs(Count~Class+X)
 D.table
 X
Class 0.1 0.2 0.3
1 150  63   0
2 169   0   0
3  41  98 114
 D.df <- data.frame(D.table)
 D.df
  Class   X Freq
1 1 0.1  150
2 2 0.1  169
3 3 0.1   41
4 1 0.2   63
5 2 0.20
6 3 0.2   98
7 1 0.30
8 2 0.30
9 3 0.3  114

--
David L Carlson
Associate Professor of Anthropology
Texas AM University
College Station, TX 77843-4352


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of David L Carlson
 Sent: Friday, March 15, 2013 9:23 AM
 To: 'IOANNA'; 'John Kane'; 'Blaser Nello'; r-help@r-project.org
 Subject: Re: [R] Data manipulation
 
 Wouldn't this do the same thing?
 
 xtabs(Count~Class+X, D)
 
 --
 David L Carlson
 Associate Professor of Anthropology
 Texas AM University
 College Station, TX 77843-4352
 
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
  project.org] On Behalf Of IOANNA
  Sent: Friday, March 15, 2013 8:51 AM
  To: 'John Kane'; 'Blaser Nello'; r-help@r-project.org
  Subject: Re: [R] Data manipulation
 
  Thanks a lot!
 
  -Original Message-
  From: John Kane [mailto:jrkrid...@inbox.com]
  Sent: 15 March 2013 13:41
  To: Blaser Nello; IOANNA; r-help@r-project.org
  Subject: Re: [R] Data manipulation
 
  Nice. That does look like it. IOANNA?
 
  John Kane
  Kingston ON Canada
 
 
   -Original Message-
   From: nbla...@ispm.unibe.ch
   Sent: Fri, 15 Mar 2013 14:27:03 +0100
   To: ii54...@msn.com, r-help@r-project.org
   Subject: Re: [R] Data manipulation
  
   Is this what you want to do?
  
   D2 - expand.grid(Class=unique(D$Class), X=unique(D$X))
   D2 - merge(D2, D, all=TRUE)
   D2$Count[is.na(D2$Count)] - 0
  
   W - aggregate(D2$Count, list(D2$Class, D2$X), sum) W
  
   Best,
   Nello
  
  
   -Original Message-
   From: r-help-boun...@r-project.org
   [mailto:r-help-boun...@r-project.org]
   On Behalf Of IOANNA
   Sent: Friday, 15 March 2013 13:41
   To: r-help@r-project.org
   Subject: [R] Data manipulation
  
   Hello all,
  
  
  
   I would appreciate your thoughts on a seemingly simple problem. I
  have
   a database, where each row represent a single record. I want to
   aggregate this database so I use the aggregate command :
  
  
  
   D-read.csv(C:\\Users\\test.csv)
  
  
  
   attach(D)
  
  
  
   by1-factor(Class)
  
   by2-factor(X)
  
   W-aggregate(x=Count,by=list(by1,by2),FUN=sum)
  
  
  
   The results I get following the form:
  
  
  
   W
  
 Group.1 Group.2 x
  
   1   1 0.1 4
  
   2   2 0.1 7
  
   3   3 0.1 1
  
   4   1 0.2 3
  
   5   3 0.2 4
  
   6   3 0.3 4
  
  
  
  
  
   However, what I really want is an aggregation which includes the
 zero
   values, i.e.:
  
  
  
   W
  
 Group.1 Group.2 x
  
   1   1 0.1 4
  
   2   2 0.1 7
  
   3   3 0.1 1
  
   4   1 0.2 3
  
   2 0.2 0
  
   5   3 0.2 4
  
   10.3 0
  
   20.3 0
  
   6   3 0.3 4
  
  
  
  
  
   How can I achieve what I want?
  
  
  
   Best regards,
  
   Ioanna
  
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
   http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.
 
  
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
  guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and 

Re: [R] metafor - multivariate analysis

2013-03-15 Thread Viechtbauer Wolfgang (STAT)
Dear Owen,

What is your definition of multivariate analysis? Do you mean: A 
meta-regression model with more than one predictor/moderator? In that case, 
yes, metafor handles that. Usually, this is referred to as multiple 
regression (as opposed to simple regression with a single predictor) -- and 
in the case of a meta-analysis, I guess one could call it multiple 
meta-regression. 

If you are referring to a model that handles statistical dependencies in the 
observed outcomes (and hence requires multivariate methods), then you will have 
to use some other package (e.g., mvmeta).
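
For what it is worth, a hedged sketch (mine, not Wolfgang's) of the same
multiple meta-regression written with the formula interface for 'mods'; the
variable names are taken from the original post and assumed to exist in 'mal':

library(metafor)
res <- rma(yi, vi, mods = ~ approxmeanage + interviewmethodcode,
           data = mal, method = "DL")
summary(res)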

See also:

http://stats.stackexchange.com/questions/2358/explain-the-difference-between-multiple-regression-and-multivariate-regression

Best,
Wolfgang

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Owen, Branwen
 Sent: Friday, March 15, 2013 11:23
 To: r-help@r-project.org
 Subject: [R] metafor - multivariate analysis
 
 
 
 
 Dear Metafor users, I'm conducting a meta-analysis of the prevalence of a
 particular behaviour based on someone else's code. I've been labouring
 under the impression that this:
 
 summary(rma.1 <- rma(yi, vi, mods=cbind(approxmeanage, interviewmethodcode),
                      data=mal, method="DL", knha=F, weighted=F, intercept=T))
 
 is doing the multivariate analysis that i want, but have read that
 multivariate analysis can't be done in metafor.
 
 this is the output:
 
 Mixed-Effects Model (k = 22; tau^2 estimator: DL)
 
 logLik Deviance AIC BIC
 18.7726 -37.5452 -27.5452 -22.0899
 
 tau^2 (estimate of residual amount of heterogeneity): 0.0106
 tau (sqrt of the estimate of residual heterogeneity): 0.1031
 
 Test for Residual Heterogeneity:
 QE(df = 18) = 1273.9411, p-val < .0001
 
 Test of Moderators (coefficient(s) 2,3,4):
 QM(df = 3) = 11.0096, p-val = 0.0117
 
 Model Results:
 
 estimate se zval pval ci.lb ci.ub
 intrcpt 0.4014 0.1705 2.3537 0.0186 0.0671 0.7356 *
 continent -0.0206 0.0184 -1.1200 0.2627 -0.0568 0.0155
 approxmeanage 0.0076 0.0091 0.8354 0.4035 -0.0102 0.0254
 interviewmethodcode -0.0892 0.0273 -3.2702 0.0011 -0.1426 -0.0357 **
 
 ---
 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
 My questions are:
 1. What exactly is this line of code fitting?
 2. If it isn't a multivariate analysis, will I have to use the mvmeta package
 instead?
 
 thanks very much for any help
 Branwen
 .http://r.789695.n4.nabble.com/metafor-multivariate-analysis-
 td4661233.html
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Normalized 2-D cross-correlation

2013-03-15 Thread Felix Nensa
Hi all,

I need to do (normalized) 2-D cross-correlation in R. There is a convenient
function available in Matlab (see:
http://www.mathworks.de/de/help/images/ref/normxcorr2.html).
Is there anything comparable in R available?

Thanks,

Felix

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reassign values based on multiple conditions

2013-03-15 Thread David L Carlson
This might get you started, but more data is needed to test this

# First create the data 
Collars <- data.frame(collar=1:5, date=as.POSIXlt(c("01/01/2013", "02/01/2013",
    "04/01/2013", "04/01/2013", "07/01/2013"), format="%m/%d/%Y"),
    data=letters[c(24:26, 1:2)])
Animals <- data.frame(animal=c(1, 1, 1, 2, 2), collar=c(1, 5, 3, 2, 1),
    start_date=as.POSIXlt(c("01/01/2013", "04/01/2013",
    "07/01/2013", "01/01/2013", "04/01/2013"), format="%m/%d/%Y"),
    end_date=as.POSIXlt(c("03/01/2013", "06/01/2013",
    "09/01/2013", "03/01/2013", "06/01/2013"), format="%m/%d/%Y"))

Now you want to look at each row in Collars and find the number of the animal 
that it represents in Animals:

 AC <- rep(NA, nrow(Collars))
 for (i in 1:nrow(Collars)) {
+    AC[i] <- Animals$animal[Animals$collar == Collars$collar[i] &
+             Animals$start_date <= Collars$date[i] &
+             Animals$end_date >= Collars$date[i]]
+ }
Error in AC[i] <- Animals$animal[Animals$collar == Collars$collar[i] & : 
  replacement has length zero
 
 AC
[1]  1  2 NA NA NA

AC is the vector of animal numbers that has the same length as the Collars 
data.frame. In this case only the first two rows in Collars match anything in 
Animals so the rest are NA and R prints an error message. 
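
A hedged variant (my addition, not part of the reply) of the same lookup
without the explicit loop; it returns NA instead of stopping when a
collar/date combination has no match, and assumes the Collars and Animals
frames defined above:

AC2 <- sapply(seq_len(nrow(Collars)), function(i) {
  hit <- Animals$animal[Animals$collar == Collars$collar[i] &
                        Animals$start_date <= Collars$date[i] &
                        Animals$end_date   >= Collars$date[i]]
  if (length(hit) == 1) hit else NA    # NA when no animal wore this collar then
})
AC2
# [1]  1  2 NA NA NA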

--
David L Carlson
Associate Professor of Anthropology
Texas AM University
College Station, TX 77843-4352


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of John Kane
 Sent: Friday, March 15, 2013 9:07 AM
 To: Cat Cowie; r-help@r-project.org
 Subject: Re: [R] Reassign values based on multiple conditions
 
 I don't see how the data in the three column table you present is
 enough to produce the four column test.  Should the first table
 actually show repeated collar usage so that you can use the next
 incidence of the collar as the end date e.g
 1 01/01/2013
 1 02/04/2013
 
 and so on?
 
 Some actual data might be useful.
  The easiest way to supply data  is to use the dput() function.
 Example with your file named testfile:
 dput(testfile)
 Then copy the output and paste into your email.  For large data sets,
 you can just supply a representative sample.  Usually,
 dput(head(testfile, 100)) will be sufficient.
 
 Sorry I'm not more helpful
 
 
 John Kane
 Kingston ON Canada
 
 
  -Original Message-
  From: cat.e.co...@gmail.com
  Sent: Fri, 15 Mar 2013 12:46:13 +0800
  To: r-help@r-project.org
  Subject: [R] Reassign values based on multiple conditions
 
  Hi all,
 
  I have a simple data frame of three columns - one of numbers (really
 a
  categorical variable), one of dates and one of data.
 
  Imagine:
 
  collar date data
  1 01/01/2013 x
  2 02/01/2013 y
  3 04/01/2013 z
  4 04/01/2013 a
  5 07/01/2013 b
 
 
  The 'collar' is a GPS collar that's been worn by an animal for a
 certain
  amount of time, and then may have been worn by a different animal
 after
  changes when the batteries needed to be changed. When an animal was
  caught
  and the collar battery needed to be changed, a whole new collar had
 to be
  put on, as these animals (wild boar and red deer!) were not that easy
 to
  catch. In order to follow the movements of each animal I now need to
  create
  a new column that assigns the 'data' by animal rather than by collar.
 I
  have a table of dates, e.g
 
  animal collar   start_dateend_date
   1  1  01/01/2013   03/01/2013
   1  5  04/01/2013   06/01/2013
   1  3  07/01/2013   09/01/2013
   2  2  01/01/2013   03/01/2013
   2  1  04/01/2013   06/01/2013
 
  I have so far been able to make multi-conditional tests:
 
  animal1test <- (date >= "01/01/13" & date <= "03/01/13")
  animal1test2 <- (date >= "04/01/13" & date <= "06/01/13")
  animal2test <- (date >= "04/01/13" & date <= "06/01/13")
 
  to use in an 'if else' formula:
 
   if(animal1test){
collar[1]=animal1
} else if(animal1test2){
  collar[5]=animal1
}else if(animal2test)
  collar[1]=animal2
  }else NA
 
  As I'm sure you can see, this is completely inelegant, and also not
  working
  for me! Any ideas on how to a achieve this?
 
  Thanks SO much in advance,
  Cat
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, 

[R] missing values in an array

2013-03-15 Thread Ray Cheung
Dear All,

I've an array with some missing values (NA) in between. I want to remove
that particular matrix if a missing value is detected. How can I do so?
Thank you very much.

Best regards,
Ray

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] missing values in an array

2013-03-15 Thread arun
HI,
Try this:
set.seed(25)
arr1- array(sample(c(1:40,NA),60,replace=TRUE),dim=c(5,4,3))
arr1[,,sapply(seq(dim(arr1)[3]),function(i) all(!is.na(arr1[,,i])))]
# [,1] [,2] [,3] [,4]
#[1,]    2   13   34   17
#[2,]   19    3   15   39
#[3,]    4   25   10   16
#[4,]    7   22    5    7
#[5,]   12   10   35    6

#2nd case
set.seed(46)
arr2- array(sample(c(1:40,NA),60,replace=TRUE),dim=c(5,4,3))
arr2[,,sapply(seq(dim(arr2)[3]),function(i) all(!is.na(arr2[,,i])))]
#, , 1

# [,1] [,2] [,3] [,4]
#[1,]    8   27   11   28
#[2,]   10   37    5   40
#[3,]   24   25   28    6
#[4,]   15   37    3   25
#[5,]   10   39   32   23

#, , 2
#
 [,1] [,2] [,3] [,4]
#[1,]   14    2    8   27
#[2,]   10   39   37    4
#[3,]    9   36   15    6
#[4,]   33   16   20   32
#[5,]   21    6   28   15


A.K.



- Original Message -
From: Ray Cheung ray1...@gmail.com
To: R help r-help@r-project.org
Cc: 
Sent: Friday, March 15, 2013 12:08 PM
Subject: [R] missing values in an array

Dear All,

I've an array with some missing values (NA) in between. I want to remove
that particular matrix if a missing value is detected. How can I do so?
Thank you very much.

Best regards,
Ray

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to make the labels of pie chart are not overlapping?

2013-03-15 Thread Bert Gunter
Simple -- don't make a pie chart.

-- Bert

(Seriously -- this is an awful display. Consider, instead, a bar plot
plotting cumulative sums of percentages with products/bars ordered from
largest percentage to smallest; or plotting just the percentages in that
order, depending on which is more informative.)
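
For instance, a minimal sketch of the second option, plotting the ordered
percentages themselves (the data frame dat and its column names here are
hypothetical, just mirroring the fragment in the question):

ord <- order(dat$Predicted_MS_Percentage, decreasing = TRUE)
barplot(dat$Predicted_MS_Percentage[ord],
        names.arg = dat$Product[ord], las = 2,
        ylab = "Predicted market share (%)")
## cumulative variant of the same idea:
## barplot(cumsum(dat$Predicted_MS_Percentage[ord]),
##         names.arg = dat$Product[ord], las = 2)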

On Fri, Mar 15, 2013 at 6:58 AM, Tammy Ma metal_lical...@live.com wrote:

 I have the following dataframe:

  Product   predicted_MarketShare   Predicted_MS_Percentage
  A         2.827450e-02            2.8
  B         4.716403e-06            0.0
  C         1.741686e-01            17.4
  D         1.716303e-04            0.0
 ...

  Because there are so many products, and most of the predicted market shares
  are around 0%, the labels of the products with 0% market share overlap when
  I make the pie chart. How do I keep the labels from overlapping?

 Kind regards.
 Tammy

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] missing values in an array

2013-03-15 Thread Berend Hasselman

On 15-03-2013, at 17:08, Ray Cheung ray1...@gmail.com wrote:

 Dear All,
 
 I've an array with some missing values (NA) in between. I want to remove
 that particular matrix if a missing value is detected. How can I do so?
 Thank you very much.


It is not clear what the dimension of your array is.

If your array/matrix is two-dimensional, then

any(is.na(A))  # A is the name of the array/matrix

will return TRUE if at least one element of A is NA, and then you can delete A.

If your array has three dimensions, then you'll have to look at arun's solution.
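
For that three-dimensional case, a minimal sketch (not from the original
thread) that keeps only the slices of an array A containing no NA:

ok <- apply(A, 3, function(m) !any(is.na(m)))   # TRUE for complete slices
A[, , ok, drop = FALSE]                         # drop the slices with NAs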

Berend
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to make the labels of pie chart are not overlapping?

2013-03-15 Thread Ista Zahn
On Fri, Mar 15, 2013 at 12:30 PM, Bert Gunter gunter.ber...@gene.com wrote:
 Simple -- don't make a pie chart.

This is great advice. But it you (or your boss) insists on pie charts,
then you should provide us with a reproducible example that
illustrates your problem.

dat <- read.table(text="Product predicted_MarketShare Predicted_MS_Percentage
A 2.827450e-02 2.8
B 4.716403e-06 0.0
C 1.741686e-01 17.4
D 1.716303e-04 0.0",
  header=TRUE)

pie(dat[[2]], labels=dat[[1]])

Does not give overlapping labels, so I don't yet have an example of
the problem you are trying to solve.

Best,
Ista


 -- Bert

 (Seriously -- this is an awful display. Consider, instead, a bar plot
 plotting cumulative sums of percentages with products/bars ordered from
 largest percentage to smallest; or plotting just the percentages in that
 order, depending on which is more informative.)

 On Fri, Mar 15, 2013 at 6:58 AM, Tammy Ma metal_lical...@live.com wrote:

 I have the following dataframe:

 Product   predicted_MarketShare   Predicted_MS_Percentage
 A         2.827450e-02            2.8
 B         4.716403e-06            0.0
 C         1.741686e-01            17.4
 D         1.716303e-04            0.0
 ...

 Because there are so many products, and most of the predicted market shares
 are around 0%, the labels of the products with 0% market share overlap when
 I make the pie chart. How do I keep the labels from overlapping?

 Kind regards.
 Tammy

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --

 Bert Gunter
 Genentech Nonclinical Biostatistics

 Internal Contact Info:
 Phone: 467-7374
 Website:
 http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help finding first value in a BY group

2013-03-15 Thread Barry King
I have a large Excel file with SKU numbers (stock keeping units) and
forecasts which can be mimicked with the following:

Period <- c(1, 2, 3, 1, 2, 3, 4, 1, 2)
SKU <- c("A1","A1","A1","X4","X4","X4","X4","K2","K2")
Forecast <- c(99, 103, 128, 63, 69, 72, 75, 207, 201)
PeriodSKUForecast <- data.frame(Period, SKU, Forecast)
PeriodSKUForecast

  Period SKU Forecast
1  1  A1   99
2  2  A1  103
3  3  A1  128
4  1  X4   63
5  2  X4   69
6  3  X4   72
7  4  X4   75
8  1  K2  207
9  2  K2  201

I need to create a matrix with only the first forecast for each SKU:

A1 99
X4 63
K2 207

The Period for the first forecast will always be the minimum value
for an SKU.

Can anyone suggest how I might accomplish this?

Thank you,



-- 
__
*Barry E. King, Ph.D.*
Director of Retail Operations
Qualex Consulting Services, Inc.
barry.k...@qlx.com
O: (317)940-5464
M: (317)507-0661
__

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help finding first value in a BY group

2013-03-15 Thread arun
Hi,
Try:
data.frame(Forecast=with(PeriodSKUForecast,tapply(Forecast,SKU,head,1)))
#   Forecast
#A1   99
#K2  207
#X4   63

#or
 aggregate(Forecast~SKU,data=PeriodSKUForecast,head,1)
#  SKU Forecast
#1  A1   99
#2  K2  207
#3  X4   63

#or
library(plyr)
ddply(PeriodSKUForecast,.(SKU),summarise, Forecast=head(Forecast,1))
#  SKU Forecast
#1  A1   99
#2  K2  207
#3  X4   63
A.K.




- Original Message -
From: Barry King barry.k...@qlx.com
To: r-help@r-project.org
Cc: 
Sent: Friday, March 15, 2013 1:30 PM
Subject: [R] Help finding first value in a BY group

I have a large Excel file with SKU numbers (stock keeping units) and
forecasts which can be mimicked with the following:

Period - c(1, 2, 3, 1, 2, 3, 4, 1, 2)
SKU - c(A1,A1,A1,X4,X4,X4,X4,K2,K2)
Forecast - c(99, 103, 128, 63, 69, 72, 75, 207, 201)
PeriodSKUForecast - data.frame(Period, SKU, Forecast)
PeriodSKUForecast

  Period SKU Forecast
1      1  A1       99
2      2  A1      103
3      3  A1      128
4      1  X4       63
5      2  X4       69
6      3  X4       72
7      4  X4       75
8      1  K2      207
9      2  K2      201

I need to create a matrix with only the first forecast for each SKU:

A1 99
X4 63
K2 207

The Period for the first forecast will always be the minimum value
for an SKU.

Can anyone suggest how I might accomplish this?

Thank you,



-- 
__
*Barry E. King, Ph.D.*
Director of Retail Operations
Qualex Consulting Services, Inc.
barry.k...@qlx.com
O: (317)940-5464
M: (317)507-0661
__

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multiple filled.contour() on single page

2013-03-15 Thread Sebastian P. Luque
Hi,

It seems as if filled.contour can't be used along with layout(), or
par(mfrow) or the like, since it sets the page in a very particular
manner.  Someone posted a workaround
(http://r.789695.n4.nabble.com/several-Filled-contour-plots-on-the-same-device-td819040.html).
Has a better approach been developed for achieving this?  Thanks.

Cheers,

-- 
Seb

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multiple frequencies per second again

2013-03-15 Thread Erin Hodgess
Dear R People:

I have the following situation.  I have observations that are 128 samples
per second, which is fine.  I want to fit them with ARIMA models, also fine.

My question is, please:  when I do my forecasting, do I need to do anything
special to the n.ahead parm, please?  Here is the initial setup:


 xx <- ts(rnorm(128), start=0, freq=128)
 str(xx)
 Time-Series [1:128] from 0 to 0.992: -1.07 0.498 1.508 0.354 -0.497 ...
 xx.ar <- arima(xx, order=c(1,0,0))
 str(xx.ar)
List of 13
 $ coef : Named num [1:2] -0.0818 0.0662
  ..- attr(*, names)= chr [1:2] ar1 intercept
 $ sigma2   : num 1.06
 $ var.coef : num [1:2, 1:2] 7.78e-03 -5.09e-05 -5.09e-05 7.07e-03
  ..- attr(*, dimnames)=List of 2
  .. ..$ : chr [1:2] ar1 intercept
  .. ..$ : chr [1:2] ar1 intercept
 $ mask : logi [1:2] TRUE TRUE
 $ loglik   : num -185
 $ aic  : num 376
 $ arma : int [1:7] 1 0 0 0 128 0 0
 $ residuals: Time-Series [1:128] from 0 to 0.992: -1.133 0.338 1.477 0.406
-0.54 ...
 $ call : language arima(x = xx, order = c(1, 0, 0))
 $ series   : chr xx
 $ code : int 0
 $ n.cond   : int 0
 $ model:List of 10
  ..$ phi  : num -0.0818
  ..$ theta: num(0)
  ..$ Delta: num(0)
  ..$ Z: num 1
  ..$ a: num 0.156
  ..$ P: num [1, 1] 0
  ..$ T: num [1, 1] -0.0818
  ..$ V: num [1, 1] 1
  ..$ h: num 0
  ..$ Pn   : num [1, 1] 1
 - attr(*, class)= chr Arima
 predict(xx.ar,n.ahead=3)
$pred
Time Series:
Start = c(1, 1)
End = c(1, 3)
Frequency = 128
[1] 0.05346814 0.06728105 0.06615104

$se
Time Series:
Start = c(1, 1)
End = c(1, 3)
Frequency = 128
[1] 1.028302 1.031737 1.031760
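
For reference, my current understanding is that n.ahead counts model steps
(observations) rather than seconds, so at freq = 128 a forecast covering one
full second of data would be:

fc <- predict(xx.ar, n.ahead = 128)   # 128 one-step forecasts = 1 second of samples
length(fc$pred)                       # 128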



Thanks for any help.

Sincerely,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Add a continuous color ramp legend to a 3d scatter plot

2013-03-15 Thread Zhuoting Wu
Marc,

Thank you very much for your reply! It helps tremendously!

best,
Z

On Fri, Mar 15, 2013 at 2:37 AM, Marc Girondot marc_...@yahoo.fr wrote:

 Le 14/03/13 18:15, Zhuoting Wu a écrit :

  I have two follow-up questions:

 1. If I want to reverse the heat.colors (i.e., from yellow to red instead
 of red to yellow), is there a way to do that?

  nbcol <- heat.colors(128)
 nbcol <- nbcol[128:1]
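
 (Equivalently, and just as an aside, rev() does the same in one step:)

 nbcol <- rev(heat.colors(128))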


  2. I also created this interactive 3d scatter plot as below:

 library(rgl)
 plot3d(x=x, y=y, z=z, col=nbcol[zcol], box=FALSE)

  I have never use such a plot. Sorry

 Marc

  Is there any way to add the same legend to this 3d plot?

 I'm new to R and try to learn it. I'm very grateful for any help!

 thanks,
 Z




 --
 __**
 Marc Girondot, Pr

 Laboratoire Ecologie, Systématique et Evolution
 Equipe de Conservation des Populations et des Communautés
 CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
 Bâtiment 362

 91405 Orsay Cedex, France

 Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
 e-mail: marc.giron...@u-psud.fr
 Web: 
 http://www.ese.u-psud.fr/epc/conservation/Marc.html
 Skype: girondot



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] seeking tip to keep first of multiple observations per ID

2013-03-15 Thread Julie Royster
Dear R community,

I am a neophyte and I cannot figure out how to accomplish keeping only the
first record for each ID in a data.frame that has assorted numbers of
records per ID.

I studied and found references to packages plyr and sql for R, and I fear
the documentation for those was over my head and I could not identify what
may be there to reach my goal.

If someone could point me toward a method I will gladly study documentation,
or if there is an example posted someplace I will follow it.

THANKS!
Julie 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] nlrob and robust nonlinear regression with upper and/or lower bounds on parameters

2013-03-15 Thread Shane McMahon
I have a question regarding robust nonlinear regression with nlrob. I 
would like to place lower bounds on the parameters, but when I call 
nlrob with limits it returns the following error:


Error in psi(resid/Scale, ...) : unused argument(s) (lower = list(Asym 
= 1, mid = 1, scal = 1))


After consulting the documentation I noticed that upper and lower are 
not listed as parameters in the nlrob help documentation. I haven't 
checked the source to confirm this yet, but I infer that nlrob simply 
doesn't support upper and lower bounds.
For my current problem, I only require that the parameters be positive, 
so I simply rewrote the formula to be a function of the absolute value 
of the parameter. However, I have other problems where I am not so 
lucky. Are there robust nonlinear regression methods that support upper 
and lower bounds? Or am I simply missing something with nlrob? I've 
included example code that should illustrate the issue.


require(stats)
require(robustbase)
Dat <- NULL; Dat$x <- rep(1:25, 20)
set.seed(1)
Dat$y <- SSlogis(Dat$x, 10, 12, 2)*rnorm(500, 1, 0.1)
plot(Dat)
Dat.nls <- nls(y ~ SSlogis(x, Asym, mid, scal), 
data=Dat, start=list(Asym=1,mid=1,scal=1), lower=list(Asym=1,mid=1,scal=1)); 
Dat.nls

lines(1:25, predict(Dat.nls, newdata=list(x=1:25)), col=1)
Dat.nlrob <- nlrob(y ~ SSlogis(x, Asym, mid, scal), 
data=Dat, start=list(Asym=1,mid=1,scal=1)); Dat.nlrob

lines(1:25, predict(Dat.nlrob, newdata=list(x=1:25)), col=2)
Dat.nlrob <- nlrob(y ~ SSlogis(x, Asym, mid, scal), 
data=Dat, start=list(Asym=1,mid=1,scal=1), lower=list(Asym=1,mid=1,scal=1)); 
Dat.nlrob




thanks,
Shane

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Spearman rank correlation

2013-03-15 Thread Zia mel
Hi

If I get a p-value less than 0.05, does that mean there is a
significant relation between the 2 ranked lists? Sometimes I get a low
correlation such as 0.3 or even 0.2 and the p-value is very low, such
as 0.01; does that mean it is significant also? And would that be
interpreted as a significant low positive correlation or a significant
moderate positive correlation? Also, can R calculate the results for
lists  30 for spearman, or do I need to shift to pearson correlation in
that case? And finally, what does the (S=) in the results mean?

Many thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] merge two matrices

2013-03-15 Thread Michael Eisenring
Dear R-help members
 
I would be grateful if anyone could help me with the following problem: I would 
like to combine two matrices (SCH_15 and SCH_16, they are attached) which have 
a  species presence/absence x sampling plot structure. The aim would be to have 
in the end only one matrix which shows all existing species and their 
presence/absence on all the different plots(an_1, an_2 etc.)
To do this I used the merge function in R.
Command:
output <- merge(SCH_15, SCH_16, by="species", all=TRUE)

The problem is that if the same species occurs in both SCH files (i.e. 
species Abutilon longicuspe occurs in both files) it is listed two times in the 
merged matrix. However, the aim would be that each species is listed only once 
in the final matrix.
How do I have to modify the R code? I guess I have to replace all=TRUE with 
something else but I can't figure out what it is.
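
For illustration only (this is a rough sketch, since I have not seen the files,
and it assumes the plot columns are numeric 0/1): one way to end up with a
single row per species is to collapse any duplicated species rows after the
merge, treating NA as absence and letting presence win:

output <- merge(SCH_15, SCH_16, by="species", all=TRUE)
output[is.na(output)] <- 0                        # NA means the species was absent
output <- aggregate(. ~ species, data=output, FUN=max)   # one row per species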

Thank you for your help
Michael
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help finding first value in a BY group

2013-03-15 Thread Marc Schwartz
Hi,

There is a potential gotcha with the approach of using head(..., 1) in each of 
the solutions that Arun has below, which is the assumption that the data is 
sorted, as is the case in the example data. It seems reasonable to consider 
that the real data at hand may not be entered in order or presorted.

If the data is not sorted (switching the order of the two K2 related entries):

Period <- c(1, 2, 3, 1, 2, 3, 4, 2, 1)
Forecast <- c(99, 103, 128, 63, 69, 72, 75, 201, 207)
SKU <- c("A1","A1","A1","X4","X4","X4","X4","K2","K2")

PeriodSKUForecast <- data.frame(Period, SKU, Forecast)

 PeriodSKUForecast
  Period SKU Forecast
1  1  A1   99
2  2  A1  103
3  3  A1  128
4  1  X4   63
5  2  X4   69
6  3  X4   72
7  4  X4   75
8  2  K2  201
9  1  K2  207


 with(PeriodSKUForecast,tapply(Forecast,SKU,head,1))
 A1  K2  X4 
 99 201  63 

 aggregate(Forecast~SKU,data=PeriodSKUForecast,head,1)
  SKU Forecast
1  A1   99
2  K2  201
3  X4   63


Note that the wrong value for K2 is returned.

You would either have to pre-sort the data frame before using these approaches:

NewDF <- PeriodSKUForecast[with(PeriodSKUForecast, order(SKU, Period)), ]

 NewDF
  Period SKU Forecast
1  1  A1   99
2  2  A1  103
3  3  A1  128
9  1  K2  207
8  2  K2  201
4  1  X4   63
5  2  X4   69
6  3  X4   72
7  4  X4   75

 with(NewDF,tapply(Forecast,SKU,head,1))
 A1  K2  X4 
 99 207  63 


Or consider an approach that does not depend upon the sort order, but which 
subsets based upon the minimum value of Period for each SKU:

do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU), 
  function(x) x[which.min(x$Period), ]))
   Period SKU Forecast
A1  1  A1   99
K2  1  K2  207
X4  1  X4   63

or remove the Period column if you don't want it:

 do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU), 
function(x) x[which.min(x$Period), -1]))
   SKU Forecast
A1  A1   99
K2  K2  207
X4  X4   63



Regards,

Marc Schwartz


On Mar 15, 2013, at 12:37 PM, arun smartpink...@yahoo.com wrote:

 Hi,
 Try:
 data.frame(Forecast=with(PeriodSKUForecast,tapply(Forecast,SKU,head,1)))
 #   Forecast
 #A1   99
 #K2  207
 #X4   63
 
 #or
  aggregate(Forecast~SKU,data=PeriodSKUForecast,head,1)
 #  SKU Forecast
 #1  A1   99
 #2  K2  207
 #3  X4   63
 
 #or
 library(plyr)
 ddply(PeriodSKUForecast,.(SKU),summarise, Forecast=head(Forecast,1))
 #  SKU Forecast
 #1  A1   99
 #2  K2  207
 #3  X4   63
 A.K.
 
 
 
 
 - Original Message -
 From: Barry King barry.k...@qlx.com
 To: r-help@r-project.org
 Cc: 
 Sent: Friday, March 15, 2013 1:30 PM
 Subject: [R] Help finding first value in a BY group
 
 I have a large Excel file with SKU numbers (stock keeping units) and
 forecasts which can be mimicked with the following:
 
 Period - c(1, 2, 3, 1, 2, 3, 4, 1, 2)
 SKU - c(A1,A1,A1,X4,X4,X4,X4,K2,K2)
 Forecast - c(99, 103, 128, 63, 69, 72, 75, 207, 201)
 PeriodSKUForecast - data.frame(Period, SKU, Forecast)
 PeriodSKUForecast
 
   Period SKU Forecast
 1  1  A1   99
 2  2  A1  103
 3  3  A1  128
 4  1  X4   63
 5  2  X4   69
 6  3  X4   72
 7  4  X4   75
 8  1  K2  207
 9  2  K2  201
 
 I need to create a matrix with only the first forecast for each SKU:
 
 A1 99
 X4 63
 K2 207
 
 The Period for the first forecast will always be the minimum value
 for an SKU.
 
 Can anyone suggest how I might accomplish this?
 
 Thank you,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] seeking tip to keep first of multiple observations per ID

2013-03-15 Thread John Kane
Probably the first thing to do is supply some sample data 
See https://github.com/hadley/devtools/wiki/Reproducibility for some 
suggestions.

However you may want to take a look at 
http://stackoverflow.com/questions/13279582/select-only-the-first-rows-for-each-unique-value-of-a-column-in-r
 particularly at answer number 3 which uses the data.table package and  which 
looks like it may do what you want.  
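
For example, a minimal sketch along those lines (assuming your data frame is
called df and the grouping column is ID):

library(data.table)
dt <- as.data.table(df)
dt[, .SD[1], by = ID]   # first row within each ID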

John Kane
Kingston ON Canada


 -Original Message-
 From: jsdroys...@bellsouth.net
 Sent: Fri, 15 Mar 2013 12:06:05 -0400
 To: r-help@r-project.org
 Subject: [R] seeking tip to keep first of multiple observations per ID
 
 Dear R community,
 
 I am a neophyte and I cannot figure out how to accomplish keeping only
 the
 first record for each ID in a data.frame that has assorted numbers of
 records per ID.
 
 I studied and found references to packages plyr and sql for R, and I fear
 the documentation for those was over my head and I could not identify
 what
 may be there to reach my goal.
 
 If someone could point me toward a method I will gladly study
 documentation,
 or if there is an example posted someplace I will follow it.
 
 THANKS!
 Julie
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Spearman rank correlation

2013-03-15 Thread David Winsemius
Rhelp is not here to fill in the gaps in your statistical education. You may 
want to try CrossValidated, but even there you would be expected to do some 
searching both of their website and in you textbooks.


On Mar 15, 2013, at 8:16 AM, Zia mel wrote:

 Hi
 
 If I get a p-value less than 0.05 does that mean there is a
 significant relation between the 2 ranked lists? Sometimes I get a low
 correlation such as 0.3 or even 0.2 and the p-value is so low , such
 as 0.01 , does that mean it is significant also? and would that be
 interpreted as significant low positive correlation or significant
 moderate positive correlation? Also,can R calculate the results for
 lists  30 for spearman or I need to shift to pearson correlation in
 that case? and finally what does the (S=) in the results mean?
 
 Many thanks

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple filled.contour() on single page

2013-03-15 Thread David Winsemius

On Mar 15, 2013, at 10:49 AM, Sebastian P. Luque wrote:

 Hi,
 
 It seems as if filled.contour can't be used along with layout(), or
 par(mfrow) or the like, since it sets the page in a very particular
 manner.  Someone posted a workaround
 (http://r.789695.n4.nabble.com/several-Filled-contour-plots-on-the-same-device-td819040.html).
 Has a better approach been developed for achieving this?  Thanks.

I remember seeing a similar question posted last week that got an informative 
answer (both here and in the crossposting to SO) . Since you are not describing 
what you what to change it is difficult to know what will be a satisfying 
answer, but at the very least you should do a bit more searching.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R Source Code Plagiarism Detection

2013-03-15 Thread gyollin
For the classes I teach at the University of Washington, we use:
http://www.compsoftbook.com.  It's an automated scientific computing grading
system that supports R and includes MOSS checking.  However, I would also be
very interested in alternative and possibly stand-alone MOSS-type tools that
can work with R source code.



--
View this message in context: 
http://r.789695.n4.nabble.com/R-Source-Code-Plagiarism-Detection-tp4660803p4661527.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] seeking tip to keep first of multiple observations per ID

2013-03-15 Thread arun
Hi,

If you can dput() a small part of your dataset, e.g. 
dput(head(yourdataset, 20)), it would be helpful.

Otherwise, 

dat1 <- data.frame(ID=rep(1:3, times=c(3,4,2)), col2=rnorm(9))
 aggregate(.~ID,data=dat1,head,1)
#  ID   col2
#1  1 -0.0637622
#2  2  1.1782429
#3  3  0.4670021
A.K.



- Original Message -
From: Julie Royster jsdroys...@bellsouth.net
To: r-help@r-project.org
Cc: 
Sent: Friday, March 15, 2013 12:06 PM
Subject: [R] seeking tip to keep first of multiple observations per ID

Dear R community,

I am a neophyte and I cannot figure out how to accomplish keeping only the
first record for each ID in a data.frame that has assorted numbers of
records per ID.

I studied and found references to packages plyr and sql for R, and I fear
the documentation for those was over my head and I could not identify what
may be there to reach my goal.

If someone could point me toward a method I will gladly study documentation,
or if there is an example posted someplace I will follow it.

THANKS!
Julie 

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] latex(test, collabel=) returns wrong latex code?

2013-03-15 Thread Simon Kiss
Hello:
I'm working with a 2-dimensional table that looks sort of like test below.
I'm trying to produce latex code that will add dimension names for both the 
rows and the columns.
In using the following code, latex chokes when I include collabel='Vote' but 
it's fine without it.

The code below produces the latex code further below.  I'm confused by this, 
because it looks like it's creating two bits of text for each instance of 
\multicolumn.  Is that really allowed in \multicolumn?
Could someone clarify?
Thank you!
Yours, SJK


library(Hmisc)
test <- as.table(matrix(c(50,50,50,50), ncol=2))
latex(test, rowlabel='Gender',collabel='Vote', file='')

% latex.default(test, rowlabel = "Gender", collabel = "vote", file = "") 
%
\begin{table}[!tbp]
\begin{center}
\begin{tabular}{lrr}
\hline\hline
\multicolumn{1}{l}{Gender}&\multicolumn{1}{vote}{A}&\multicolumn{1}{l}{B}\tabularnewline
\hline
A&$50$&$50$\tabularnewline
B&$50$&$50$\tabularnewline
\hline
\end{tabular}
\end{center}
\end{table}
*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606

Please avoid sending me Word, PowerPoint or Excel attachments. Sending these 
documents puts pressure on many people to use Microsoft software and helps to 
deny them any other choice. In effect, you become a buttress of the Microsoft 
monopoly.

To convert to plain text choose Text Only or Text Document as the Save As Type. 
 Your computer may also have a program to convert to PDF format. Select File, 
then Print. Scroll through available printers and select the PDF converter. 
Click on the Print button and enter a name for the PDF file when requested.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] seeking tip to keep first of multiple observations per ID

2013-03-15 Thread pedroabg
I had the same problem.



--
View this message in context: 
http://r.789695.n4.nabble.com/seeking-tip-to-keep-first-of-multiple-observations-per-ID-tp4661520p4661534.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] get the sign of a value

2013-03-15 Thread MacQueen, Don
And as a footnote to the other replies, see

   help('Math',package='base')

R's online help has a number of topics that are broader than that of a
single function, and that relatively new useRs might not have seen yet.

Examples include
  ?Distributions (compare with ?rnorm)
and
  ?Startup
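
For the original question itself, the relevant member of that Math group is
sign(), e.g.:

  sign(12)    # returns  1
  sign(-12)   # returns -1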


-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 3/14/13 12:41 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:

Hello! Can't figure it out - hope it's simple:

# I have some value (can be anything), e.g.:
x = 12
# I'd like it to become 1.

# If the value is negative (again, it can be anything), e.g.:
y = -12
# I'd like it to become -1.

How could I do it?
Thanks a lot!

-- 
Dimitri Liakhovitski

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help finding first value in a BY group

2013-03-15 Thread William Dunlap
ddply() is very handy, but sometimes it seems like overkill
to select rows from a dataset by pulling it into pieces, selecting
a row from each piece, then pasting the pieces back together
again.  Information like row names can be lost.

The following uses a subscript to pull out the rows of interest.
We compute the subcript with ave(), which does the same sort of
looping that things in plyr do, but it operates on an integer vector
rather than the whole data.frame.
   w <- with(PeriodSKUForecast, ave(Period, SKU, FUN=order)) 
   PeriodSKUForecast[w==1,]
Period SKU Forecast
  1  1  A1   99
  4  1  X4   63
  9  1  K2  207
Note that the output rows are in the order they were in within the
input data.frame, and their row names come from the input also.
If you want the first two periods for each SKU, use the subscript w<=2. 

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of Marc Schwartz
 Sent: Friday, March 15, 2013 11:57 AM
 To: arun
 Cc: R help
 Subject: Re: [R] Help finding first value in a BY group
 
 Hi,
 
 There is a potential gotcha with the approach of using head(..., 1) in each 
 of the solutions
 that Arun has below, which is the assumption that the data is sorted, as is 
 the case in the
 example data. It seems reasonable to consider that the real data at hand may 
 not be
 entered in order or presorted.
 
 If the data is not sorted (switching the order of the two K2 related entries):
 
 Period - c(1, 2, 3, 1, 2, 3, 4, 2, 1)
 Forecast - c(99, 103, 128, 63, 69, 72, 75, 201, 207)
 SKU - c(A1,A1,A1,X4,X4,X4,X4,K2,K2)
 
 PeriodSKUForecast - data.frame(Period, SKU, Forecast)
 
  PeriodSKUForecast
   Period SKU Forecast
 1  1  A1   99
 2  2  A1  103
 3  3  A1  128
 4  1  X4   63
 5  2  X4   69
 6  3  X4   72
 7  4  X4   75
 8  2  K2  201
 9  1  K2  207
 
 
  with(PeriodSKUForecast,tapply(Forecast,SKU,head,1))
  A1  K2  X4
  99 201  63
 
  aggregate(Forecast~SKU,data=PeriodSKUForecast,head,1)
   SKU Forecast
 1  A1   99
 2  K2  201
 3  X4   63
 
 
 Note that the wrong value for K2 is returned.
 
 You would either have to pre-sort the data frame before using these 
 approaches:
 
 NewDF - PeriodSKUForecast[with(PeriodSKUForecast, order(SKU, Period)), ]
 
  NewDF
   Period SKU Forecast
 1  1  A1   99
 2  2  A1  103
 3  3  A1  128
 9  1  K2  207
 8  2  K2  201
 4  1  X4   63
 5  2  X4   69
 6  3  X4   72
 7  4  X4   75
 
  with(NewDF,tapply(Forecast,SKU,head,1))
  A1  K2  X4
  99 207  63
 
 
 Or consider an approach that does not depend upon the sort order, but which 
 subsets
 based upon the minimum value of Period for each SKU:
 
 do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU),
   function(x) x[which.min(x$Period), ]))
Period SKU Forecast
 A1  1  A1   99
 K2  1  K2  207
 X4  1  X4   63
 
 or remove the Period column if you don't want it:
 
  do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU),
 function(x) x[which.min(x$Period), -1]))
SKU Forecast
 A1  A1   99
 K2  K2  207
 X4  X4   63
 
 
 
 Regards,
 
 Marc Schwartz
 
 
 On Mar 15, 2013, at 12:37 PM, arun smartpink...@yahoo.com wrote:
 
  Hi,
  Try:
  data.frame(Forecast=with(PeriodSKUForecast,tapply(Forecast,SKU,head,1)))
  #   Forecast
  #A1   99
  #K2  207
  #X4   63
 
  #or
   aggregate(Forecast~SKU,data=PeriodSKUForecast,head,1)
  #  SKU Forecast
  #1  A1   99
  #2  K2  207
  #3  X4   63
 
  #or
  library(plyr)
  ddply(PeriodSKUForecast,.(SKU),summarise, Forecast=head(Forecast,1))
  #  SKU Forecast
  #1  A1   99
  #2  K2  207
  #3  X4   63
  A.K.
 
 
 
 
  - Original Message -
  From: Barry King barry.k...@qlx.com
  To: r-help@r-project.org
  Cc:
  Sent: Friday, March 15, 2013 1:30 PM
  Subject: [R] Help finding first value in a BY group
 
  I have a large Excel file with SKU numbers (stock keeping units) and
  forecasts which can be mimicked with the following:
 
  Period - c(1, 2, 3, 1, 2, 3, 4, 1, 2)
  SKU - c(A1,A1,A1,X4,X4,X4,X4,K2,K2)
  Forecast - c(99, 103, 128, 63, 69, 72, 75, 207, 201)
  PeriodSKUForecast - data.frame(Period, SKU, Forecast)
  PeriodSKUForecast
 
Period SKU Forecast
  1  1  A1   99
  2  2  A1  103
  3  3  A1  128
  4  1  X4   63
  5  2  X4   69
  6  3  X4   72
  7  4  X4   75
  8  1  K2  207
  9  2  K2  201
 
  I need to create a matrix with only the first forecast for each SKU:
 
  A1 99
  X4 63
  K2 207
 
  The Period for the first forecast will always be the minimum value
  for an SKU.
 
  Can anyone suggest how I might accomplish this?
 
  Thank you,
 
 

[R] Fw: Help finding first value in a BY group

2013-03-15 Thread arun
Forgot to cc: to list




- Forwarded Message -
From: arun smartpink...@yahoo.com
To: Marc Schwartz marc_schwa...@me.com
Cc: Barry King barry.k...@qlx.com; Cc: Barry King barry.k...@qlx.com
Sent: Friday, March 15, 2013 3:41 PM
Subject: Re: [R] Help finding first value in a BY group

Thanks Marc for catching that.

You could also use ?ave() 
#unsorted

PeriodSKUForecast[as.logical(with(PeriodSKUForecast, ave(Period, SKU, FUN=function(x)
 x==min(x)))), -1]
#  SKU Forecast
#1  A1   99
#4  X4   63
#9  K2  207
#sorted

NewDF[as.logical(with(NewDF, ave(Period, SKU, FUN=function(x) x==min(x)))), -1] 

#SKU Forecast
#1  A1   99
#9  K2  207
#4  X4   63
A.K.




- Original Message -
From: Marc Schwartz marc_schwa...@me.com
To: arun smartpink...@yahoo.com
Cc: Barry King barry.k...@qlx.com; R help r-help@r-project.org
Sent: Friday, March 15, 2013 2:56 PM
Subject: Re: [R] Help finding first value in a BY group

Hi,

There is a potential gotcha with the approach of using head(..., 1) in each of 
the solutions that Arun has below, which is the assumption that the data is 
sorted, as is the case in the example data. It seems reasonable to consider 
that the real data at hand may not be entered in order or presorted.

If the data is not sorted (switching the order of the two K2 related entries):

Period - c(1, 2, 3, 1, 2, 3, 4, 2, 1)
Forecast - c(99, 103, 128, 63, 69, 72, 75, 201, 207)
SKU - c(A1,A1,A1,X4,X4,X4,X4,K2,K2)

PeriodSKUForecast - data.frame(Period, SKU, Forecast)

 PeriodSKUForecast
  Period SKU Forecast
1      1  A1       99
2      2  A1      103
3      3  A1      128
4      1  X4       63
5      2  X4       69
6      3  X4       72
7      4  X4       75
8      2  K2      201
9      1  K2      207


 with(PeriodSKUForecast,tapply(Forecast,SKU,head,1))
A1  K2  X4 
99 201  63 

 aggregate(Forecast~SKU,data=PeriodSKUForecast,head,1)
  SKU Forecast
1  A1       99
2  K2      201
3  X4       63


Note that the wrong value for K2 is returned.

You would either have to pre-sort the data frame before using these approaches:

NewDF - PeriodSKUForecast[with(PeriodSKUForecast, order(SKU, Period)), ]

 NewDF
  Period SKU Forecast
1      1  A1       99
2      2  A1      103
3      3  A1      128
9      1  K2      207
8      2  K2      201
4      1  X4       63
5      2  X4       69
6      3  X4       72
7      4  X4       75

 with(NewDF,tapply(Forecast,SKU,head,1))
A1  K2  X4 
99 207  63 


Or consider an approach that does not depend upon the sort order, but which 
subsets based upon the minimum value of Period for each SKU:

do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU), 
                      function(x) x[which.min(x$Period), ]))
   Period SKU Forecast
A1      1  A1       99
K2      1  K2      207
X4      1  X4       63

or remove the Period column if you don't want it:

 do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU), 
                        function(x) x[which.min(x$Period), -1]))
   SKU Forecast
A1  A1       99
K2  K2      207
X4  X4       63



Regards,

Marc Schwartz


On Mar 15, 2013, at 12:37 PM, arun smartpink...@yahoo.com wrote:

 Hi,
 Try:
 data.frame(Forecast=with(PeriodSKUForecast,tapply(Forecast,SKU,head,1)))
 #   Forecast
 #A1       99
 #K2      207
 #X4       63
 
 #or
  aggregate(Forecast~SKU,data=PeriodSKUForecast,head,1)
 #  SKU Forecast
 #1  A1       99
 #2  K2      207
 #3  X4       63
 
 #or
 library(plyr)
 ddply(PeriodSKUForecast,.(SKU),summarise, Forecast=head(Forecast,1))
 #  SKU Forecast
 #1  A1       99
 #2  K2      207
 #3  X4       63
 A.K.
 
 
 
 
 - Original Message -
 From: Barry King barry.k...@qlx.com
 To: r-help@r-project.org
 Cc: 
 Sent: Friday, March 15, 2013 1:30 PM
 Subject: [R] Help finding first value in a BY group
 
 I have a large Excel file with SKU numbers (stock keeping units) and
 forecasts which can be mimicked with the following:
 
 Period - c(1, 2, 3, 1, 2, 3, 4, 1, 2)
 SKU - c(A1,A1,A1,X4,X4,X4,X4,K2,K2)
 Forecast - c(99, 103, 128, 63, 69, 72, 75, 207, 201)
 PeriodSKUForecast - data.frame(Period, SKU, Forecast)
 PeriodSKUForecast
 
   Period SKU Forecast
 1      1  A1       99
 2      2  A1      103
 3      3  A1      128
 4      1  X4       63
 5      2  X4       69
 6      3  X4       72
 7      4  X4       75
 8      1  K2      207
 9      2  K2      201
 
 I need to create a matrix with only the first forecast for each SKU:
 
 A1 99
 X4 63
 K2 207
 
 The Period for the first forecast will always be the minimum value
 for an SKU.
 
 Can anyone suggest how I might accomplish this?
 
 Thank you,


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to make the labels of pie chart are not overlapping?

2013-03-15 Thread Jim Lemon

On 03/16/2013 12:58 AM, Tammy Ma wrote:

I have the following dataframe:

Product   predicted_MarketShare   Predicted_MS_Percentage
A         2.827450e-02            2.8
B         4.716403e-06            0.0
C         1.741686e-01            17.4
D         1.716303e-04            0.0
...

Because there are so many products, and most of the predicted market shares are 
around 0%, the labels of the products with 0% market share overlap when I make 
the pie chart. 
How do I keep the labels from overlapping?


Hi Tammy,
Obviously you have many more products than are shown above. Let us 
assume that their market share is distributed approximately as negative 
binomial and your C value is the maximum. You might have twenty 
products with market shares around:


market_share <- c(0,0,0,0,1,1,1,1,2,2,3,3,4,5,5,6,10,11,15,17)
names(market_share) <- LETTERS[1:20]

If you try to plot this as a pie chart:

pie(market_share)

you do get a bunch of overprinted labels for the four zero values. Pie 
charts with more than four or five sectors are usually not the best way 
to display the distribution of your values, but if you must:


par(mar=c(5,4,4,4))
pie(market_share, labels=c(rep("",4), names(market_share)[5:20]))
par(xpd=TRUE)
text(1.1, 0, "A,B,C,D=0")
par(xpd=FALSE)

Good luck with it.

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] latex(test, collabel=) returns wrong latex code?

2013-03-15 Thread Duncan Mackay

Hi Simon

the equivalent in xtable is

library(xtable)
xtable(test)
% latex table generated in R 2.15.2 by xtable 1.7-0 package
% Sat Mar 16 08:14:01 2013
\begin{table}[ht]
\begin{center}
\begin{tabular}{rrr}
  \hline
 & A & B \\ 
  \hline
A & 50.00 & 50.00 \\ 
  B & 50.00 & 50.00 \\ 
   \hline
\end{tabular}
\end{center}
\end{table}

I am wondering if the class is making things hard

test
   A  B
A 50 50
B 50 50

# as  a data.frame
data.frame(test)
  Var1 Var2 Freq
1AA   50
2BA   50
3AB   50
4BB   50

# Add column names
dimnames(test) <- list(c("Gender A", "Gender B"), c("Vote A", "Vote B"))
 test
 Vote A Vote B
Gender A 50 50
Gender B 50 50
xtable(test)
xtable(test)
% latex table generated in R 2.15.2 by xtable 1.7-0 package
% Sat Mar 16 08:34:34 2013
\begin{table}[ht]
\begin{center}
\begin{tabular}{rrr}
  \hline
 & Vote A & Vote B \\ 
  \hline
Gender A & 50.00 & 50.00 \\ 
  Gender B & 50.00 & 50.00 \\ 
   \hline
\end{tabular}
\end{center}
\end{table}

I suppose a similar thing will happen with latex saves detaching

latex() is a bit different in that it gives you \multicolumn for the 
header columns, which can be modified for justification (I have not used latex()).


I think the problem is in the arrangement of the data or the names 
that you are sending to latex(); someone else may have a different opinion.


HTH

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au



At 05:33 16/03/2013, you wrote:

Hello:
I'm working with a 2-dimensional table that looks sort of like test below.
I'm trying to produce latex code that will add dimension names for 
both the rows and the columns.
In using the following code, latex chokes when I include 
collabel='Vote' but it's fine without it.


The code below prouces the latex code further below.  I'm confused 
by this, because it looks like it's creating two bits of text for 
each instance of \multicolumn.  Is that really allowed in \multicolumn?

Could someone clarify?
Thank you!
Yours, SJK


library(Hmisc)
test-as.table(matrix(c(50,50,50,50), ncol=2))
latex(test, rowlabel='Gender',collabel='Vote', file='')

% latex.default(test, rowlabel = Gender, collabel = vote, file = )
%
\begin{table}[!tbp]
\begin{center}
\begin{tabular}{lrr}
\hline\hline
\multicolumn{1}{l}{Gender}\multicolumn{1}{vote}{A}\multicolumn{1}{l}{B}\tabularnewline
\hline
A$50$$50$\tabularnewline
B$50$$50$\tabularnewline
\hline
\end{tabular}
\end{center}
\end{table}
*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606

Please avoid sending me Word, PowerPoint or Excel attachments. 
Sending these documents puts pressure on many people to use 
Microsoft software and helps to deny them any other choice. In 
effect, you become a buttress of the Microsoft monopoly.


To convert to plain text choose Text Only or Text Document as the 
Save As Type.  Your computer may also have a program to convert to 
PDF format. Select File, then Print. Scroll through available 
printers and select the PDF converter. Click on the Print button and 
enter a name for the PDF file when requested.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nlrob and robust nonlinear regression with upper and/or lower bounds on parameters

2013-03-15 Thread Peter Ehlers

On 2013-03-15 07:57, Shane McMahon wrote:

I have a question regarding robust nonlinear regression with nlrob. I
would like to place lower bounds on the parameters, but when I call
nlrob with limits it returns the following error:

Error in psi(resid/Scale, ...) : unused argument(s) (lower = list(Asym
= 1, mid = 1, scal = 1))

After consulting the documentation I noticed that upper and lower are
not listed as parameter in the nlrob help documentation. I haven't
checked the source to confirm this yet, but I infer that nlrob simply
doesn't support upper and lower bounds.
For my current problem, I only require that the parameters be positive,
so I simply rewrote the formula to be a function of the absolute value
of the parameter. However, I have other problems where I am not so
lucky. Are there robust nonlinear regression methods that support upper
and lower bounds? Or am I simply missing something with nlrob? I've
included example code that should illustrate the issue.

require(stats)
require(robustbase)
Dat - NULL; Dat$x - rep(1:25, 20)
set.seed(1)
Dat$y - SSlogis(Dat$x, 10, 12, 2)*rnorm(500, 1, 0.1)
plot(Dat)
Dat.nls - nls(y ~ SSlogis(x, Asym, mid, scal),
data=Dat,start=list(Asym=1,mid=1,scal=1),lower=list(Asym=1,mid=1,scal=1));
Dat.nls
lines(1:25, predict(Dat.nls, newdata=list(x=1:25)), col=1)
Dat.nlrob - nlrob(y ~ SSlogis(x, Asym, mid, scal),
data=Dat,start=list(Asym=1,mid=1,scal=1)); Dat.nlrob
lines(1:25, predict(Dat.nlrob, newdata=list(x=1:25)), col=2)
Dat.nlrob - nlrob(y ~ SSlogis(x, Asym, mid, scal),
data=Dat,start=list(Asym=1,mid=1,scal=1),lower=list(Asym=1,mid=1,scal=1));
Dat.nlrob



thanks,
Shane


I'm not sure what your example is supposed to illustrate, but the
lower argument in nls() is being ignored. As ?nls says: 'Bounds
can only be used with the port algorithm', which is not the default,
and nls() does issue a warning with your code.

If you want to force a coefficient to be positive, the usual approach
is to estimate the logarithm of the coefficient by using the
exp(log(coef)) construct. See argument 'lrc' in ?SSasymp for example.
Introducing a shift to accommodate coef > k for a given k is simple.
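
A minimal sketch of that reparameterization for the model in this thread (the
parameter name lAsym is made up here): fit log(Asym) so that the asymptote
exp(lAsym) is positive by construction:

Dat.nlrob2 <- nlrob(y ~ exp(lAsym)/(1 + exp((mid - x)/scal)),
                    data = Dat,
                    start = list(lAsym = log(5), mid = 10, scal = 2))
exp(coef(Dat.nlrob2)["lAsym"])   # asymptote back on the original scale
## for a bound coef > k, use k + exp(lcoef) in place of exp(lcoef)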

Peter Ehlers

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nlrob and robust nonlinear regression with upper and/or lower bounds on parameters

2013-03-15 Thread Peter Ehlers


Forgot to mention: You might find the nlmrt package helpful but
I have no experience with that (yet).

Peter Ehlers

On 2013-03-15 07:57, Shane McMahon wrote:

I have a question regarding robust nonlinear regression with nlrob. I
would like to place lower bounds on the parameters, but when I call
nlrob with limits it returns the following error:

Error in psi(resid/Scale, ...) : unused argument(s) (lower = list(Asym
= 1, mid = 1, scal = 1))

After consulting the documentation I noticed that upper and lower are
not listed as parameter in the nlrob help documentation. I haven't
checked the source to confirm this yet, but I infer that nlrob simply
doesn't support upper and lower bounds.
For my current problem, I only require that the parameters be positive,
so I simply rewrote the formula to be a function of the absolute value
of the parameter. However, I have other problems where I am not so
lucky. Are there robust nonlinear regression methods that support upper
and lower bounds? Or am I simply missing something with nlrob? I've
included example code that should illustrate the issue.

require(stats)
require(robustbase)
Dat - NULL; Dat$x - rep(1:25, 20)
set.seed(1)
Dat$y - SSlogis(Dat$x, 10, 12, 2)*rnorm(500, 1, 0.1)
plot(Dat)
Dat.nls - nls(y ~ SSlogis(x, Asym, mid, scal),
data=Dat,start=list(Asym=1,mid=1,scal=1),lower=list(Asym=1,mid=1,scal=1));
Dat.nls
lines(1:25, predict(Dat.nls, newdata=list(x=1:25)), col=1)
Dat.nlrob - nlrob(y ~ SSlogis(x, Asym, mid, scal),
data=Dat,start=list(Asym=1,mid=1,scal=1)); Dat.nlrob
lines(1:25, predict(Dat.nlrob, newdata=list(x=1:25)), col=2)
Dat.nlrob - nlrob(y ~ SSlogis(x, Asym, mid, scal),
data=Dat,start=list(Asym=1,mid=1,scal=1),lower=list(Asym=1,mid=1,scal=1));
Dat.nlrob



thanks,
Shane

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] dispersion indicator for clustered data

2013-03-15 Thread Martin Batholdy
Hi,

I have a dataset with clustered data (observations within groups) and would 
like to make some descriptive plots.

Now, I am a little bit lost on how to present the dispersion of the data (what 
kind of residuals to plot).
I could compute the standard error of the mean (SEM) ignoring the clustering 
(very low values and misleading) or I could first aggregate the data by 
calculating the mean for each group and calculating the SEM for these means.
But I am not so sure what implication these two approaches have. In the end, I 
take the clustering into account by fitting a random-intercept regression model 
– however for plotting I would like to have a descriptive dispersion indicator 
of the data.

Now, I heard a lot about 'clustered' or 'robust' standard errors.
Is there some kind of correction I can apply to the simple SEM formula 
(sd(x)/sqrt(m)) to take care of correlated observations within clusters?
Or are there bootstrapping or jackknife approaches implemented in R (or cran 
package) which give me unbiased variance estimation for clustered data?
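
To make that last question concrete, here is a minimal sketch of the kind of
cluster-level bootstrap I have in mind (assuming a data frame dat with a
grouping column group and an outcome y):

set.seed(1)
boot_means <- replicate(2000, {
  g <- sample(unique(dat$group), replace = TRUE)            # resample whole clusters
  mean(sapply(g, function(k) mean(dat$y[dat$group == k])))  # mean of resampled group means
})
sd(boot_means)   # a standard error for the mean that respects the clustering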

thanks for any suggestions!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looking for a good tutorial on ff package

2013-03-15 Thread Michael Sumner
The documentation is very straightforward, I suggest you describe what you
want to do in more detail and what you don't understand about the functions
when you try to use them. You basically create an array with ?ff or
data.frame with ?ffdf and proceed from there - each page has examples.  All
I've ever done is make big objects and populate them in chunks, which is
really natural and easy for an array. I did have specific memory caching
issues on Windows when the object exceeds available memory, but still that
was straightforward to describe and ask about specifically.
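
As a concrete starting point, a minimal sketch of that pattern (the sizes and
chunking here are arbitrary):

library(ff)
x <- ff(vmode = "double", length = 1e6)       # file-backed vector, not held in RAM
for (start in seq(1, length(x), by = 1e5)) {
  idx <- start:(start + 1e5 - 1)
  x[idx] <- rnorm(length(idx))                # fill in chunks
}
mean(x[1:1e5])                                # data come back into RAM only when indexed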

I've seen a lot of discussion on bigmemory being easier to understand, but
I find that is far more limiting. Do a search on ff tutorial and explore
the packages that rely on ff after working through the basic help pages in
the package itself.

The CRAN Task View High Performance Computing has an overview of the
topic for a suite of packages (though it does not mention the tools in
raster/rgdal, which are very good if you happen to use spatial grid data).

Cheers, Mike.




On Fri, Mar 15, 2013 at 11:18 PM, Fritz Zuhl r_listse...@zuhl.org wrote:

 Hi,
 I am looking for a good tutorial on the ff package. Any suggestions?

 Also, any other package would anyone recommend for dealing with data that
 extends beyond the RAM would be greatly appreciated.

 Thanks,
 Fritz Zuhl

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Michael Sumner
Hobart, Australia
e-mail: mdsum...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] identifying and drawing from T distribution

2013-03-15 Thread ivo welch
dear R experts:

fitdistr suggests that a t with a mean of 1, an sd of 2, and 2.6
degrees of freedom is a good fit for my data.

now I want to draw random samples from this distribution. Should I
draw from a uniform distribution and use the distribution function
itself for the transform, or is there a better way to do this? There
is a non-centrality parameter ncp in rt, but one parameter ncp cannot
subsume two (m and s), of course. My first attempt was to draw
rt(..., df=2.63)*s+m, but this was obviously not it.

advice appreciated.

/iaw


Ivo Welch (ivo.we...@gmail.com)
http://www.ivo-welch.info/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identifying and drawing from T distribution

2013-03-15 Thread Joshua Wiley
Hi Ivo,

Try something like this:

rt(1e5, df = 2.6, ncp = (1 - 0) * sqrt(2.6 + 1)/2)

The NCP comes from the mean, N, and SD.  See ?rt

Cheers,

Josh



On Fri, Mar 15, 2013 at 6:58 PM, ivo welch ivo.we...@anderson.ucla.edu wrote:
 dear R experts:

 fitdistr suggests that a t with a mean of 1, an sd of 2, and 2.6
 degrees of freedom is a good fit for my data.

 now I want to draw random samples from this distribution.should I
 draw from a uniform distribution and use the distribution function
 itself for the transform, or is there a better way to do this?   there
 is a non-centrality parameter ncp in rt, but one parameter ncp cannot
 subsume two (m and s), of course.  my first attempt was to draw
 rt(..., df=2.63)*s+m, but this was obviously not it.

 advice appreciated.

 /iaw

 
 Ivo Welch (ivo.we...@gmail.com)
 http://www.ivo-welch.info/

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] question about nls

2013-03-15 Thread Prof J C Nash (U30A)
I decided to follow up my own suggestion and look at the robustness of 
nls vs. nlxb. NOTE: this problem is NOT one that nls() would usually be 
applied to. The script below is very crude, but does illustrate that 
nls() is unable to find a solution in 70% of tries where nlxb (a 
Marquardt approach) succeeds.


I make no claim for elegance of this code -- very quick and dirty.

JN



debug <- FALSE
library(nlmrt)
x <- c(60, 80, 100, 120)
y <- c(0.8, 6.5, 20.5, 45.9)
mydata <- data.frame(x, y)
mydata
xmin <- c(0, 0, 0)
xmax <- c(8, 8, 8)
set.seed(123456)
nrep <- as.numeric(readline("Number of reps:"))
pnames <- c("a", "b", "d")
npar <- length(pnames)
# set up structure to record results
#  need start, failnls, parnls, ssnls, failnlxb, parnlxb, ssnlxb
tmp <- matrix(NA, nrow=nrep, ncol=3*npar+4)
outcome <- as.data.frame(tmp)
rm(tmp)
colnames(outcome) <- c(paste("st-", pnames[[1]], sep=""),
    paste("st-", pnames[[2]], sep=""),
    paste("st-", pnames[[3]], sep=""),
    "failnls", paste("nls-", pnames[[1]], sep=""),
    paste("nls-", pnames[[2]], sep=""),
    paste("nls-", pnames[[3]], sep=""), "ssnls",
    "failnlxb", paste("nlxb-", pnames[[1]], sep=""),
    paste("nlxb-", pnames[[2]], sep=""),
    paste("nlxb-", pnames[[3]], sep=""), "ssnlxb")


for (i in 1:nrep) {

  cat("Try ", i, ":\n")

  st <- runif(3)   # random start in the unit cube
  names(st) <- pnames
  if (debug) print(st)
  rnls <- try(nls(y ~ exp(a + b*x) + d, start=st, data=mydata), silent=TRUE)
  if (class(rnls) == "try-error") {
    failnls <- TRUE
    parnls <- rep(NA, length(st))
    ssnls <- NA
  } else {
    failnls <- FALSE
    ssnls <- deviance(rnls)
    parnls <- coef(rnls)
  }
  names(parnls) <- pnames
  if (debug) {
    cat("nls():")
    print(rnls)
  }
  rnlxb <- try(nlxb(y ~ exp(a + b*x) + d, start=st, data=mydata), silent=TRUE)
  if (class(rnlxb) == "try-error") {
    failnlxb <- TRUE
    parnlxb <- rep(NA, length(st))
    ssnlxb <- NA
  } else {
    failnlxb <- FALSE
    ssnlxb <- rnlxb$ssquares
    parnlxb <- rnlxb$coeffs
  }
  names(parnlxb) <- pnames
  if (debug) {
    cat("nlxb():")
    print(rnlxb)
    tmp <- readline()
    cat("\n")
  }
  solrow <- c(st, failnls=failnls, parnls, ssnls=ssnls,
              failnlxb=failnlxb, parnlxb, ssnlxb=ssnlxb)
  outcome[i, ] <- solrow
} # end loop

cat("Proportion of nls  runs that failed = ", sum(outcome$failnls)/nrep, "\n")
cat("Proportion of nlxb runs that failed = ", sum(outcome$failnlxb)/nrep, "\n")


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] new question

2013-03-15 Thread arun
Hi,
Try this:

directory <- "/home/arunksa111/dados"
# modified the function
GetFileList <- function(directory, number){
  setwd(directory)
  filelist1 <- dir()
  lista <- dir(directory, pattern = paste("MSMS_", number, "PepInfo.txt", sep=""),
               full.names = TRUE, recursive = TRUE)
  output <- list(filelist1, lista)
  return(output)
}

file.list.names <- GetFileList(directory, 23)[[1]]
lista <- GetFileList(directory, 23)[[2]]
FacGroup <- c(0,1,0,2,2,0,3)


ReadDir <- function(FacGroup){
  list.new <- lista[FacGroup != 0]
  read.list <- lapply(list.new, function(x) read.table(x, header=TRUE, sep = "\t"))
  names(read.list) <- file.list.names[FacGroup != 0]
  return(read.list)
}
ListFacGroup <- ReadDir(FacGroup)

z.boxplot <- function(lst){
  new.list <- lapply(lst, function(x) x[x$FDR < 0.01, ])   # keep rows with FDR below 0.01
  pdf("VeraBP.pdf")
  lapply(names(new.list), function(x) lapply(new.list[x], function(y)
    boxplot(FDR ~ z, data = y, xlab = "Charge", ylab = "FDR", main = x)))
  dev.off()
}
z.boxplot(ListFacGroup)

A.K.


From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Friday, March 15, 2013 2:08 PM
Subject: Re: new question


Sorry, you could give me a small new help?

Using the same data, I need a boxplot by groups.

I write here the functions I'm using. The last one (z.boxplot) is what I need;
the others are ok. Thank you one more time.

GetFileList <- function(directory, number){
  setwd(directory)
  filelist1 <- dir()[file.info(dir())$isdir]
  direct <- dir(directory, pattern = paste("MSMS_", number, "PepInfo.txt", sep=""),
                full.names = FALSE, recursive = TRUE)
  direct <- lapply(direct, function(x) paste(directory, "/", x, sep=""))
  lista <- unlist(direct)
  output <- list(filelist1, lista)
  return(output)
}

ReadDir <- function(FacGroup){
  list.new <- lista[FacGroup != 0]
  read.list <- lapply(list.new, function(x) read.table(x, header=TRUE, sep = "\t"))
  names(read.list) <- file.list.names[FacGroup != 0]
  return(read.list)
}

directory <- "C:/Users/Vera Costa/Desktop/dados.lixo"
file.list.names <- GetFileList(directory, 23)[[1]]
lista <- GetFileList(directory, 23)[[2]]
FacGroup <- c(0,1,0,2,2,0,3)
ListFacGroup <- ReadDir(FacGroup)
#zPValues(ListFacGroup,FacGroup)

z.boxplot <- function(lista) {
  # I need to eliminate all data with FDR above 0.01
  new.list <- lista[FDR < 0.01]
  # boxplots split by groups
  boxplot(FDR ~ z, data = dct1, xlab = "Charge", ylab = "FDR", main = paste(t, i))
}
z.boxplot(ListFacGroup)




2013/3/13 Vera Costa veracosta...@gmail.com

No problem!
Sorry my questions.



2013/3/13 arun smartpink...@yahoo.com

As I mentioned earlier, I don't find it useful to do anova on that kind of
data. Previously, I tried with chisq.test also; it gave warnings() and then
you responded that it was not correct. I would suggest you dput() an example
dataset of the specific columns that you want to compare (possibly by row)
and post it to the R-help list. If you get any reply, then you can implement
it on your whole list of files. Sorry, I am busy today.








From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Wednesday, March 13, 2013 9:43 AM

Subject: Re: new question


Ok. Thank you.
Could you help me to apply this?



2013/3/13 arun smartpink...@yahoo.com

You are comparing one data point to another; it doesn't make sense. For
anova, you need replications to calculate the degrees of freedom. Maybe you
could try chisq.test.
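
As a quick illustration of the replication point (the numbers here are made
up, not from your files): with a single observation per group there are zero
residual degrees of freedom, so no F test is possible.

d <- data.frame(value = c(0.42, 0.57), group = factor(c("A", "B")))
fit <- lm(value ~ group, data = d)
df.residual(fit)   # 0 -- no residual df left for an error estimate
anova(fit)         # so the F statistic cannot be computed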








From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Wednesday, March 13, 2013 8:56 AM

Subject: Re: new question


I agree with you.

I wrote these tests because I need to compare with some test. I agree it is
not very correct, but what is bioconductor? I need to eliminate some data
(rows) that are not very significant, based on some statistics. What about
your idea? How can I do this?



2013/3/13 arun smartpink...@yahoo.com

Ok.



"I need a t test (it's in this function). But I need a chisq.test corrected
and an Anova with the data in the attachment."

What do you mean by this?

Though I calculated the t test by comparing a single value against another
for each row, I don't think it makes sense statistically. Here you are
estimating the mean from just one value (that value then is the mean) and
comparing it with another value. It doesn't make much sense. I think there
are some packages in Bioconductor which do this kind of comparison (I don't
remember the names). Also, I am not sure what kind of inference you want from
a chi-square test, or from an anova test (using just 2 data points?), if the
comparison is row-wise.





From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Tuesday, March 12, 2013 6:04 PM

Subject: Re: new question


Ok. It isn't the last code...
You sent me this code

directory <- "/home/arunksa111/data.new"
# first function
filelist <- function(directory, number, list1){
  setwd(directory)
  filelist1 <- dir(directory)

  direct <- dir(directory, pattern = paste("MSMS_", number, "PepInfo.txt", sep=""),
                full.names = FALSE, recursive = TRUE)


Re: [R] missing values in an array

2013-03-15 Thread Ray Cheung
Thank you very much. Arun's reply is exactly what I need. Thank you once
again!~

ray

On Sat, Mar 16, 2013 at 12:31 AM, Berend Hasselman b...@xs4all.nl wrote:


 On 15-03-2013, at 17:08, Ray Cheung ray1...@gmail.com wrote:

  Dear All,
 
  I've an array with some missing values (NA) in between. I want to remove
  that particular matrix if a missing value is detected. How can I do so?
  Thank you very much.


 It is not clear what the dimension of your array is.

 If your array/matrix is two dimensional, then then

 any(is.na(A))  # A is the name of the array/matrix

 will return TRUE if at least one element of A is NA. And then you can
 delete A.

 If your array has three dimensions then you'll have to look at arun's
 solution.

 Berend
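
For readers without arun's earlier message, a minimal sketch of one way to do
this for a three-dimensional array; the array A and its dimensions are
invented for illustration:

A <- array(c(1:8, NA, 10:24), dim = c(2, 3, 4))    # slice 2 contains an NA
keep <- apply(A, 3, function(m) !any(is.na(m)))    # TRUE for slices with no NA
A.clean <- A[, , keep, drop = FALSE]               # drop every slice that has an NA
dim(A.clean)                                       # 2 3 3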

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.