Re: [R] Synthetic Control Method

2024-04-17 Thread Petr Pikal
Hallo Nadja

Like Bert, I do not know how the function works. The help page uses
synth.data in its example. Check whether your data structure is consistent
with synth.data by comparing str(synth.data) and str(INVESTMENTVOLUME).
The column names in both cases should match the names used in the dataprep
call, and the column types should also be the same.
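
A minimal sketch of that comparison, using a small made-up data frame in place of the Excel data (the values and the helper is_numeric_column are hypothetical; only the column names come from the post). One thing that may be worth checking, as an assumption and not something confirmed in this thread: read_excel() returns a tibble, and single-column `[` indexing on a tibble yields a data frame instead of a vector, so converting with as.data.frame() before dataprep might matter.

```r
# Hypothetical stand-in for the poster's data; only the column names follow the post.
foo <- data.frame(
  BFS  = c(1, 1, 2, 2),
  DATE = as.POSIXct(c("2010-01-01", "2010-02-01", "2010-01-01", "2010-02-01"),
                    tz = "UTC"),
  INVESTMENTVOLUME_12_MONTH_AVERAGE = c(10.5, 11.2, 9.8, 10.1)
)

# Structure overview, to be compared by eye with str(synth.data)
str(foo)

# First class of every column, one entry per column
col_classes <- vapply(foo, function(col) class(col)[1], character(1))
print(col_classes)

# Hypothetical helper: is the named column present and a plain numeric vector?
is_numeric_column <- function(df, name) {
  name %in% names(df) && is.numeric(df[, name])
}

unit_ok <- is_numeric_column(foo, "BFS")    # unit variable
time_ok <- is_numeric_column(foo, "DATE")   # POSIXct does not count as numeric
print(c(unit = unit_ok, time = time_ok))
```

If the same check fails on the real data, as.data.frame(INVESTMENTVOLUME) and a numeric time column would be the first things I would try.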

Cheers.
Petr





On Tue, 16 Apr 2024 at 9:58, the original poster wrote:

> Good Morning
>
>
>
> I want to perform a synthetic control method with R. For this purpose, I
> created the following code:
>
>
>
> # Re-load packages
>
> library(Synth)
>
> library(readxl)
>
>
>
> # Path to the Excel sheet
>
> excel_file_path <-
> ("C:\\Users\\x\\Desktop\\DATA_INVESTMENTVOLUMEN_FOR_R_WITHOUT_NA.xlsx")
>
>
>
> # Load the Excel file
>
> INVESTMENTVOLUME <- read_excel(excel_file_path)
>
>
>
> # Display the entire data frame
>
> print(INVESTMENTVOLUME)
>
>
>
> # Make sure BFS is numeric right before dataprep
>
> INVESTMENTVOLUME$BFS <- as.numeric(INVESTMENTVOLUME$BFS)
>
>
>
> # running dataprep
>
> dataprep_out <- dataprep(
>
>   foo = INVESTMENTVOLUME,
>
>   predictors = c("Predictor 1", " Predictor 2", " Predictor 3", " Predictor 4",
>                  " Predictor 5", " Predictor 6"),
>
>   special.predictors = list(list("Special Predictor 1", seq(1, 12, by = 1))),
>
>   dependent = "INVESTMENTVOLUME_12_MONTH_AVERAGE",
>
>   unit.variable = "BFS",
>
>   time.variable = "DATE",
>
>   treatment.identifier = ,
>
>   controls.identifier =
> unique(INVESTMENTVOLUME$BFS[-which(INVESTMENTVOLUME$BFS == )]),
>
>   time.predictors.prior = as.Date("2010-01-01"):as.Date("2017-10-01"),
>
>   time.optimize.ssr = as.Date("2010-01-01"):as.Date("2017-10-01"),
>
>   time.plot = as.Date("2010-01-01"):as.Date("2024-03-01"),
>
>   unit.names.variable = "BFS"
>
> )
>
>
>
> synth_out <- synth(
>
>   data.prep.obj = dataprep_out
>
> )
>
>
>
> I keep getting the same error message. Unfortunately, ChatGPT and solutions
> from various forums do not help. My unit variables are all numeric except
> one, which is a date and has POSIXct as type.
>
> Error in dataprep(foo = INVESTMENTVOLUME, predictors = c("Predictor 1", :
>
>
>
>  unit.variable not found as numeric variable in foo.
>
>
>
> I would be very grateful if you could help me with my problem.
>
>
>
> Thank you in advance for your efforts.
>
>
>
> Kind regards
>
> Nadja Delliehausen
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Network issue

2024-02-21 Thread Petr Pikal
Hallo James

Just a wild guess: are your problems connected with the change of the
default download method from wininet to libcurl?
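
A quick way to look at this from within R, as a sketch only (whether the download method is really the cause here is a guess):

```r
# Which download method is configured? NULL means R picks its built-in default.
current_method <- getOption("download.file.method")
print(current_method)

# Was this build of R compiled with libcurl support?
has_libcurl <- capabilities("libcurl")
print(has_libcurl)

# To try a different method for the current session only, one could set
# (left commented out; "wininet" exists on Windows only):
# options(download.file.method = "wininet")
```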

Cheers
Petr

On Tue, 20 Feb 2024 at 18:24, James Powell wrote:

>



Re: [R] help to program my function

2012-08-09 Thread Petr PIKAL
I am not allowed to connect to Nabble from work, sorry.

Petr

 
 hi peter the pdf folder is in
  http://r.789695.n4.nabble.com/file/n4639434/aj.pdf
 
 
 


Re: [R] NADA Package: Referencing Data Frame Columns

2012-08-09 Thread Petr PIKAL
Hi

 
  Specifically, since it has only a single detection indicator column
  (ceneq1), it implies that within any single sample either all the analytes
  were detected, or all were not. Not what I would expect.

 Don,

 I have been thinking about this and wondered whether the cast format was
 appropriate just for the reason there's only a single censored indicator.
 I'm glad you confirmed this as the/one problem.

  As to your larger question of which layout is appropriate for use with
  NADA functions, the answer is that either can be used. The trick is to
  use the appropriate syntax to extract the values needed to pass the data
  to a NADA function.

 And I've not discovered this in the time I spent trying different syntax.

  For the long format you subset the rows, then pass the appropriate
  columns. Here's one way:

    with(subset(chem, param=='AgDis'), ros(quant, ceneq1))

 This makes a lot of sense. I was thinking that I needed to create separate
 data frames for each parameter but subsetting on the fly is much more
 efficient and elegant.

  Hope this helps.

 It certainly does! Thanks.

  (p.s., I still think you'll be better off in the long run if you store
  site, param, and maybe era, as character objects, not factors.)

 I need to research this because I thought that factors were character
 objects used to partition quantities into groups.

It is partly a matter of opinion. I personally prefer factors, as they
seem handier to me.

> a <- sample(letters[1:5], 20, replace=TRUE)
> a
 [1] "c" "e" "c" "d" "c" "d" "b" "d" "d" "d" "b" "a" "d" "b" "c" "e" "e" "c" "c"
[20] "c"

Suppose I want to change all c and b to f.

> a.f <- as.factor(a)
> a.f
 [1] c e c d c d b d d d b a d b c e e c c c
Levels: a b c d e
> levels(a.f)
[1] "a" "b" "c" "d" "e"

With a factor I only need to change the relevant levels.

> levels(a.f)[2:3] <- "f"
> a.f
 [1] f e f d f d f d d d f a d f f e e f f f
Levels: a f d e
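
For comparison, a small sketch of the same recoding done both ways, on a character vector and on a factor. The seed is arbitrary and only makes the example repeatable; the names a.chr and a.f are just for illustration.

```r
set.seed(1)  # arbitrary seed so the sample is repeatable
a <- sample(letters[1:5], 20, replace = TRUE)

# Character vector: every matching element has to be replaced
a.chr <- a
a.chr[a.chr %in% c("b", "c")] <- "f"

# Factor: only the level labels are changed; equal labels are merged
a.f <- as.factor(a)
levels(a.f)[levels(a.f) %in% c("b", "c")] <- "f"

# Both routes give the same values
same_result <- identical(a.chr, as.character(a.f))
print(same_result)
print(levels(a.f))
```

Which one is handier is, as said, partly a matter of taste.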

Regards
Petr




 
 Regards,
 
 Rich
 


Re: [R] help, please! matrix operations inside 3 nested loops

2012-08-09 Thread Petr PIKAL
Hi

 thank you for your help.
 
 my input data looks like this (tab separated):
 
 Ind.nr.   Pop.nr.   scm266   rms1280   scm247   rms1107
 1   101   305   318   222   135
 1   101   305   318   231   135
 2   101   305   313   999   96
 2   101   305   321   999   130
 3   101   305   324   231   135
 3   101   305   324   231   135
 4   101   305   313   230   126
 4   101   305   313   230   135
 6   101   305   313   231   135
 6   101   305   321   231   135

Better to use dput(your.data) for sharing data. Anyway, I am still confused,
but you can probably clarify things further.

 
 it is a dataset with genetic marker alleles for single individuals.
 the first row is the header, all following rows are individuals. 2 rows
 count for 1 individual.
 first column is the individual's number, second column is the number for the
 population the individual comes from, and all following columns are different
 genetic markers.

 what i want to do with this data in R, is to compare one individual with

In those 2 rows for one individual the genetic marker sometimes differs:

> test[1:2, "scm247"]
[1] 222 231

What do you want to do with them?

 each of the other individuals, allele-wise. there are five possibilities:
 the two compared individuals share 4,3,2,1,0 alleles of the currently
 examined marker (=column). for each shared allele this pair of individuals
 shall get 1 scoring point. for each pair of individuals, all scoring points
 shall be summarized over all markers.

Based on your example,

> dput(test)
structure(list(Ind.nr. = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 6L,
6L), Pop.nr. = c(101L, 101L, 101L, 101L, 101L, 101L, 101L, 101L,
101L, 101L), scm266 = c(305L, 305L, 305L, 305L, 305L, 305L, 305L,
305L, 305L, 305L), rms1280 = c(318L, 318L, 313L, 321L, 324L,
324L, 313L, 313L, 313L, 321L), scm247 = c(222L, 231L, 999L, 999L,
231L, 231L, 230L, 230L, 231L, 231L), rms1107 = c(135L, 135L,
96L, 130L, 135L, 135L, 126L, 135L, 135L, 135L)), .Names = c("Ind.nr.",
"Pop.nr.", "scm266", "rms1280", "scm247", "rms1107"), class = "data.frame",
row.names = c(NA, -10L))

what is your desired result?

Regards
Petr


 
 
 my code again, modified according to your suggestions:

 #1) read in data:
 daten <- read.table('K:/Analysen/STRUCTURE/test.txt', header=TRUE, sep="\t")
 daten <- as.data.frame(daten)

 #2) create empty matrix:
 indxind <- matrix(0, nrow=617, ncol=617)
 indxind[1:20,1:19]

 #3) compare cells to each other, score:
 #for the whole dataset: s in 3:34, z1 in 1:617, z2 in 1:617
 for (s in 3:6) {   #walks through the matrix column by column, starting at column 3
   for (z1 in 1:6) {  #for each current column, take one row (z1)...
     for (z2 in 1:6) {  #...and compare it to another row (z2) of the current column
       if (z1!=z2) {
         topf <- indxind[z1,z2]
         if (daten[2*z1-1,s]==daten[2*z2-1,s]) topf <- topf+1  #actually, 2 rows make up 1 individual,
         if (daten[2*z1-1,s]==daten[2*z2,s]) topf <- topf+1    #therefore i compare 2 rows
         if (daten[2*z1,s]==daten[2*z2-1,s]) topf <- topf+1    #with another 2 rows
         if (daten[2*z1,s]==daten[2*z2,s]) topf <- topf+1
         indxind[z1,z2] <- topf
         indxind[z2,z1] <- topf
       }
       #print(c(s,z1,z2,indxind[1,2])) ##counts s, z1 and z2 properly, but gives always 8 for indxind[1,2]
     }
     #indxind[1:5,1:5] #empty matrix
   }
   #indxind[1:5,1:5] #empty matrix
 }

 #4) check:
 indxind[1:5,1:5]
 
 
 
 @ Michael Weylandt: i've done my best with regard to the big picture of my
 algorithm and the small reproducible example. i hope both are sufficient.
 @ Petr Pikal-3: in this case, there are only numerical values, but it's a
 useful hint for my other codes.
 @ Petr Pikal-3 and Berend Hasselman: initializing indxind with 0's instead
 of NAs helps, it fills something in indxind now. but it does the calculation
 only for the first marker (column 3), afterwards i get an error:
 Error in if (daten[2 * z1 - 1, s] == daten[2 * z2 - 1, s]) topf <- topf + 1 :
   missing value where TRUE/FALSE needed
 Has this something to do with the changing to daten <- as.data.frame(daten) in
 line 3 (instead of as.matrix before)?
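
One possible source of that error, shown as a sketch on a made-up four-row data frame (an assumption, since the full file is not shown in the thread): indexing a row past the end of a data frame returns NA, and if() cannot decide on NA. With 10 rows there are only 5 individuals, so z1 in 1:6 would reach rows 11 and 12.

```r
# Hypothetical 4-row example: 2 individuals, 2 rows each
daten <- data.frame(marker = c(305L, 305L, 313L, 321L))

# Row 5 does not exist, so the lookup gives NA instead of an error
beyond <- daten[5, "marker"]
print(is.na(beyond))

# Comparing with NA gives NA, and if() stops with an error on NA
outcome <- tryCatch(
  {
    if (daten[1, "marker"] == beyond) "equal" else "different"
  },
  error = function(e) "error"
)
print(outcome)

# Deriving the loop limit from the data avoids running past the last row
n_individuals <- nrow(daten) %/% 2
print(n_individuals)
```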
 
 
 

Re: [R] help, please! matrix operations inside 3 nested loops

2012-08-09 Thread Petr PIKAL
Hi

 all problems solved. thank you for your help!
 for the sake of completeness, here my solution:
 #1) read in data:
 daten <- read.table('K:/Analysen/STRUCTURE/test.txt', header=TRUE, sep="\t")
 daten <- as.data.frame(daten)

Not needed; daten is already a data frame.

 
 #2) create empty matrix:
 indxind <- matrix(0, nrow=617, ncol=617)
 #indxind[1:20,1:19]

 #3) compare cells to each other, score:
 #for the whole dataset: s in 3:34, z1 in 1:617, z2 in 1:617
 z1 <- 1 #running variable for rows in daten
 z2 <- 1 #running variable for rows in daten
 l1 <- 1 #running variable for rows in indxind
 l2 <- 1 #running variable for rows in indxind
 for (s in 3:6) {   #walks through the matrix column by column, starting at column 3
   while (z1 < 11) {  #for each current column, take one row (z1)...
     while (z2 < 11) {  #...and compare it to another row (z2) of the current column
       if (z1!=z2) {
         l1
         topf <- indxind[l1,l2]
         if (daten[z1,s]==daten[z2,s]) topf <- topf+1      #actually, 2 rows make up 1 individual,
         if (daten[z1,s]==daten[z2+1,s]) topf <- topf+1    #therefore i compare 2 rows
         if (daten[z1+1,s]==daten[z2,s]) topf <- topf+1    #with another 2 rows
         if (daten[z1+1,s]==daten[z2+1,s]) topf <- topf+1
         indxind[l1,l2] <- topf
       }
       z2 <- z2+2
       l2 <- l2+1
     }
     z2 <- 1
     l2 <- 1
     z1 <- z1+2
     l1 <- l1+1
   }
   z1 <- 1
   l1 <- 1
 }

 #4) check:
 indxind[1:5,1:5]

I believe the above loops can be simplified, maybe by changing your daten
to a three-dimensional array or by some clever **ply construction, but if
your loops work it is probably not worth the effort.
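
Just to illustrate what such a simplification could look like, here is a sketch using outer() on the ten-row example posted earlier in the thread. It is one possible variant, not a tested replacement for the loops on the full data; the names markers, first, second and score are made up, and 999 is treated as an ordinary allele value, as the loops do.

```r
# Example data as posted earlier in the thread (two rows per individual)
test <- data.frame(
  Ind.nr. = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 6L, 6L),
  Pop.nr. = rep(101L, 10),
  scm266  = rep(305L, 10),
  rms1280 = c(318L, 318L, 313L, 321L, 324L, 324L, 313L, 313L, 313L, 321L),
  scm247  = c(222L, 231L, 999L, 999L, 231L, 231L, 230L, 230L, 231L, 231L),
  rms1107 = c(135L, 135L, 96L, 130L, 135L, 135L, 126L, 135L, 135L, 135L)
)

# Marker columns only; odd rows hold the first allele, even rows the second
markers <- as.matrix(test[, -(1:2)])
odd     <- seq(1, nrow(markers), by = 2)
first   <- markers[odd, , drop = FALSE]
second  <- markers[odd + 1, , drop = FALSE]
n       <- nrow(first)

# For every marker, add the four allele-by-allele comparisons for all pairs
score <- matrix(0, n, n)
for (s in seq_len(ncol(markers))) {
  score <- score +
    outer(first[, s],  first[, s],  "==") +
    outer(first[, s],  second[, s], "==") +
    outer(second[, s], first[, s],  "==") +
    outer(second[, s], second[, s], "==")
}
diag(score) <- 0  # an individual is not compared with itself

print(score)
```

Only the loop over markers remains; the two loops over individuals are replaced by the outer() calls.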

Regards
Petr

 
 
 


Re: [R] process evaluation packages (slightly off topic)

2012-08-08 Thread Petr PIKAL
Thanks Bert

However, this package is quite simple; it is really just a plotting routine
for an already prepared list. The list can be read from a suitable file, but
that needs a prior step, preparing the timeline from the oriented graph, for
which I have not found a tool yet. The only way I know is pencil and paper
and manual transformation of the oriented graph into a kind of Gantt diagram.
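
A rough base-R sketch of that missing step, under strong simplifying assumptions: the graph has no cycles, and capacity limits (one stove, one pan) are ignored, so it only gives the earliest possible start and finish of each task. The task names, durations and the function schedule_tasks are all made up for illustration.

```r
# Hypothetical tasks with durations in minutes
tasks <- data.frame(
  task     = c("fry_onion", "fry_meat", "simmer", "boil_rice", "serve"),
  duration = c(10, 15, 30, 20, 5),
  stringsAsFactors = FALSE
)

# Oriented graph: 'from' has to be finished before 'to' can start
edges <- data.frame(
  from = c("fry_onion", "fry_meat", "simmer", "boil_rice"),
  to   = c("fry_meat", "simmer", "serve", "serve"),
  stringsAsFactors = FALSE
)

schedule_tasks <- function(tasks, edges) {
  start  <- setNames(rep(NA_real_, nrow(tasks)), tasks$task)
  finish <- start
  dur    <- setNames(tasks$duration, tasks$task)
  remaining <- tasks$task
  while (length(remaining) > 0) {
    # A task is ready once all of its predecessors have a finish time
    is_ready <- vapply(remaining, function(tk) {
      preds <- edges$from[edges$to == tk]
      all(!is.na(finish[preds]))
    }, logical(1))
    ready <- remaining[is_ready]
    if (length(ready) == 0) stop("the graph contains a cycle")
    for (tk in ready) {
      preds <- edges$from[edges$to == tk]
      start[tk]  <- if (length(preds) == 0) 0 else max(finish[preds])
      finish[tk] <- start[tk] + dur[tk]
    }
    remaining <- setdiff(remaining, ready)
  }
  data.frame(task = tasks$task, start = unname(start), finish = unname(finish))
}

timeline <- schedule_tasks(tasks, edges)
print(timeline)
```

The resulting start and finish columns are the kind of prepared list a Gantt plotting routine needs; the task with the largest finish time gives the total duration.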

Thank you again anyway. Maybe in time the plan package will evolve into a
more complex routine.

Best regards
Petr

 
 Search on R package Gantt, where you will find, among others,
 
 http://cran.r-project.org/web/packages/plan/plan.pdf
 
 -- Bert
 
 On Tue, Aug 7, 2012 at 1:28 AM, Petr PIKAL petr.pi...@precheza.cz wrote:
  Dear all

  I need to perform some process evaluation. Sorry for not posting data and
  code - I do not have any, I ask only for pointing me in the correct
  direction.

  Suppose I have several connected processes P1, P2, ..., Pn. Each process
  takes some time and has some capacity (let's say like preparing a dinner
  for several persons - only one stove, limited capacity of utensils,
  heating and cooling takes some time) and some processes can be cyclic (fry
  onion in pan, put it aside, in the same pan fry meat, put in the onion and
  some water and simmer for a while...).

  I can prepare some oriented graph (paper/pencil) or Word or drawing
  programme, I can also evaluate the whole process by shading spreadsheet
  cells but those two tasks are not connected.

  Is there any R package/other software suitable for simplifying or helping
  in such tasks? E.g. when I prepare an oriented graph with capacity and time
  for each node, is there any automatic way to transfer this graph to a
  timeline to see how long the whole process will take, where the
  bottlenecks are, or so?

  Thank you

  Petr
 
 
 
 
 -- 
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 
 Internal Contact Info:
 Phone: 467-7374
 Website:
 http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-
 biostatistics/pdb-ncb-home.htm



Re: [R] Decimal number

2012-08-08 Thread Petr PIKAL
Probably a problem in your settings, environment, or print function.

Here is what I get:
> x1 <- 64.90
> x2 <- 17.7025
> c(x1, x2)
[1] 64.9000 17.7025
> x <- c(x1, x2)
> x
[1] 64.9000 17.7025
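
A small sketch of how the print settings change what is shown without changing what is stored. The values are the ones from the question; whether a digits setting explains the output there is only a guess.

```r
x1 <- 64.90614
x2 <- 17.7025
x  <- c(x1, x2)

print(getOption("digits"))   # number of significant digits used by default
print(x)                     # printed with the default setting
print(x, digits = 2)         # fewer significant digits shown

# The stored values are untouched by the way they are printed
unchanged <- identical(x, c(64.90614, 17.7025))
print(unchanged)
```

If a plain print of the vector still shows something else, it is worth checking with str() that the object really is what it is expected to be.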


Regards
Petr


 
 HI
 
 i have a little problem please help me to solve it
 
 this is the code in R:
 
 > beta0
 [1] 64.90614
 > beta1
 [1] 17.7025
 > beta
 [1] 17 64

 here beta <- c(beta0, beta1)
 
 thank you in advance
 hafida
 
 
 


Re: [R] process evaluation packages (slightly off topic)

2012-08-08 Thread Petr PIKAL
Thanks Rui

igraph seems to be more related to my problem, so I will try to play with 
it.

Regards

Petr

 
 Hello,
 
 There's also, apparently for simple needs, plotrix::gantt.chart
 See an example in
 http://addictedtor.free.fr/graphiques/sources/source_74.R
 
 To transfer graphs to timelines, etc, (or anything graph related) 
 package igraph.
 
 Hope this helps,
 
 Rui Barradas
 
 On 07-08-2012 14:46, Bert Gunter wrote:
  Search on R package Gantt, where you will find, among others,
 
  http://cran.r-project.org/web/packages/plan/plan.pdf
 
  -- Bert
 
  On Tue, Aug 7, 2012 at 1:28 AM, Petr PIKAL petr.pi...@precheza.cz wrote:
  Dear all

  I need to perform some process evaluation. Sorry for not posting data and
  code - I do not have any, I ask only for pointing me in the correct
  direction.

  Suppose I have several connected processes P1, P2, ..., Pn. Each process
  takes some time and has some capacity (let's say like preparing a dinner
  for several persons - only one stove, limited capacity of utensils,
  heating and cooling takes some time) and some processes can be cyclic (fry
  onion in pan, put it aside, in the same pan fry meat, put in the onion and
  some water and simmer for a while...).

  I can prepare some oriented graph (paper/pencil) or Word or drawing
  programme, I can also evaluate the whole process by shading spreadsheet
  cells but those two tasks are not connected.

  Is there any R package/other software suitable for simplifying or helping
  in such tasks? E.g. when I prepare an oriented graph with capacity and time
  for each node, is there any automatic way to transfer this graph to a
  timeline to see how long the whole process will take, where the
  bottlenecks are, or so?

  Thank you

  Petr
 


Re: [R] help to program my function

2012-08-08 Thread Petr PIKAL
Hi

Maybe it is time for you to read some basic material such as R intro. It
seems to me that you expect R to behave like some other language you know,
but that expectation is probably wrong.

See inline

 
 HI
 
 i have a problem please help me to solve it: 
 http://r.789695.n4.nabble.com/file/n4639434/aj.pdf aj.pdf 
 
 i want to calculate the vecteur a[j] where j: 1...8
 
 this is the code in R:
 
 > aj.fun <- function(j, i, X, z, E, beta0, beta1){
 + n <- length(X)
 + iX <- order(X)
 + iz <- order(z)
 + e1 <- -(beta)*z[ iz[1:(i - 1)] ]

Where do you get beta from? It is not among the function's arguments.

 + numer <- E[j] - sum( X[ iX[1:(i - 1)] ] * exp(e1) )
 + e2 <- -(beta)*z[ iz[i:n] ]
 + denom <- sum( exp(e2) )
 + numer/denom
 + }
 
 > iX <- order(X)
 > iX
  [1] 75 37 29 60 73 20 69 55 30 70 72 38 26 35 65 61 74 50 71 57 25 54 64 76 56
 [26] 58 48 67 46 63 28 62 36 49 47 66  1 42 41 19 39 43 22 51 68 33 27  9 15 11
 [51] 10 59 32 40 45 44 52 16 18 34  4 53 21 23 31  7  6 13 14 12 17 24  5  8  2
 [76]  3

 > iZ <- order(Z)
 > iZ
  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
 [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
 [51] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
 [76] 76

 > e1 <- -(beta)*Z[ iZ[1:(i - 1)] ]
 Warning message:
 In 1:(i - 1) : numerical expression has 76 elements: only the first used

As somebody already mentioned, i is probably a vector, and in that case only
its first value is used. The first value of i seems to be 3.

 > e1
 [1]  -442 -1664

 > numer <- E[j] - sum( X[ iX[1:(i - 1)] ] * exp(e1))
 Warning message:
 In 1:(i - 1) : numerical expression has 76 elements: only the first used
 > numer
 [1] 9.5 9.5 9.5 9.5 9.5 9.5 9.5 9.5

Here j is a vector of 8 values, hence the 8 results.

 > e2 <- -(beta)*Z[ iZ[i:n] ]
 Warning message:
 In i:n : numerical expression has 76 elements: only the first used
 > e2
  [1]  -442 -1664  -442 -1792  -476 -1792  -476 -1792  -510 -1920  -510 -1920
 [13]  -510 -1920  -510 -1920  -510 -1920  -510 -2048  -544 -2048  -544 -2048
 [25]  -544 -2048  -544 -2048  -544 -2048  -544 -2048  -544 -2048  -578 -2176
 [37]  -578 -2176  -578 -2176  -578 -2176  -578 -2176  -578 -2176  -578 -2176
 [49]  -578 -2176  -578 -2176  -578 -2304  -612 -2304  -612 -2304  -612 -2304
 [61]  -612 -2304  -612 -2304  -612 -2304  -612 -2304  -646 -2432  -646 -2432
 [73]  -646 -2432  -646 -2432

Strange: here the first value of i seems to be 1, as n should be 76 and the
final e2 has length 76.

 > denom <- sum( exp(e2) )

 > numer/denom
 [1] 4.313746e+192 4.313746e+192 4.313746e+192 4.313746e+192 4.313746e+192
 [6] 4.313746e+192 4.313746e+192 4.313746e+192

 my problem that the vecteur a[j] could not have the same number!!!

I do not understand. Your numer is 9.5 repeated 8 times. If you divide it
by a single number, you will get the same number eight times.

You sent us code but no data, so it is difficult to understand what your
goal is. It would be better to send the input data

j, i, X, z, E, beta0, beta1

and the expected result as a whole, not in chunks scattered across several
mails.

Regards
Petr


 
 
 thank you in advance
 hafida
 
 
 


[R] Odp: help, please! matrix operations inside 3 nested loops

2012-08-08 Thread Petr PIKAL
Hi

 
 hello, this is my script:
 
 #1) read in data:
 daten <- read.table('K:/Analysen/STRUCTURE/input_STRUCTURE_tab_excl_5_282_559.txt',
                     header=TRUE, sep="\t")
 daten <- as.matrix(daten)

If any column contains non-numeric values, as.matrix will convert all the
numeric values of the daten data frame to character values.
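
A tiny illustration with made-up data frames (df_num and df_mix are hypothetical, not your data):

```r
# All columns numeric: the matrix stays numeric
df_num <- data.frame(id = 1:3, marker = c(305, 313, 999))
m_num  <- as.matrix(df_num)
print(is.numeric(m_num))

# One character column: the whole matrix becomes character
df_mix <- data.frame(id = 1:3, marker = c(305, 313, 999),
                     pop = c("A", "B", "A"), stringsAsFactors = FALSE)
m_mix  <- as.matrix(df_mix)
print(is.character(m_mix))
print(class(m_mix[1, "marker"]))
```

Comparisons with == still run on such a character matrix, but they compare strings, which is easy to overlook.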

 
 #2) create empty matrix:
 indxind <- matrix(nrow=617, ncol=617)
 indxind[1:20,1:19]

 #3) compare cells to each other, score:
 for (s in 3:34) {   #walks through the matrix column by column, starting at column 3
   for (z1 in 1:617) {  #for each current column, take one row (z1)...
     for (z2 in 1:617) {  #...and compare it to another row (z2) of the current column
       if (z1!=z2) {
         topf <- indxind[z1,z2]
         if (daten[2*z1-1,s]==daten[2*z2-1,s]) topf <- topf+1  #actually, 2 rows make up 1 individual,
         if (daten[2*z1-1,s]==daten[2*z2,s]) topf <- topf+1    #therefore i compare 2 rows
         if (daten[2*z1,s]==daten[2*z2-1,s]) topf <- topf+1    #with another 2 rows
         if (daten[2*z1,s]==daten[2*z2,s]) topf <- topf+1
         indxind[z1,z2] <- topf
         indxind[z2,z1] <- topf
       }

The above code is rather clumsy, and it is difficult to understand what it
should do without extensive study. AFAIU you first set topf to NA and then
try to add 1 to it. The result is again NA, regardless of your sophisticated
z construction. You are therefore just computing NA in each cycle, so you
cannot expect any result other than NA.
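
In miniature (a sketch with a 3 x 3 matrix instead of 617 x 617; the variable names are made up):

```r
# matrix() without data is filled with NA
indxind_na <- matrix(nrow = 3, ncol = 3)
na_result  <- indxind_na[1, 2] + 1     # NA + 1 is still NA
print(na_result)

# Starting from 0 instead lets the counter actually count
indxind_zero <- matrix(0, nrow = 3, ncol = 3)
zero_result  <- indxind_zero[1, 2] + 1
print(zero_result)
```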


       #print(c(s,z1,z2,indxind[1,2])) ##counts s, z1 and z2 properly, but gives NA for indxind[1,2]
     }
     #indxind[1:5,1:5] #empty matrix
   }
   #indxind[1:5,1:5] #empty matrix
 }

 #4) check:
 indxind[1:5,1:5]

 this results no errors, but my matrix indxind remains empty (only NAs).
 though all columns and rows are counted properly. R needs quite a while to
 get through all this (there are probably smarter and faster ways to
 calculate this but i am not too deep into R and bioinformatics, and i need
 to calculate this only once). could the 3 for-loops already be too

What is this? Please try to set up a small example showing what you have
and what you want to achieve. Unless you can explain better what you want,
you probably will not get a better answer.

I may, however, be proven wrong, as some clever people on this list are far
better at mind reading than I am :-)

Regards
Petr


 computationally intense for adding matrix operations?
 
 any help would be much appreciated!
 
 thx, frido
 
 
 


Re: [R] help to program my function

2012-08-08 Thread Petr PIKAL
Hi

Please cc your post to the R-help list as well; others may give you a
better or quicker answer.

 
 
 HI peter
 there is the function  that i want to programm (joint in pdf folder).

No pdf allowed.

 my data is  dataexpv  Ti1  265.792  26 1579.523  26 
2323.704 
 28   68.855  28  426.076  28  110.297  28  108.298  28 1067.609  30 17.
 0510 30   22.6611 30   21.0212 30  175.8813 30  139.0714 30  144.1215 30 
 
 20.4616 30   43.4017 30  194.9018 30   47.3019 307.7420 320.4021 

 32   82.8522 329.8823 32   89.2924 32  215.1025 321.7526 32 0.
 7927 32   15.9328 323.9129 320.2730 320.6931 32  100.5832 32 
 
 27.8033 32   13.9534 32   53.2435 340.9636 344.1537 340.1938 

 340.7839 348.0140 34   31.7541 347.3542 346.5043 34 8.
 2744 34   33.9145 34   32.5246 343.1647 344.8548 342.7849 34 
 
 4.6750 341.3151 34   12.0652 34   36.7153 34   72.8954 361.9755 
36
 0.5956 362.5857 361.6958 362.7159 36   25.5060 360.3561 
36
 0.9962 363.9963 363.6764 362.0765 360.9666 365.3567 
36
 2.9068 36   13.7769 380.4770 380.7371 381.4072 380.7473 
38
 0.3974 38  !
1.1375 380.0976 382.38

Use dput(your.data) and copy its output into the mail. It is directly
readable by R and not scrambled like the above.
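
For example, with a small made-up data frame (the values below are invented, not your data):

```r
# Hypothetical data standing in for the real object
your.data <- data.frame(x = c(1.5, 2.5, 4), group = c(26L, 26L, 28L))

# This prints plain text that can be pasted into a mail
dput(your.data)

# The reader gets the object back by evaluating that text
pasted   <- paste(capture.output(dput(your.data)), collapse = "\n")
restored <- eval(parse(text = pasted))
print(all.equal(restored, your.data))
```

The recipient only needs to copy the printed structure(...) text from the mail and assign it to a name.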

 NB X <- Ti

 thanks for helping me hafida

Not at all. I did not help much yet.

Regards
Petr

 
 Date: Wed, 8 Aug 2012 03:13:28 -0700
 From: ml-node+s789695n463956...@n4.nabble.com
 To: hafida...@hotmail.fr
 Subject: Re: help to program my function
 
 
 
Hi
 
 
 Maybe it is time for you to read some basic stuff like R intro. It seems 

 
 to me that you expect R to behave like some other language you know but 
 
 probably your expectation is wrong.
 
 
 See inline
 
 
  
 
  HI
 
  
 
  i have a problem please help me to solve it: 
 
  http://r.789695.n4.nabble.com/file/n4639434/aj.pdf aj.pdf 
 
  
 
  i want to calculate the vecteur a[j] where j: 1...8
 
  
 
  this is the code in R:
 
  
 
  aj.fun - function(j, i, X, z, E, beta0, beta1){
 
  + n - length(X)
 
  + iX - order(X)
 
  + iz - order(z)
 
  + e1 - -(beta)*z[ iz[1:(i - 1)] ]
 
 where do you get beta
 
 
  + numer - E[j] - sum( X[ iX[1:(i - 1)] ] * exp(e1) )
 
  + e2 - -(beta)*z[ iz[i:n] ]
 
  + denom - sum( exp(e2) )
 
  + numer/denom
 
  + }
 
  
 
   iX-order(X)
 
   iX
 
   [1] 75 37 29 60 73 20 69 55 30 70 72 38 26 35 65 61 74 50 71 57 25 54 

 
 64 76
 
  56
 
  [26] 58 48 67 46 63 28 62 36 49 47 66  1 42 41 19 39 43 22 51 68 33 27 
9 
 
 15
 
  11
 
  [51] 10 59 32 40 45 44 52 16 18 34  4 53 21 23 31  7  6 13 14 12 17 24 
5 
 
  8 
 
  2
 
  [76]  3
 
  
 
   iZ-order(Z)
 
   iZ
 
   [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 

 
 23 24
 
  25
 
  [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 

 
 48 49
 
  50
 
  [51] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 

 
 73 74
 
  75
 
  [76] 76
 
  
 
  e1 - -(beta)*Z[ iZ[1:(i - 1)] ]
 
  Warning message:
 
  In 1:(i - 1) : numerical expression has 76 elements: only the first 
used
 
 
 As somebody already mentioned i is probably vector and in this case only 

 
 first value is taken. i seems to have the firs value 3.
 
 
   e1
 
  [1]  -442 -1664
 
  
 
   numer - E[j] - sum( X[ iX[1:(i - 1)] ] * exp(e1))
 
  Warning message:
 
  In 1:(i - 1) : numerical expression has 76 elements: only the first 
used
 
   numer
 
  [1] 9.5 9.5 9.5 9.5 9.5 9.5 9.5 9.5
 
 
 Here j is vector of 8 values therefore 8 values
 
 
  
 
   e2 - -(beta)*Z[ iZ[i:n] ]
 
  Warning message:
 
  In i:n : numerical expression has 76 elements: only the first used
 
   e2
 
   [1]  -442 -1664  -442 -1792  -476 -1792  -476 -1792  -510 -1920  -510 

 
 -1920
 
  [13]  -510 -1920  -510 -1920  -510 -1920  -510 -2048  -544 -2048  -544 

 
 -2048
 
  [25]  -544 -2048  -544 -2048  -544 -2048  -544 -2048  -544 -2048  -578 

 
 -2176
 
  [37]  -578 -2176  -578 -2176  -578 -2176  -578 -2176  -578 -2176  -578 

 
 -2176
 
  [49]  -578 -2176  -578 -2176  -578 -2304  -612 -2304  -612 -2304  -612 

 
 -2304
 
  [61]  -612 -2304  -612 -2304  -612 -2304  -612 -2304  -646 -2432  -646 

 
 -2432
 
  [73]  -646 -2432  -646 -2432
 
 
 Strange, here first value of i seems to be 1 as n shall be 76 and final 
e2 
 
 length is 76. 
 
 
   denom - sum( exp(e2) )
 
 
 
   numer/denom
 
  [1] 4.313746e+192 4.313746e+192 4.313746e+192 4.313746e+192 
 
 4.313746e+192
 
  [6] 4.313746e+192 4.313746e+192 4.313746e+192
 
  
 
  my problem that the vecteur a[j] could not have the same number!!!
 
 
 I do not understand. Your numer is 9.5 repeted 8 times. If you divide it 

 
 by one number you will get nine times the same number.
 
 
 You sent us code but no data, so it is difficult to understand what your
 goal is. It would be better to send the input data

 j, i, X, z, E, beta0, beta1

 and the expected result as a whole, not in chunks scattered over several
 mails.
 
 
 Regards
 
 

Re: [R] How to convert data to 'normal' if they are in the form of standard scientific notations?

2012-08-07 Thread Petr PIKAL
Hi

 
 Dear Jean
 
 Thanks a lot for your help.
 
 The reason I did not provide reproducible code is that my work started with
 reading in some large csv files, i.e. the data were not created by myself.
 But the data are from the same data provider, so I would expect to receive
 data in exactly the same format.

 I use read.csv to read the data in. My main question is: using exactly the
 same code as in my email, i.e. 'as.factor', why does it work for one of
 them (converting the numerical data to a factor) while the other one
 remains numerical with scientific notation? So, in R, how do I check
 whether the data formats differ between these two files in their original
 csv files, which might cause the different results?

 I also tried your code and created some reproducible examples, but I still
 cannot make it work as in your example

a <- c(2.0e+9,2.1e+9)
is.numeric(a)
[1] TRUE
a.f <- factor(a)
is.numeric(a.f)
[1] FALSE
is.factor(a.f)
[1] TRUE

so the factor is created correctly

print(a,digits=12)
[1] 2.0e+09 2.1e+09

print(a, digits=21)
[1] 20.000 21.000

b <- c(3000,3100)
print(b,digits=5)
[1] 3.0e+07 3.1e+07

So the printed result depends probably on your local setting.
See also ?options help page. And maybe also ?format and ?sprintf
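
A short sketch of the functions mentioned above, assuming the goal is to show
a large number in full rather than in scientific notation. The value is the
one from the original question; format() and sprintf() return character
strings, while options(scipen = ...) only changes how numbers are printed:

```r
x <- 20110911001084

# Character representations in fixed notation
s1 <- format(x, scientific = FALSE)
s2 <- sprintf("%.0f", x)

# Penalise scientific notation when printing, then restore the old setting
old <- options(scipen = 100)
print(x)
options(old)
```

The underlying number is the same in every case; only its displayed form
changes.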

Regards
Petr


 
 
  a <- c(2.0e+9,2.1e+9)
  print(a,digits=4)
  [1] 20 21  # I expected to see 2.0e+9 here...?
  print(a,digits=7)
  [1] 20 21  # Think here I should expect same 2.0e+9?
  getOption("digits")  # Checking my default number of digits now..
  [1] 7
  b <- c(3000, 3100)
  print(b)
  [1] 3000 3100   # This is what I expected to see
  print(b,digits=5)
  [1] 3000 3100   # I'm so confused why it is not working, e.g. printing 3.0e+9!
  getOption("digits")   # checking again, but now I would expect it has being changed to 5
  [1] 7
 
 
 Any thoughts please...?
 
 Thanks
 HJ
 
 
 On Mon, Aug 6, 2012 at 7:04 PM, Jean V Adams jvad...@usgs.gov wrote:
 
  HJ,
 
  You don't provide any reproducible code, so I had to make up my own.
 
  dat <- data.frame(a=letters[1:5], x=c(20110911001084, 20110911001084,
  20110911001084, 20110911001084, 20110911001084),
  y=c(2.10004e+12, 2.10004e+12, 2.10004e+12, 2.10004e+12,
  2.10004e+12))
 
  In my example, the long numbers print out without scientific notation.
 
  dat
a  x y
  1 a 20110911001084 210004000
  2 b 20110911001084 210004000
  3 c 20110911001084 210004000
  4 d 20110911001084 210004000
  5 e 20110911001084 210004000
 
  I can make it print with scientific notation using the digits argument 
to
  the print() function.
 
  print(dat, digits=3)
ax   y
  1 a 2.01e+13 2.1e+12
  2 b 2.01e+13 2.1e+12
  3 c 2.01e+13 2.1e+12
  4 d 2.01e+13 2.1e+12
  5 e 2.01e+13 2.1e+12
 
  What is your default number of digits?
  getOption("digits")
 
  Jean
 
 
  HJ YAN yhj...@googlemail.com wrote on 08/06/2012 11:14:17 AM:
 
  
   Dear R users
  
   I read two csv data files into R and called them Tem1 and Tem5.

   For the first column, data in Tem1 have 13 digits whereas in Tem5 there
   are 14 digits for each observation.

   Originally they are 'numeric', as can be seen in my code below. But how
   can I display/convert them in a form other than scientific notation,
   which seems to be the standard/default?

   I want them to be in a form like '20110911001084', but I'm very confused
   why, when I use the 'as.factor' call, it works for my 'Tem1' but not for
   'Tem5'...??
  
  
   Many thanks!
  
   HJ
  
   Tem1[1:5,1]
   [1] 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12
   Tem5[1:5,1]
   [1] 2.011091e+13 2.011091e+13 2.011091e+13 2.011091e+13 2.011091e+13
   class(Tem1[1:5,1])
   [1] "numeric"
   class(Tem5[1:5,1])
   [1] "numeric"
   as.factor(Tem1[1:5,1])
   [1] 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12
   Levels: 2.10004e+12
   as.factor(Tem5[1:5,1])
   [1] 20110911001084 20110911001084 20110911001084 20110911001084 20110911001084
   Levels: 20110911001084
 
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[R] process evaluation packages (slightly off topic)

2012-08-07 Thread Petr PIKAL
Dear all

I need to perform some process evaluation. Sorry for not posting data and
code - I do not have any; I am only asking to be pointed in the right
direction.

Suppose I have several connected processes P1, P2, ..., Pn. Each process
takes some time and has some capacity (say, like preparing a dinner for
several persons - only one stove, limited capacity of utensils, heating and
cooling take some time), and some processes can be cyclic (fry onion in a
pan, put it aside, fry meat in the same pan, add the onion and some water
and simmer for a while...).

I can draw a directed graph (paper and pencil, Word or a drawing
programme), and I can also evaluate the whole process by shading
spreadsheet cells, but those two tasks are not connected.

Is there any R package or other software suitable for simplifying or
helping with such tasks? E.g. when I prepare a directed graph with capacity
and time for each node, is there any automatic way to transfer this graph
to a timeline to see how long the whole process will take, where the
bottlenecks are, and so on?

Thank you

Petr



[R] Odp: Problem with segmented function

2012-08-06 Thread Petr PIKAL
Hi
 
 Hi,
 
 I appreciate your help with the segmented function. I am relatively new to
 R. I followed the introduction to the 'segmented' package by Vito Muggeo,
 but it still does not work.
 Here are the lines I wrote:

 data_test <- data.frame(x=c(1:10),y=c(1,1,1,1,1,2,3,4,5,6))
 lr_test <- lm(y~x,data_test)
 seg_test <- segmented(lr_test,seg.Z~x,psi=1)

You did not read the help page correctly. seg.Z is a named argument in which
you specify a formula without a LHS. psi should be an x value near the
expected slope change.

seg_test <- segmented(lr_test,seg.Z=~x,psi=5)

works correctly.

Regards
Petr

 
 
 /error in segmented.lm(lr_test, seg.Z ~ x, psi = 1) : 
  A wrong number of terms in `seg.Z' or `psi'/
 
 Thank you very much,
 Stella
 
 
 


Re: [R] how to identify values from a column of a dataframe, and insert them in other data.frame with the corresponding id?

2012-08-06 Thread Petr PIKAL
Hi

It is better to use dput when presenting data to others. You probably want
?merge.

Something like

merge(datuak, datuak2, by = "calee_id", all.x=TRUE)

However, calee_id seems to be a floating point number and it may be rounded,
so you should beware of that.
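
A minimal sketch of such a merge, with made-up data. The id values below are
hypothetical and are stored as character strings, which is one way (an
assumption, not something the poster did) to sidestep the floating point
rounding concern. Because kg_totales is repeated for each calee_id, the
lookup table is reduced to unique rows first:

```r
# Hypothetical data; ids kept as character to avoid rounding problems
datuak <- data.frame(
  poids    = c(10, 20, 3),
  calee_id = c("1270000000001", "1270000000001", "1270000000002"),
  stringsAsFactors = FALSE
)
datuak2 <- data.frame(
  calee_id   = c("1270000000001", "1270000000001", "1270000000003"),
  kg_totales = c(129.7, 129.7, 55.0),
  stringsAsFactors = FALSE
)

# One row per calee_id, so the merge does not multiply rows of datuak
lookup <- unique(datuak2[, c("calee_id", "kg_totales")])

# all.x = TRUE keeps every row of datuak; ids missing in datuak2 get NA
merged <- merge(datuak, lookup, by = "calee_id", all.x = TRUE)
merged
```

When reading the csv files, something like colClasses = "character" for the
id column in read.csv would keep the ids as text from the start.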

Regards
Petr

 
 Thank you very much John, can you read it now?
 
 Hello,
 
 I'd like to do the following, see if you could help me please:
 I have a csv called datuak with an id called calee_id and a column
 called poids.
 I have another csv called datuak2 with the same id called calee_id
 (although there are calee_id that are in datuak but not in datuak2
 and vice versa), and a column called kg_totales in which the values are
 repeated for each calee_id because they are the sum of the column kg for
 each row.
 
 I show you the table datuak and datuak2:
 
 Datuak (in the example the calee_id is the same, but there are a lot):
 
 poids  calee_id  maree_id
 10     1.27E+12  0.3013157
 20     1.27E+12  0.05726046
 20     1.27E+12  0.73631699
 25     1.27E+12  0.74492002
 3      1.27E+12  0.74492002
 27     1.27E+12  0.31776439
 43     1.27E+12  0.31776439
 
 
 Datuak2:
 
   calee_id     maree_id     kg_totales  effectif
 1 1.33959e+12  0.782835873  129.7       30
 2 1.33959e+12  0.782835873  129.7       40
 3 1.33959e+12  0.782835873  129.7       10
 4 1.33959e+12  0.782835873  129.7       5
 5 1.33959e+12  0.782835873  129.7       1.7
 6 1.33959e+12  0.782835873  129.7       20
 7 1.33959e+12  0.782835873  129.7       20
 8 1.33959e+12  0.782835873  129.7       1
 9 1.33959e+12  0.782835873  129.7       2
 
 I would like to identify in the csv datuak2 the corresponding
 calee_id that also are in datuak, and create a new column in
 datuak with the values for each calee_id from kg_totales, and not
 repeat them.
 So the final table would be datuak, with calee_id, poids, and the
 new column kg_totales with its corresponding value for each row.
 
 Thank you very much,
 Nerea
 
 -Mensaje original-
 De: John Kane [mailto:jrkrid...@inbox.com] 
 Enviado el: 03 August 2012 20:17
 Para: Nerea Lezama; r-help@r-project.org
 Asunto: RE: [R] how to identify values from a column of a dataframe, and
 insert them in other data.frame with the corresponding id?
 
 Hi Nerea,
 
 For some reason your post is badly garbled and close to impossible to
 read.
 Perhaps you need to check your text encoding?
 
 Also to send sample data it is better to use the dput() command.
 Do dput(myfile) and then paste the results into your email
 
 Sorry not to be of more help.
 
 John Kane
 Kingston ON Canada
 
 
  -Original Message-
  From: nlez...@azti.es
  Sent: Fri, 3 Aug 2012 12:34:07 +0200
  To: r-help@r-project.org
  Subject: [R] how to identify values from a column of a dataframe, and 
  insert them in other data.frame with the corresponding id?
  
  
  
  Hello,
  
  I'd like to do the following, see if you could help me please:
  I have a csv called "datuak" with an id called "calee_id" and a
  column called "poids".

  I have another csv called "datuak2" with the same id called
  "calee_id" (although there are "calee_id" that are in
  "datuak" but not in "datuak2" and vice versa), and a column
  called "kg_totales" in which the values are repeated for each
  calee_id because they are the sum of the column "kg" for each row.

  I show you the table "datuak" and "datuak2":

  Datuak (in the example the calee_id is the same, but there are a lot):

  poids  calee_id  maree_id
  10     1.27E+12  0.3013157
  20     1.27E+12  0.05726046
  20     1.27E+12  0.73631699
  25     1.27E+12  0.74492002
  3      1.27E+12  0.74492002
  27     1.27E+12  0.31776439
  43     1.27E+12  0.31776439

  Datuak2:

    calee_id     maree_id     kg_totales  effectif
  1 1.33959e+12  0.782835873  129.7       30
  2 1.33959e+12  0.782835873  129.7       40
  3 1.33959e+12  0.782835873  129.7       10
  4 1.33959e+12  0.782835873  129.7       5
  5 1.33959e+12  0.782835873  129.7       1.7
  6 1.33959e+12  0.782835873  129.7       20
  7 1.33959e+12  0.782835873  129.7       20
  8 1.33959e+12  0.782835873  129.7       1
  9 1.33959e+12  0.782835873  129.7       2

  I would like to identify in the csv "datuak2" the corresponding
  "calee_id" that also are in "datuak", and create a new column
  in "datuak" with the values for each "calee_id" from
  "kg_totales", and not repeat them.

  So the final table would be "datuak", with "calee_id",
  "poids", and the new column "kg_totales"

Re: [R] Help on merging without a common variable

2012-08-03 Thread Petr PIKAL
Hi

OTOH I wonder why cbind gives an error, as the OP told us.

x <- data.frame(x = 1:5)
y <- data.frame(y = 6:15)

merge(x,y)
cbind(x,y)

These give different results, but without any error.
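
A small sketch of the difference, using the same x and y. The third data
frame z is a hypothetical addition, included only to show one case where
cbind does fail (row counts that are not multiples of each other):

```r
x <- data.frame(x = 1:5)
y <- data.frame(y = 6:15)

m <- merge(x, y)   # no common column: every row of x paired with every row of y
b <- cbind(x, y)   # 10 is a multiple of 5, so x is recycled twice

dim(m)
dim(b)

# Hypothetical case: 5 and 7 rows cannot be recycled to a common length
z <- data.frame(z = 1:7)
res <- try(cbind(x, z), silent = TRUE)
inherits(res, "try-error")
```

So an error from cbind would be expected only when the two data frames have
row counts that do not divide evenly.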

Regards
Petr

 
 On Thu, Aug 2, 2012 at 4:52 PM, Ayyappa Chaturvedula
 ayyapp...@gmail.com wrote:
  Michael,
  Thank you for this, it worked. I was thinking that by is a required
  argument in the merge function.
 
 
 Well, it is required in a strict sense, but it has a default value
 (defaulting to the shared names) so _you_ don't have to specify it.
 
 Best,
 Michael
 
  Regards,
  Ayyappa
 
 


Re: [R] Sum two Vectors of different length

2012-08-03 Thread Petr PIKAL
Hi

Your description is quite long but almost uninformative about what you
really want.

You do not say which values you want to sum or what the final vector length
should be, only that it does not matter which value is added to which.

Based on this I would just use simple +

x <- 1:10
y <- 1:9
x+y
 [1]  2  4  6  8 10 12 14 16 18 11
Warning message:
In x + y : longer object length is not a multiple of shorter object length

The warning message is not an error; it tells you that the shorter vector is
recycled, i.e. its first 1, 2 or 3 elements are used twice in the
calculation. Therefore the last value is 11, which is 10 from the x vector
+ 1 from the y vector.

If you have some other constraints that you failed to tell us, please do so
to get a better suited answer.
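
If recycling is not what is wanted, the two options from the question can be
sketched as below. The function names sum_common and sum_padded are made up
for this illustration, and padding with 0 is an assumption (the poster
suggested duplicating existing values instead):

```r
# (a) ignore the extra elements: add only over the common length
sum_common <- function(a, b) {
  n <- min(length(a), length(b))
  a[seq_len(n)] + b[seq_len(n)]
}

# (b) pad the shorter vector to the longer length before adding
sum_padded <- function(a, b, fill = 0) {
  n <- max(length(a), length(b))
  a <- c(a, rep(fill, n - length(a)))
  b <- c(b, rep(fill, n - length(b)))
  a + b
}

x <- 1:10
y <- 1:9
r_common <- sum_common(x, y)   # 9 values
r_padded <- sum_padded(x, y)   # 10 values, the last is 10 + 0
```

Both functions work regardless of which of the two vectors is the shorter
one, so no if statements are needed at the call site.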

Regards
Petr

 
 Dear all,
 in one part of my code I want to sum two vectors element-wise
 the problem is that either the 1st vector or the 2nd vector always have 
 one or two less elements
 
 example of my problem
 
 In TotalVector + (datalist2[[1]]$dataset$Results[[j]]$Results[[time]]$Sweep) :
   longer object length is not a multiple of shorter object length
 Browse[1]> str(TotalVector)
  int [1:10308] 3032 3048 3075 2978 3026 3012 2933 2987 3063 3038 ...
 Browse[1]> str(datalist2[[1]]$dataset$Results[[j]]$Results[[time]]$Sweep)
  int [1:10307] 2 1 3 1 5 6 3 1 0 2 ...
 
 
 As you can see, the two vectors differ by only one element. As the sample
 is quite large, it would make no difference if I ignored the one extra
 element. There are times, though, when 2 or 3 elements are missing (but
 the number is always small enough to be ignored).

 The major concern is that this difference can be in either the first
 vector or the second vector. If I try to solve that with simple if
 statements the code turns into spaghetti... Is there a simple way, when
 there is this length difference, either to

 a. ignore the extra elements
 -or-
 b. add the missing elements to the shorter vector (one can just duplicate
 some of the existing values to reach the needed length)
 
 How I can do either a or b?
 
 I would like to thank you in advance for your help
 
 Regards
 Alex
 


Re: [R] Chopping a vector up into smaller vectors

2012-08-02 Thread Petr PIKAL
Hi

One possible option:

f <- function(x, parts) split(x, rep(seq_along(parts), parts))
f(9:1, c(3,2,4))
$`1`
[1] 9 8 7

$`2`
[1] 6 5

$`3`
[1] 4 3 2 1

You can also check if your parts vector agrees with x vector, if you want.
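
A sketch of the same idea with the length check included. The name chop is
borrowed from the original question; dropping the list names with unname()
is a choice made here so the result prints like the poster's output:

```r
chop <- function(x, parts) {
  # the part sizes must account for every element of x
  stopifnot(sum(parts) == length(x))
  # rep() builds a group label per element; split() cuts x by those labels
  unname(split(x, rep(seq_along(parts), parts)))
}

pieces <- chop(9:1, c(3, 2, 4))
pieces
```

Calling chop(9:1, c(3, 2)) stops with an error instead of silently recycling
the group labels.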

Regards
Petr


 
 Anyone got a neat way to chop a vector up into smaller subvectors?
 This is what I have now, which seems inelegant:
 
 chop <- function(v, counts) {
   stopifnot(sum(counts)==length(v))
   end <- cumsum(counts)
   beg <- c(1, 1+end[-length(end)])
   begend <- cbind(beg, end)
   apply(begend, 1, function(x) v[x[1]:x[2]])
 }
 
 
  chop(9:1, c(3,2,4))
 [[1]]
 [1] 9 8 7
 
 [[2]]
 [1] 6 5
 
 [[3]]
 [1] 4 3 2 1
 


Re: [R] how to calculate seasonal mean for temperatures

2012-08-01 Thread Petr PIKAL
Hi

Something like

aggregate(DF$data, list(quarters(DF$date), format(DF$date, "%Y")), mean)
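
A self-contained sketch of this call on made-up data. The two years of daily
values below are hypothetical (each value is simply the month number, so the
means are easy to check by hand), and the grouping columns are given names
for readability:

```r
# Hypothetical daily series covering two full years
dates <- seq(as.Date("2006-01-01"), as.Date("2007-12-31"), by = "day")
DF <- data.frame(date = dates, data = as.numeric(format(dates, "%m")))

# Mean per calendar quarter and year; the quarters match the seasons as
# defined in the question (Jan-Mar, Apr-Jun, Jul-Sep, Oct-Dec)
res <- aggregate(DF$data,
                 list(quarter = quarters(DF$date),
                      year    = format(DF$date, "%Y")),
                 mean)
res
```

Because the grouping is derived from each date itself, the start year of a
file and leap years do not need any special handling.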

Regards
Petr

 
 Hello everybody,
 
 I need to calculate seasonal means with temperature data for my work. 
 I have 70 files coming from weather stations, which looks like this for
 example:
 
 startdate <- as.POSIXct("01/01/2006", format = "%d/%m/%Y")
 enddate <- as.POSIXct("05/01/2006", format = "%d/%m/%Y")
 date <- seq(from = startdate, to = enddate, by = "days", format = "%d/%m/%Y")

 DF <- data.frame(data=c(2.5,1.4,3.6,0.5,-1.2),date=date)
 
 With this daily data, I need to calculate seasonal means.
 By season I mean: Winter (January, February, March); Spring (April, May,
 June); Summer (July, August, September) and Autumn (October, November,
 December).

 My main problem is that my files do not all start and end in the same year
 (some of them start on 1st January 2006 and end on 31st December 2008,
 some start on 1st January 2007 and end on 31st December 2011, ...).

 So the years differ, but all of them start on a 1st January and end on a
 31st December.

 I'd first like to delete (or ignore) the first 2 months (January and
 February) and the last month (December) of all my files, because I cannot
 calculate a seasonal mean for them (not all 3 months are present).
 But the problem with the first 2 months is leap years (with 29th
 February). For example, if my file starts in 2008, the first 2 months will
 not have the same length as in files starting in 2007 or 2006. So I cannot
 just delete the first lines of my files, because there would be a problem
 for these leap years.
 And then I'd like to calculate my seasonal means over each 3 months (as I
 showed before).
 For example, my seasonal means object should look like this: Spring 2006:
 xx ; Summer 2006: xx, ... (with xx my seasonal means).

 Have you any idea how to do this? I found functions such as xts(), but
 they need a year to be specified, so in my case that would not work. I
 need to automate this for all my files, so it should not depend on the
 start year.
 Thanks a lot!
 
 
 
 
 
 
 


Re: [R] repeating a function across a data frame

2012-08-01 Thread Petr PIKAL
Hi

Did you find the function dist? It seems that it can do directly what you
want.
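
A minimal sketch with made-up readings, assuming (as the question suggests)
that samples are in columns and readings in rows. dist() works on rows, so
the data are transposed first; dividing by sqrt(number of readings) is the
poster's own scaling:

```r
set.seed(1)
# Hypothetical data: 5 readings (rows) for 4 samples (columns), values in [0, 1]
readings <- matrix(runif(20), nrow = 5,
                   dimnames = list(NULL, paste0("s", 1:4)))

# Euclidean distance between samples, scaled so that the maximum is 1
d <- dist(t(readings)) / sqrt(nrow(readings))
dm <- as.matrix(d)
dm

# Same value computed by hand for samples 1 and 2
manual <- sqrt(sum((readings[, 1] - readings[, 2])^2)) / sqrt(nrow(readings))
```

If the data frame also contains non-numeric columns (for example sample
labels), they would have to be dropped before calling dist().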

Regards
Petr

 
 Hello everyone.  Like others on this list, I'm new to R, and really not 
much
 of a programmer, so please excuse any obtuse questions!  I'm trying to
 repeat a function across all possible combinations of vectors in a data
 frame.  I'd hugely appreciate any advice!
 
 Here's what I'm doing:
 
 I have some data: 40 samples, ~460 000 different readings between 1 and 
0
 for each sample.  I would like to make R spit out a matrix of distances
 between the samples.  So far, I have made a function to calculate the
 distance between any two samples:
 
 DistanceCalc <- function(x, y) {
   # x and y are both vectors - the entire reading set for sample x and
   # sample y, respectively
   distance <- sqrt(sum((x - y)^2))
   # to force the maximum possible value to = 1
   distanceCorrected <- distance / sqrt(length(x))
   print(distanceCorrected)
 }
 
 The next thing I want to do is to make this function run to compare all
 possible combinations of my samples (1vs1, 1vs2, 1vs3...2vs1, 2vs2 etc). 
 In
 python, the only other programming language I have ever used, I would 
just
 use a for loop.  I have asked the internet how to do this, but the
 overwhelming response seems to be you don't want to do it like that - 
use
 the 'apply' functions.  I've tried to use the apply functions, but I 
tend
 to find that I can only give my DistanceCalc function a single vector (I 
can
 tell it where to find x, but not where to find y, or vice versa).  I've 
also
 found the 'by' and the 'outer' functions, but I'm likewise failing at 
making
 those work, e.g.
 
  
 distancetable <- outer(DataWithoutBlanks,DataWithoutBlanks,FUN=DistanceCalc)
 Error in x - y : non-numeric argument to binary operator
 
 I think this may be because my data has headers and the function is 
trying
 to calculate the difference between the names of my samples, but I don't
 know how to correct this.
 
 Would really appreciate your help!
 
 Jen
 
 
 


Re: [R] problem loading siar

2012-07-04 Thread Petr PIKAL
Hi
 
 Hello,
 
 It is not what happens.
 
 Function convexhull exists in both siar and spatstat packages. As 
 you already load spatstat, when you are loading siar, the 
 convexhull in spatstat is masked by the one in siar.
 
 Thus, when you will run convexhull function, it will be the one from 
 the siar package.

AFAIK it is possible to use both functions; you just need to specify which
package you want the function from.

I do not use it, but I believe it is mentioned in the docs and it was
discussed before on the help list too.

Probably
spatstat::convexhull()
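
The same mechanism can be sketched without installing either package. Here
a made-up function named sd masks the one from the stats package, which is
only an analogy to siar masking spatstat's convexhull:

```r
# A hypothetical definition that masks stats::sd in the current session
sd <- function(x) "masked"

r_masked <- sd(c(1, 2, 3))          # the masking definition is found first
r_stats  <- stats::sd(c(1, 2, 3))   # pkg::name reaches the stats version

rm(sd)                              # removing the mask restores normal lookup
```

The message about masked objects is therefore informational; nothing has to
be "unmasked" as long as the package is named explicitly.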

Regards
Petr


 
 Regards
 
 
 Le 04/07/2012 15:24, Sukran yalcin ozdilek a écrit :
  Hi,
 
I have a problem while loading the siar program in R.
 
 
 
  When I am loading siar, system does not load convexhull. On the screen 
I
  have seen such writings.
 
 
 
  The following object(s) are masked from ‘package:spatstat’:
 
   convexhull
 
 
 
  How can I load the convexhull, how can I unmask from this package? I 
will
  be appreciated if you give advice about this.
 
 
  Best
  Sukran
 


Re: [R] ggplot2: legend

2012-07-04 Thread Petr PIKAL
Hi

I do not have a direct answer. You should probably search the ggplot2 web
site. Searching for legend gave me about sixty results, from which you could
probably learn how to modify the legend(s) as you wish.

e.g.
http://had.co.nz/ggplot2/docs/opts.html

Regards
Petr

 
 Dear all,
 
 I produced the following graph with ggplot which is almost fine, yet I 
 don't like that the legend for Means and Observations includes a 
line,
 though no line is used in the plot for those two (the line for Overall 
 Mean on the other hand is wanted):
 
 library(ggplot2)
 ddf <- data.frame(x = factor(rep(LETTERS[1:2], 5)), y = rnorm(10))
 p <- ggplot(ddf, aes(x = x, y = y))
 p + geom_point(aes(colour = "Observations", shape = "Observations")) +
   stat_summary(aes(colour = "Means", shape = "Means"),
                fun.y = mean, fun.ymin = min, fun.ymax = max,
                geom = "point") +
   geom_hline(aes(yintercept = mean(y), linetype = "Overall Mean"),
              show_guide = TRUE)
 
 I tried to map the linetype in geom_point (and stat_summary) to NULL and 
I
 tried to set it to blank but neither worked.
 
 Is it furthermore possible to combine the two legends? My preferred plot 

 would have two symbols with different colours and shapes for Means and 

 Observations (but no line) and directly below that a line for Overall 

 Mean (that is all the three items in one legend)? For that I tried to 
 assign the same names to the legends but this did not work either.
 
 So any help would be highly appreciated.
 
 Kind Regards,
 
 Thorn Thaler
 Mathematician
 
 Applied Mathematics 
 Nestec Ltd,
 Nestlé Research Center
 PO Box 44 
 CH-1000 Lausanne 26
 Phone: +41 21 785 8220
 Fax: +41 21 785 9486
 


Re: [R] A challenging question: merging excel files under a specific pattern

2012-07-04 Thread Petr PIKAL
Hi

Well, this is a help list for R, not for Excel; maybe you should contact the
Microsoft guys. I believe the easiest way would probably be to write a
simple macro in Excel.

If you want to do the merging in R, you should go through the help pages for

read.xls, merge, cbind, rbind and the R Data Import/Export manual.

Regards
Petr

 
 Dear all,
 
 I have an excel file that contains 6 sheets
 
 1,2,3,4,5,6
 
 The analysis is repeated every 3 sheets
 
 Sheets 1, 2, 3:
 
 I want to add (horizontally) the data contained in the matrix
 sheet2(5:end,3:end) of Sheet2 to sheet3, such that the first element of
 the matrix sheet2(5:end,3:end) occupies the location/cell
 sheet3(5,end+1) of sheet3.

 Say that the output from this merging is sheetA. Then I want to add
 horizontally the data contained in the matrix sheet1(5:end,3:end)
 of Sheet1 to sheetA, such that the first element of the matrix
 sheet1(5:end,3:end) occupies the location/cell sheetA(5,end+1)
 of sheetA.

 As you can see

 1) I add sheet2(5:end,3:end) to sheet3 at location sheet3(5,end+1)

 2) then I add sheet1(5:end,3:end) to the output sheetA that results
 from the merging of sheets 2 and 3 at location sheetA(5,end+1).

 3) The output is named, say, sheetAA
 
 Similarly analysis holds for the other block of sheets 4,5,6. That is,
 
 Sheets 4, 5, 6:
 
 1)I add sheet5 (5:end,3:end ) to sheet6 at location sheet6(5,end+1)
 
 2) then I add sheet4 (5:end,3:end ) to the output sheetB that results
 from the merging of sheets 5 and 6 at location sheetB((5,end+1).
 
 3) The output is named, say, sheetBB
 
 In my case I have a large sequence of sheets that I have to merge in
 this way. That is,
 
 1,2,3,4,5,6,7,8,9,10,11,12,13,…
 
 But the logic is the same as described above.
 
 Is there any “easy” way to do that kind of merging? . Because when you
 have 13*3 =39 sheets is a bit tedious to do that merging manually.
 
 thanks
 


Re: [R] subset data based on values in multiple columns

2012-07-03 Thread Petr PIKAL
Hi

 
 Dear list members,
 
 I am trying to create a subset of a data frame based on conditions in 
two 
 columns, and after spending much time trying (and search R-help) have 
not 
 had any luck. Essentially, I have a data frame that is something like 
this:
 
 date <- as.POSIXct(as.character(c("2012-01-25", "2012-01-25", "2012-01-26",
   "2012-01-27", "2012-01-27", "2012-01-27")))
 time <- as.POSIXct(as.character(c("13:20", "13:40", "14:00", "10:00",
   "10:20", "10:20")), format="%H:%M")
 count <- c(12,14,11,12,12,8)
 data <- data.frame(date,time,count)

 which looks like:

   date       time     count
 1 2012-01-25 13:20:00 12
 2 2012-01-25 13:40:00 14
 3 2012-01-26 14:00:00 11
 4 2012-01-27 10:00:00 12
 5 2012-01-27 10:20:00 12
 6 2012-01-27 10:20:00  8
 
 I would like to create a subset by doing the following: for each unique 
 date, only include one case which will be the case with the max value 
for 
 the column labelled count. So the resulting subset would be:
 
   date       time     count
 2 2012-01-25 13:40:00 14
 3 2012-01-26 14:00:00 11
 4 2012-01-27 10:00:00 12
 
 Some dates have two cases at which the count was the same, but I only 
 want to include one case (I don't really mind which case it chooses, but 

 if need be it could be based on the earliest time for which the same 
 counts occurred). I have tried various loops with no success! I'm sure 
 that there is an easy answer that I have not found! Any help is much 
appreciated!!

Just a few days ago a similar question was asked (selecting rows by maximum
value of one variable in a data frame nested by another variable). Here is
what was recommended, adapted to your count column.

do.call(rbind,lapply(split(data, data$date), function(x)
x[which.max(x$count),]))
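
A runnable sketch of this approach on the data from the question. For
simplicity the date is stored as Date and the time as plain text here, which
is an assumption and differs from the POSIXct columns in the original post:

```r
data <- data.frame(
  date  = as.Date(c("2012-01-25", "2012-01-25", "2012-01-26",
                    "2012-01-27", "2012-01-27", "2012-01-27")),
  time  = c("13:20", "13:40", "14:00", "10:00", "10:20", "10:20"),
  count = c(12, 14, 11, 12, 12, 8),
  stringsAsFactors = FALSE
)

# split() makes one data frame per date; which.max() returns the position
# of the first maximum, so ties are resolved in favour of the earlier row
res <- do.call(rbind,
               lapply(split(data, data$date),
                      function(x) x[which.max(x$count), ]))
res
```

Since the rows are already in time order within each date, taking the first
maximum also gives the earliest time among tied counts.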

Regards
Petr


 
 All the best,
 
 Chandra
 
 


Re: [R] Is it possible to remove this loop? [SEC=UNCLASSIFIED]

2012-07-03 Thread Petr PIKAL
Hi

 Hi all,
 
 I would like create a new column in a data.frame (a1) to store 0, 1 data 

 converted from a factor as below.
 
 a1$h2 <- NULL
 for (i in 1:dim(a1)[1]) {
   if (a1$h1[i]=="H") a1$h2[i] <- 1 else a1$h2[i] <- 0
   }
 
 My question: is it possible to remove the loop from above code to 
achieve 
 the desired result?

Untested

a1$h2 <- (a1$h1 == "H") * 1
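
A small self-contained check of the vectorised version, as a sketch only: the contents of a1 below are invented, since the original data were not posted.

```r
# Hypothetical stand-in for the poster's data frame
a1 <- data.frame(h1 = factor(c("H", "L", "H", "M")))

# The comparison yields a logical vector; as.integer() turns TRUE/FALSE
# into 1/0 for all rows at once, so no loop is needed
a1$h2 <- as.integer(a1$h1 == "H")
a1
```

Unlike the loop with if (), this form does not stop with an error when h1 contains NA; the corresponding h2 value is simply NA.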

Regards
Petr

 
 Thanks in advance,
 Jin
 


Re: [R] About Error message

2012-07-02 Thread Petr PIKAL
 
 Hi again!
 I have a question about R.
 I fitted GAMs with the mgcv package in a previous version of R and saved 
 the workspace. This workspace contains different models, and I want to use 
 these GAMs for prediction.
 
 However, I installed a new version of R and used the same workspace. When I 
 type summary(models), this error message is shown:
 Error in Predict.matrix.cr.smooth(object, dk$data) :  F is missing from cr
 smooth - refit model with current mgcv.
 
 The workspace worked normally with the previous version of R. What's 
 wrong?!

Hi

Maybe some packages are missing (not installed) in the new installation. Try 
to install all the packages you used in your previous work and then start R 
again. Note that the message itself also suggests refitting the model with 
the current mgcv.

Regards
Petr

 Thank in advance.
 
 
 


Re: [R] geom_boxplot

2012-07-02 Thread Petr PIKAL
Hi

In the new ggplot2 version the following works too:

p + geom_boxplot(aes(fill = factor(cyl))) +
   labs(fill = "Cylinders") + ylab("Miles per Gallon") +
   xlab("Number of Cylinders")

Regards
Petr

 
 Yes you can do all of the things you want. 
 
 Below is a start, to give you an idea of how to approach some of it.
 
 library(ggplot2)
 p <- ggplot(mtcars, aes(factor(cyl), mpg))
 p <- p + geom_boxplot(aes(fill = factor(cyl))) +
   labs(fill = "Cylinders") +
   scale_y_continuous("Miles per Gallon") +
   scale_x_discrete("Number of Cylinders")
 p
 
 
 Have a look at 
 stackoverflow.com/questions/3606697/how-to-set-x-axis-limits-in-ggplot2-r-plots
 for x and y axes limits. 
 
 It took me a while to realise it but, generally, I find that it is not too
 hard to find examples of what you need by just googling something like 
 "ggplot2 set x and y limits" or "ggplot2 geom_bar colour" and so on. 
 
 The ggplot2 and geom_XXX are pretty unique on the internet and search 
 results usually are not too bad. 
 
 You may also want to subscribe to the ggplot2 group on google groups.
 
 Best wishes
 
 
 John Kane
 Kingston ON Canada
 
 
  -Original Message-
  From: hannah@gmail.com
  Sent: Sun, 1 Jul 2012 08:39:20 -0400
  To: r-help@r-project.org
  Subject: Re: [R] geom_boxplot
  
  Also, it is possible to change ylim also?
  
  2012/7/1 li li hannah@gmail.com
  
  Dear all,
I have a few questions regarding the boxplot output from the
  geom_boxplot function.
  Attached is the output I get. Below are my questions:
  
1. How can I define the xlab and ylab myself?
   Also I would like to remove factor(variable)
  line on the right side.
  
2. How can I define the colors of the boxplots myself.
   For example, I want to use blue for
  LR, green for pair and purple for BR1.
Thanks so much!
  Hannah
  
  


Re: [R] R sub query

2012-07-02 Thread Petr PIKAL
Hi

I am not at all an expert in regular expressions but

gsub("^[[:punct:]]+[[:digit:]]+:", "", m)

does the output you want. Maybe by chance :-)
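
For anyone who wants to check this, here is a sketch on the example matrix from the question. It uses a slightly more explicit pattern than the one above (a literal ".:" followed by digits and ":"), which is an assumption about what the prefix always looks like; the name out is made up.

```r
m <- matrix(c(".:0:0,0", ".:2:0,2", ".:194:193,1", ".:56:0,56",
              ".:58:50,8", ".:13:0,13", ".:114:114,0", ".:75:75,0"),
            nrow = 2)

# "^\\.:" matches the literal leading ".:", "[0-9]+:" the varying number
# and the colon after it; only this prefix is removed
out <- m
out[] <- sub("^\\.:[0-9]+:", "", m)
out
```

Assigning into out[] keeps the matrix shape explicitly, whatever the replacement function returns.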

Regards
Petr

 
 Hello,
 I would like to substitute a substring of characters defined by a specific
 start and end sequence. 
 i.e. in the example matrix below, I would like to substitute ".:X:" with 
 "", where X varies in sequence...
  
 m <- matrix(c(".:0:0,0", ".:2:0,2", ".:194:193,1", ".:56:0,56", 
 ".:58:50,8", ".:13:0,13", ".:114:114,0", ".:75:75,0"), nrow=2)
  
 output required:
      [,1] [,2]  [,3] [,4] 
 [1,] 0,0  193,1 50,8 114,0
 [2,] 0,2  0,56  0,13 75,0 
  
 Thank you for any help
 Sarah


Re: [R] Help

2012-06-29 Thread Petr PIKAL
Hi

It seems to me that this can be done with the ggplot2 package. However, I do 
not understand what "three boxplots one on top of another" means. How could 
you see the bottom boxplot when it is overplotted twice?

library(ggplot2)
p <- ggplot(mtcars, aes(factor(cyl), mpg)) 
p + geom_boxplot(aes(fill = factor(vs)))

Regards
Petr


 
 Dear all,
   I need some help on plotting multiple boxplots on one figure.
   I have three matrix A, B and C. Each of them is a 1000 by 10 matrix.
 The 10 columns of all three matrix correspond to the
 10 values of the same parameter, say k=1, ..., 10.
   I want to make a plot where x axis represents different values of k.
 For each k value, I want to plot three boxplots, one on top of another.
 For example, for k=1, I want to draw three boxplot based on the first
 column of A, B and C respectively. Similarly, I do the same for the rest of
 k values.
   Can some one give me some hint on this?
   Thank you so much.
 Hannah
 


Re: [R] Reading a new dataset using R

2012-06-29 Thread Petr PIKAL
Hi

 
 Dear Sir,
 
 We are able to read the dataset in syn.data (species: yeast) using R, but we
 want to read a new dataset using R.
 We are not able to do that; please tell us the procedure for reading a new
 dataset of a new species.

Did you try it in a similar way as you read the first dataset? What went 
wrong?

Regards
Petr

 After reading the dataset we will use the  minet for generating the MI
 matrix of that new dataset.
 


Re: [R] how to add the sample number in the hist figure

2012-06-29 Thread Petr PIKAL
Hi
 
 hi,R-users:
 Now I plot some data with the names (aveobsdata) in columns. How can I add
 some number (e.g. the sample number) to each of the columns?
 plot(aveobsdata, type='h', lwd=line_width, col=line_col, main=titleinfo,
      xlab=xxlab, ylab=yylab, xaxt = "n")

We do not have aveobsdata, hence we cannot try your code. You should 
probably look at the function

?text

Something like

text(xcoord, ycoord, "some number", ...)
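
Since aveobsdata was not posted, the following is only a sketch with invented values (counts, label_x) showing how text() can put a number above each vertical line of a type = "h" plot.

```r
# Hypothetical data standing in for aveobsdata
counts <- c(a = 12, b = 7, c = 15)
label_x <- seq_along(counts)

pdf(NULL)  # draw to a null device; drop this line to see the plot on screen
plot(counts, type = "h", lwd = 4, xaxt = "n",
     ylim = c(0, max(counts) * 1.15),   # head room for the labels
     xlab = "", ylab = "count")
axis(1, at = label_x, labels = names(counts))
# pos = 3 places each label just above the point (x, y)
text(x = label_x, y = counts, labels = counts, pos = 3)
invisible(dev.off())
```

The extra 15 % in ylim is an arbitrary choice so that the labels above the tallest line are not clipped.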

Regards
Petr


 axis(1, at = 1:nums, label = name)
 
 


Re: [R] Printing a variable in a loop

2012-06-28 Thread Petr PIKAL
Hi
 
 Thanks for your reply Jon. 
 
 I need to actually do more than print the name of the variable (I just made
 the example simpler). I need to manipulate var_1, var_2 etc. by setting
 values of NA to 0. 

Why? R has a pretty strong system for handling NAs. The only exception AFAIK 
is cumsum:
> x <- 1:10
> sum(x)
[1] 55
> x[5] <- NA
> sum(x)
[1] NA
> sum(x, na.rm=T)
[1] 50
> cumsum(x)
 [1]  1  3  6 10 NA NA NA NA NA NA

 
  So as you said, 1. if you want to display the variable, just type it 

 var_1
 
 
 But how do I do this in a loop where the  number portion of var is the
 counter in the loop?

You should probably get familiar with the list concept, which is another 
strong feature of R. You can easily work through list elements either with 
the *apply functions or in a for loop just by indexing.
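
A minimal sketch of the list approach, assuming the var_1, var_2, ... objects are numeric vectors; the values below are invented.

```r
# Keep the variables as elements of one list instead of separate objects
vars <- list(var_1 = c(1, NA, 3), var_2 = c(NA, 5), var_3 = c(7, 8, NA))

# The loop counter indexes the list directly
for (i in seq_along(vars)) {
  print(vars[[i]])
}

# If NA really has to become 0, do it for all elements at once
vars0 <- lapply(vars, function(v) { v[is.na(v)] <- 0; v })

# Often the replacement is unnecessary, because na.rm handles NA
sums <- sapply(vars, sum, na.rm = TRUE)
sums
```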

 
 Thanks!
 Kat
 


Re: [R] Adjusting length of series

2012-06-28 Thread Petr PIKAL
Hi

I have used R for quite a long time and, as far as I remember, I have never 
needed such an assign/paste construct in a loop. Instead of polluting the 
environment with plenty of objects named something(i)somethingelse, it is 
always advisable to use lists.

When you want to shorten variables to some common length (by cutting off 
some portion of them), you can do it easily:

> lll <- list(a=1:10, b=1:9, c=1:8)
> lll
$a
 [1]  1  2  3  4  5  6  7  8  9 10

$b
[1] 1 2 3 4 5 6 7 8 9

$c
[1] 1 2 3 4 5 6 7 8

# length of the shortest variable in lll
> min(sapply(lll, length))
> sapply(lll, "[", 1:min(sapply(lll, length)))
 a b c
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
[4,] 4 4 4
[5,] 5 5 5
[6,] 6 6 6
[7,] 7 7 7
[8,] 8 8 8
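
Applied to the differencing problem from the question, the same idea might look like the sketch below. The series Cred is simulated here (the real data were not posted), and the sketch drops the leading values with tail() so that the ends of the series line up, which is an assumption about what is wanted; the code above keeps the first elements instead.

```r
set.seed(1)
Cred <- cumsum(rnorm(136))   # hypothetical series of length 136

# Successive differences stored in one list instead of DCred1, DCred2, ...
d <- vector("list", 3)
d[[1]] <- diff(Cred)
for (i in 2:3) d[[i]] <- diff(d[[i - 1]])
names(d) <- paste0("DCred", 1:3)

# Trim every element to the length of the shortest one
n_min <- min(sapply(d, length))
aligned <- sapply(d, tail, n = n_min)
dim(aligned)
```

The result is a matrix with one column per differenced series, which can go straight into a regression.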

Regards
Petr


 
 Dear R Users,
  
 I ask the following question in order to learn more on the use of 'assign'
 and 'paste' functions and for loop; otherwise what I am asking could be 
 solved by binding the various first differences of the series using the 
 'ts.union' operator.
  
 The problem is:
 I have several variables in my dataset, which I should model dynamically -
 i.e., with lags of differences of the time series in the regression 
 equation. Consequently, I used a loop (on which I got help from Sarah 
 Goslee) to difference them.
  
 Using the same variable as in my previous post, the first differences are 
 computed as follows:
  
  DCred1 <- diff(Cred, difference=1)   # call this the FIRST LOOP
  for(i in 2:5){
    print(assign(paste("DCred", i, sep=""), diff(get(paste("DCred", i-1,
      sep="")), difference=1)))
  }
 
 NB:  I converted the series to time series using 'ts' before differencing.
  
 Now after obtaining first differences, I try to use the 'assign' and 
 'paste' function in two 'for loops' to adjust the lengths of lagged terms 
 (DCred1, DCred2, etc) to have the same length to be used in a regression 
 model. My code is:
  
 for(i in 1:3){   # call this the SECOND LOOP
 for(j in 3:1){
 print(assign(paste("Dcre", i, sep=""), get(paste("DCred", i, 
 sep=""))[j:(136-i)]))
 }
 }
  
 NB: The length of the original series Cred (before differencing) equals 
 136. This is why the last term in the assign expression is (136 - i).
 NB: I run this loop after running the first loop which computes the first 
 differences, so that the 'get' operator obtains DCred1, DCred2, etc from 
 the results of the first loop.
  
 With this code, I expected to get DCred1 (whose length is 135) and 
 adjust its length to equal that of DCred3 (which is 133) and call the 
 result 'Dcre1'. Similarly, I intended to get DCred2 (whose length is 134) 
 and adjust its length to 133 (the same as the length of DCred3) and call 
 it Dcre2. Lastly, I would get DCred3 of length 133 and call it Dcre3, with
 length 133. When I run this code, it runs successfully. However, when I 
 then check the lengths of Dcre1, ..., Dcre3, I get:
  
  length(Dcre1)
 [1] 135
  length(Dcre2)
 [1] 134
  length(Dcre3)
 [1] 133
 
 This shows that my code did NOT achieve the intended outcome.
 Please assist. Thanks.
  
 Lexi


Re: [R] selecting rows by maximum value of one variables in dataframe nested by another Variable

2012-06-27 Thread Petr PIKAL
Hi
 
 How could I select the rows of a dataset that have the maximum value in 
 one variable and to do this nested in another variable. It is a dataframe 
 in long format with repeated measures per subject. 
 I was not successful using aggregate, because one of the columns has 

You could do it with aggregate and then select the matching rows from 
your data frame, but it is a perfect example for powerful list operations:

> do.call(rbind, lapply(split(test, test$subject), function(x) 
+ x[which.max(x[,2]), ]))
  subject time.ms      V3
1       1      22 stringC
2       2      25 stringA

split divides the data frame test by the subject variable into a list of 
sub data frames.
The anonymous function finds the position of the maximum value in the second 
column of each sub data frame and selects the corresponding row.
do.call takes the list and rbinds it into one final data frame.
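
For comparison, the aggregate route mentioned at the start of this reply could be sketched as follows. The data frame is rebuilt from the rows shown in the question; mx and res are made-up names.

```r
test <- data.frame(
  subject = c(1, 1, 1, 2, 2, 2),
  time.ms = c(1, 12, 22, 1, 14, 25),
  V3      = c("stringA", "stringB", "stringC",
              "stringB", "stringC", "stringA"),
  stringsAsFactors = FALSE
)

# Step 1: maximum time.ms per subject (numeric column only, so the
# character column does not get in the way)
mx <- aggregate(time.ms ~ subject, data = test, FUN = max)

# Step 2: pull the complete rows that match those maxima
res <- merge(test, mx, by = c("subject", "time.ms"))
res
```

One difference from the split/lapply version: if a subject reaches its maximum in several rows, merge returns all of them, while which.max keeps only the first.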

Regards
Petr

 character values (and/or possibly because of another reason).
 I would like to transfer something like this: 
 subject  time.ms  V3 
 1        1        stringA
 1        12       stringB
 1        22       stringC
 2        1        stringB
 2        14       stringC
 2        25       stringA
 …. 
 To something like this: 
 subject  time.ms  V3
 1        22       stringC
 2        25       stringA
 … 
 
 Thank you very much for you help!
 Miriam


Re: [R] axis in r plot

2012-06-25 Thread Petr PIKAL
Hi

 
 I have the graph plotted with x axis (-50 to 250) and y axis (-50 to 500). I
 need the x axis values (-50 to 250) with spacing between two tick marks of
 1 or 0.1. The graph should be wider to get enough resolution.

For a spacing of 0.1 you would need some special display, as the screen 
would have to be at least 15000 pixels wide: 

> length(seq(-50,250,.1))
[1] 3001

I assume 4-5 pixels between ticks as the minimum spacing.

For a spacing of 1 it is slightly better, but your screen resolution 
would still have to be somewhere near 2000 pixels:
> length(seq(-50,250,1))
[1] 301

According to Wikipedia, a 27-30 inch HD display can probably cope with your 
request.
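
If a screen is too narrow, one way around it is to draw into a wide file instead. The sketch below is only an illustration: the 40 inch width, the 5 pixels per tick and the choice of labelling every tenth tick are all assumptions, and the plot itself is empty because the original data were not posted.

```r
ticks  <- seq(-50, 250, by = 1)
min_px <- length(ticks) * 5      # rough width needed at 5 pixels per tick

out <- tempfile(fileext = ".pdf")
pdf(out, width = 40, height = 6) # a wide page instead of a wide screen
plot(c(-50, 250), c(-50, 500), type = "n", xaxt = "n",
     xlab = "x", ylab = "y")
axis(1, at = seq(-50, 250, by = 10))             # labelled ticks every 10
axis(1, at = ticks, labels = FALSE, tcl = -0.2)  # short unlabelled ticks every 1
invisible(dev.off())
```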

Regards
Petr

 


Re: [R] Apply() on columns

2012-06-25 Thread Petr PIKAL
 
 I do now know how to navigate through the table.
 But my question is, what kind of graphical and numerical summaries can I
 make with keeping in mind that I am interested in the average working hour
 per employee?

Rather vague question without data or code, so rather vague answer.

For numerical values summarised according to some factor see

?aggregate
?by
?tapply

For graphical summary maybe

?dotplot
?boxplot
or maybe you could try package ggplot2
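
As no data were posted, here is only a sketch with an invented table (work, with columns employee and hours) showing what tapply and aggregate return for the average working hours per employee.

```r
# Hypothetical working-hours table
work <- data.frame(
  employee = c("A", "A", "B", "B", "B"),
  hours    = c(8, 6, 7, 9, 8),
  stringsAsFactors = FALSE
)

# Named vector of means, one element per employee
avg <- tapply(work$hours, work$employee, mean)

# The same summary as a data frame
agg <- aggregate(hours ~ employee, data = work, FUN = mean)

avg
agg
```

tapply is handy for a quick look, while the data frame from aggregate is easier to merge back or to plot.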

Regards
Petr

 


Re: [R] Novice question about getting data into R

2012-06-22 Thread Petr PIKAL
Hi
 
 Dear Petr, 
 
 Thank you very much for reply. You cannot read the Chinese characters may 

Yes, and I cannot install it either.

 be because you don't have install this language. Do you have any idea how 
 solve this problem? Or who can help me? May I install Linux? 

What problem? Reading Chinese characters?

I tried to search CRAN with 

read chinese character

and I got many answers you could go through. However, at first glance 
it seems to me that this issue is not trivial.

Regards
Petr


 
 Best regards, 
 Ms. Márcia Schmaltz 修安琪
 Departamento de Português / Department of Portuguese
 Faculdade de Ciências Sociais e Humanas / Faculty of Social Science and
 Humanities - UM 澳门大学社会科学及人文学院葡语系 
 http://www.umac.mo/fsh/ciela/staff/Marcia_Schmaltz.html (+853) 6231-2114 

 and 8397-8902

Re: [R] removing NA from a data frame

2012-06-22 Thread Petr PIKAL
Hi

Both na.omit and complete.cases work smoothly for me when NA is not a 
valid level of the factor.

If NA is a valid level, as seems to be the case here, you need to reset your 
factor levels so that NA is no longer a level:

ex10s$dg <- factor( ex10s$dg )

Both commands should work then.
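
A small sketch that reproduces the effect on invented data; addNA() is used here only to create a factor in which NA is a level, which is an assumption about how the dg column ended up in that state.

```r
# Hypothetical data frame with NA stored as a factor level
df <- data.frame(id = 1:4,
                 dg = addNA(factor(c("0", "1", NA, "1"))))

# NA is a level, so is.na() is FALSE everywhere and nothing is dropped
n_before <- nrow(na.omit(df))

# Re-creating the factor drops the NA level (default exclude = NA),
# turning those entries into real missing values
df$dg <- factor(df$dg)
n_after <- nrow(na.omit(df))

c(before = n_before, after = n_after)
```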

Regards
Petr


 
 Removing rows with NAs, using na.omit(), doesn't seem to be working for me.
 
 Dataset:
 
  str ( ex10s )
 
 'data.frame':   2189576 obs. of  5 variables:
 $ LOPNR  : int  58 58 58 58 64 64 64 64 64 64 ...
 $ DIAGNOS: Factor w/ 173 levels "F20","F200","F2000",..: 128 128 128 128 
 105 105 105 160 105 105 ...
 $ X_DATE : int  20060821 20061207 20080102 20090904 20010327 20010925 
 20020307 20021007 20021007 20030320 ...
 $ SOURCE : int  2 2 2 2 2 2 2 2 2 1 ...
 $ dg     : Factor w/ 7 levels "0","1","2","3",..: 6 6 6 6 5 5 5 6 5 5 ...
 
 The only NAs are in the factor dg (put in by 'recode' from the car 
 library; I'm trying to eliminate cases with particular factor levels)
 
  table ( ex10s$dg )
 
       0       1       2       3       4       5      NA
    2851  271501   63112   98425  335593 1257299  160795
 
 So, I remove the rows with NAs, to a new dataframe ex10ss:
 
  ex10ss <- na.omit(ex10s)
 
 Check all the NAs have been removed:
 
  table(ex10ss$dg)
 
       0       1       2       3       4       5      NA
    2851  271501   63112   98425  335593 1257299  160795
 
  dim(ex10s)
 [1] 2189576   5
  dim(ex10ss)
 [1] 2189576   5
 
 Nothing seems to have changed. I want all the rows with NA in removed.
 
 I am clearly doing something wrong.
 
 The only alternative I could find is pretty similar:
 use <- complete.cases ( ex10 )
 ex10ss <- ex10s[use,]
 which leads to the same result.
 
 
 Stuart
 
 
 Dr Stuart John Leask DM FRCPsych MB Mchir
 Clinical Senior Lecturer and Honorary Consultant Pychiatrist
 Institute of Mental Health, Innovation Park
 Triumph Road, Nottingham, Notts. NG7 2TU. UK
 Tel. +44 115 82 30419 stuart.le...@nottingham.ac.uk
 mailto:stuart.le...@nottingham.ac.uk
 Google 'Dr Stuart Leask'
 
 


Re: [R] convert 'character' vector containing mixed formats to 'Date'

2012-06-21 Thread Petr PIKAL
Hi

 
 Dear all
 I have a 'character' vector containing mixed formats (thanks Excel!)
 and I'd like to translate it into a default "%Y-%m-%d" Date vector.
 x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
        "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
        NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")
 
 
 In the above you will see that some dates are of format="%d/%m/%Y",
 others of format="%Y%m%d" and some NA values. Can you suggest a
 straight-forward way of transforming these to a uniform 'character' or
 'Date' vector? I tried to do the following, but it outputs very
 strange results:
 > x
  [1] "1/3/2005"   "13/04/2004" "2/5/2005"   "2/5/2005"   "7/5/2007"   "22/04/2004"
  [7] "21/04/2005" "20080430"   "13/05/2003" "20080529"   NA           NA
 [13] "19/05/1999" "17/05/2000" "17/05/2000"
 > sum(xa <- grepl('/', x))
 [1] 11
 > sum(xb <- grepl('200', substr(x, 1,4)))
 [1] 2
 > sum(xc <- is.na(x))
 [1] 2
 > x[xa] <- as.Date(x[xa], format="%d/%m/%Y")
 > x[xb] <- as.Date(x[xb], format="%Y%m%d")
 > x
  [1] "12843" "12521" "12905" "12905" "13640" "12530" "12894" "13999" "12185" "14028"
 [11] NA      NA      "10730" "11094" "11094"
 
 

You can use another as.Date with the origin specified. (ind is assumed here 
to mark the elements that contain "/".)

> ind <- grepl("/", x)
> as.Date(ifelse(ind, as.Date(x, format="%d/%m/%Y"), as.Date(x, 
+ format="%Y%m%d")), origin="1970-01-01")
 [1] "2005-03-01" "2004-04-13" "2005-05-02" "2005-05-02" "2007-05-07"
 [6] "2004-04-22" "2005-04-21" "2008-04-30" "2003-05-13" "2008-05-29"
[11] NA           NA           "1999-05-19" "2000-05-17" "2000-05-17"
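
An alternative sketch that avoids ifelse() altogether by starting from an empty Date vector and filling it in by format. The names slash, compact and out are made up, and telling the two formats apart by the presence of "/" is an assumption that holds for the example vector.

```r
x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
       "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
       NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")

slash   <- grepl("/", x)           # d/m/Y entries (FALSE for NA)
compact <- !slash & !is.na(x)      # Ymd entries

# An "empty" Date vector: all NA, but already of class Date
out <- rep(as.Date(NA), length(x))
out[slash]   <- as.Date(x[slash],   format = "%d/%m/%Y")
out[compact] <- as.Date(x[compact], format = "%Y%m%d")
out
```

Because the results are assigned into a Date vector rather than into the character vector x, they keep their class and do not turn into numbers stored as text.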
 
Regards
Petr

 
 The culprit is likely that the 'x' vector is 'character' throughout,
 but I'm not sure how to work around. For example, I couldn't figure
 how to create an empty 'Date' vector. Regards
 Liviu
 
 


Re: [R] Novice question about getting data into R

2012-06-21 Thread Petr PIKAL
Hi

I can read the example you provided without much problem.

dput(head(test))
structure(list(n = 0:5, X = c(NA, NA, NA, NA, NA, NA), start = c(11185L, 
39530L, 40544L, 109684L, 114629L, 118841L), X.1 = c(NA, NA, NA, 
NA, NA, NA), dur = c(1L, 2L, 1L, 1L, 0L, 1L), X.2 = c(NA, NA, 
NA, NA, NA, NA), pause = c(28344L, 1012L, 69139L, 4944L, 4212L, 
2558L), X.3 = c(NA, NA, NA, NA, NA, NA), par = c(0, 100, 100, 
100, 0, 100), X.4 = c(NA, NA, NA, NA, NA, NA), ins = c(2L, 3L, 
2L, 2L, 1L, 2L), X.5 = c(NA, NA, NA, NA, NA, NA), del = c(0L, 
0L, 0L, 0L, 0L, 0L), X.6 = c(NA, NA, NA, NA, NA, NA), sid = structure(c(10L, 
13L, 16L, 1L, 11L, 12L), .Label = c(" -1", " -1+11+13+15", " -1+110", 
" -1+16", " -1+26+29", " -1+27+30", " -1+32", " -1+4+5", " -1+48", 
" 1", " 17", " 18+19", " 2", " 20", " 28", " 3", " 36", " 37", 
" 38", " 42", " 43", " 45", " 49", " 50", " 53", " 54", " 58", 
" 59", " 61+64"), class = "factor"), X.7 = c(NA, NA, NA, NA, 
NA, NA), tid = structure(c(1L, 6L, 20L, 30L, 38L, 39L), .Label = c(" 1", 
" 10+11+12", " 13+14", " 15+16+17", " 18+19", " 2+3", " 20", 
" 21", " 22", " 23", " 24+25", " 26", " 27+28+29", " 30+31+32", 
" 33+34", " 35", " 36+37", " 38", " 39", " 4", " 40", " 41", 
" 42", " 43", " 44+45", " 46", " 47", " 48", " 49", " 5", " 50", 
" 51", " 52+93", " 53", " 54", " 55", " 56", " 6", " 7", " 8", 
" 9"), class = "factor"), X.8 = c(NA, NA, NA, NA, NA, NA), str = structure(c(5L, 
6L, 5L, 5L, 4L, 5L), .Label = c(" ,", " ,_", " .", " ・", " ・・", 
" ・・・", " ・・・.", " ", " ・"), class = "factor")), .Names = c("n", 
"X", "start", "X.1", "dur", "X.2", "pause", "X.3", "par", "X.4", 
"ins", "X.5", "del", "X.6", "sid", "X.7", "tid", "X.8", "str"
), row.names = c(NA, 6L), class = "data.frame")

Only Chinese characters are missing and some extra columns appear

 str(test)
'data.frame':   41 obs. of  19 variables:
 $ n: int  0 1 2 3 4 5 6 7 8 9 ...
 $ X: logi  NA NA NA NA NA NA ...
 $ start: int  11185 39530 40544 109684 114629 118841 121400 128201 129793 
131852 ...
 $ X.1  : logi  NA NA NA NA NA NA ...
 $ dur  : int  1 2 1 1 0 1 1 1 436 608 ...
 $ X.2  : logi  NA NA NA NA NA NA ...
 $ pause: int  28344 1012 69139 4944 4212 2558 6800 1591 1623 3573 ...
 $ X.3  : logi  NA NA NA NA NA NA ...
 $ par  : num  0 100 100 100 0 100 100 100 0 100 ...
 $ X.4  : logi  NA NA NA NA NA NA ...
 $ ins  : int  2 3 2 2 1 2 2 2 3 3 ...
 $ X.5  : logi  NA NA NA NA NA NA ...
 $ del  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ X.6  : logi  NA NA NA NA NA NA ...
 $ sid  : Factor w/ 29 levels  -1, -1+11+13+15,..: 10 13 16 1 11 12 1 
1 2 4 ...
 $ X.7  : logi  NA NA NA NA NA NA ...
 $ tid  : Factor w/ 41 levels  1, 10+11+12,..: 1 6 20 30 38 39 40 41 2 
3 ...
 $ X.8  : logi  NA NA NA NA NA NA ...
 $ str  : Factor w/ 9 levels  ,, ,_, .,..: 5 6 5 5 4 5 5 5 6 6 ...
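
One way such all-NA columns (X, X.1, ...) can arise is when the fields are separated by two tabs in a row, and count.fields() can help locate lines like the "line 38" in the error message. A minimal sketch on a made-up three-column sample; the file contents and the names f, nf, bad and clean are assumptions, not the poster's data:

```r
# Made-up sample: fields separated by two tabs, one short line (line 3)
txt <- c("n\t\tstart\t\tdur",
         "0\t\t11185\t\t1",
         "1\t\t39530",
         "2\t\t40544\t\t1")
f <- tempfile(fileext = ".txt")
writeLines(txt, f)

# Number of fields on each line; lines that differ from the header are suspects
nf  <- count.fields(f, sep = "\t")
bad <- which(nf != nf[1])

# fill = TRUE pads short lines with NA instead of stopping with an error
test <- read.delim(f, fill = TRUE)

# Keep only the columns that contain at least one non-NA value
clean <- test[, colSums(!is.na(test)) > 0, drop = FALSE]
names(clean)
```

Whether padding short lines is acceptable depends on why they are short, so it is worth looking at the lines listed in bad before trusting the result.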

 sessionInfo()
R Under development (unstable) (2012-03-03 r58569)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Czech_Czech Republic.1250  LC_CTYPE=Czech_Czech 
Republic.1250 
[3] LC_MONETARY=Czech_Czech Republic.1250 LC_NUMERIC=C  
[5] LC_TIME=Czech_Czech Republic.1250 

Regards
Petr

 Dear Professor Dalgaard,
 
 I am beginning to take part in a research project on statistical modelling
 of translators' activity data. I recently installed R and tried to generate
 a Translation Progress Graph, as my colleagues do (with success), but on my
 Windows platform I got the error below. According to the R FAQ it seems to
 be a very common error, but as I am not familiar with R or with ProGra,
 could you help me, please?
 
 Note: the Translation Progress Graph is composed of the quintuple of data
 {S, T, A, F, K} for Source and Target Text, Alignment, Fixation and Keyboard
 data, respectively.
 
 
 ReadData("C:/Users/schmaltz/Dropbox/EN-CH/proGra/EN-ZH_P2_T4_T2")
 Reading Fixation Units:
 C:/Users/schmaltz/Dropbox/EN-CH/proGra/EN-ZH_P2_T4_T2 .fu
 Reading Production Units:
 C:/Users/schmaltz/Dropbox/EN-CH/proGra/EN-ZH_P2_T4_T2 .pu
 Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
 line 38 did not have 10 elements
 
 Note: We tried deleting line 38, and the program then reports an error on
 another line. Even after deleting all the following lines, the same error
 occurs. I do not think it is an encoding error caused by the fact that my
 colleague uses Linux and I use Windows.
 
 Sample of file above:
 n   start   dur   pause   par   ins   del   sid   tid   str
 0   11185   1   28344   0   2   0   1   1   尽管
 1   39530   2   1012   100.00   3   0   2   2+3   发展中
 2   40544   1   69139   100.00   2   0   3   4   国家
 3   109684   1   4944   100.00   2   0   -1   5   关于
 4   114629   0   4212   0   1   0   17   6   为
 5   118841   1   2558   100.00   2   0   18+19   7   贫困
 6   121400   1   6800   100.00   2   0   -1   8   人民
 7   128201   1   1591   100.00   2   0   -1   9   争取
 8   129793   436   1623   0   3   0   -1+11+13+15   10+11+12   更好的
 9   131852   608   3573   100.00   3   0   -1+16   13+14   生活的
 10   136033   1202   1309   100.00   5   0   -1+4+5   15+16+17   说辞是
可以
 11   138544   468   3682   100.00   3   0   -1   18+19   理解的
 12   142694   359   10811   0   2   0   20   20   ,_
 13   153864   0   2121   0   1   0   -1   21   但
 14   155985   1   

Re: [R] Problem with predict?

2012-06-20 Thread Petr PIKAL
Hi

 
 Hello,
 
 I am trying to fit a model to some death over time data that does not 
 fit the criteria for the usual LD50 type models (the counts are too 
 large). I am using a simple linear model in an attempt to plot a nice 
line
 on a scatter plot and calculate some LD values to use in designing an 
 experiment. Here is the basic idea of what I'm doing:
 
 
 head(mort)
 
 Time Density
    0      22
    0      21
    0      19
    5      19
    5      14
    5       9
 
 
 plot(Density~Time)
 
 This plots something that looks a lot like a decay rate
 
 mod <- lm(log(Density)~Time)
 
 xv <- seq(0,60,0.1)
 yv <- exp(predict(mod,list(time=xv)))

From the help page (?predict.lm):

Usage
## S3 method for class 'lm'
predict(object, newdata, se.fit = FALSE, scale = NULL, df = Inf,
        interval = c("none", "confidence", "prediction"),
        level = 0.95, type = c("response", "terms"),
        terms = NULL, na.action = na.pass,
        pred.var = res.var/weights, weights = 1, ...)

Arguments
object   Object of class inheriting from "lm"

newdata  An optional data frame in which to look for variables with which
         to predict. If omitted, the fitted values are used.

The variable names in newdata must match those in the model formula, so
use Time, not time:

yv <- exp(predict(mod, data.frame(Time = xv)))

should work.
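
The newdata rule quoted from the help page can be checked on a small made-up data set. This is only a sketch: the Density values below are invented (loosely following the head(mort) output), and the model is fitted with an explicit data argument so that the example is self-contained.

```r
# Invented mortality data: Time in steps of 5, Density decaying
mort <- data.frame(Time    = rep(c(0, 5, 10, 15), each = 3),
                   Density = c(22, 21, 19, 19, 14, 9, 8, 7, 5, 4, 3, 2))

mod <- lm(log(Density) ~ Time, data = mort)

# newdata must be a data frame whose column is named like the predictor (Time)
xv <- seq(0, 60, 0.1)
yv <- exp(predict(mod, newdata = data.frame(Time = xv)))

# One prediction per element of xv, so lines(xv, yv) has matching lengths
length(xv) == length(yv)
```

If the column in newdata carries a different name (for example time), predict() cannot use it for the predictor, which is how the length mismatch in the question can come about.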

Regards
Petr

 lines(xv,yv)
 
 Everything seems to work fine until I try to plot the lines, but then I 
 get the error message:
 
 Error in xy.coords(x, y) : 'x' and 'y' lengths differ
 
 Checking the lengths of x and y confirms that somehow the object yv is 
not
 using xv to predict the data, and is only predicting as many data points 

 as there are rows in the data frame.
 
 Any ideas on why this might be would be very much appreciated.
 
 Thank you,
 
 Julie
 
 
 
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Apply() on columns

2012-06-20 Thread Petr PIKAL
Hi
 
 Hi,
 
 Yes, the columns are related: V1 is related to V6, V2 is related to V7 and
 so on. Columns V1, V2, V3, V4, V5 contain the number of employees (in a
 filling team). Columns V6, V7, V8, V9, V10 contain the number of hours
 worked by the filling team.

You should include your data frame in the email using dput() instead of just
vaguely describing it. If it is too big, include only part of it.


 What I am interested in is the average working hours per employee.
 Therefore I have to give some graphical and numerical summaries of the
 average working hours per employee. The function that I must use is apply().

Why do you need to use apply? Is it homework? This list has a no-homework
policy.

I would use some kind of ?aggregate construction.

Regards
Petr

 
 How can I use that function to present a graphical and numerical summary of
 the average working hours per employee? [An example and explanation would
 be appreciated.]
 
 Yours,
 
 FA Elsendoorn
 


Re: [R] can not read a table

2012-06-20 Thread Petr PIKAL
Hi

 
 I have a table like the following: 
 
 TABLE NO.  1 
 ID   TIME
 1325   0
 1325   0
 .   .
 .   .
 .   .
 TABLE NO.  1 
 ID   TIME
 1325   0
 1325   0
 .   .
 .   .
 .   .
 TABLE NO.  1 
 ID   TIME
 1325   0
 1325   0
 .   .
 .   .
 .   .
 TABLE NO.  1 
 ID   TIME
 1325   0
 1325   0
 .   .
 .   .
 .   .
 
 I used the following code:
 sim <- read.table("sim.tab", skip=1, as.is=T, header=T)

I would read the file with readLines, get rid of the lines containing the
TABLE and ID strings, and split the values into 2 columns.
Probably not the best approach, but something like

test <- readLines("clipboard")
test <- test[-grep("TABLE", test)]
test <- test[-grep("ID", test)]
t(sapply(strsplit(test, "\t\t"), as.numeric))
     [,1] [,2]
[1,] 1325    0
[2,] 1325    0
[3,] 1325    0
[4,] 1325    0

can work.
You can easily convert it to a data frame.
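
A variant of the same idea that goes straight to a data frame might look like this. It is a sketch on a made-up character vector standing in for the file (the object names lines, keep and sim are illustrative), and it assumes the data rows are whitespace-separated.

```r
# Made-up stand-in for the file: header lines repeat between data blocks
lines <- c("TABLE NO.  1", "ID   TIME", "1325   0", "1325   0",
           "TABLE NO.  1", "ID   TIME", "1325   0", "1325   0")

# grepl() gives a logical mask of the header lines to drop
keep <- !grepl("TABLE|ID", lines)

# Parse the remaining lines as a whitespace-separated table
sim <- read.table(text = lines[keep], col.names = c("ID", "TIME"))
str(sim)
```

The logical mask is used on purpose: x[-grep(pattern, x)] returns an empty vector when nothing matches, whereas x[!grepl(pattern, x)] then simply keeps everything.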

Regards
Petr


 It did not work, as there are rows with characters in between the data.
 Can anyone help me read the table while getting rid of the character rows
 in between the data?
 Thanks,
 


Re: [R] “For” calculation is so slow

2012-05-23 Thread Petr PIKAL
 
 I'm not sure what you are trying to prove with that example - the 
loopless
 versions are massively faster, no?

In some languages loops are an integral part of programming habits. In R you
can do many things with whole objects without looping (the vectorisation
approach). See The R Inferno by Patrick Burns, Circle 3.

 
 I don't disagree that loops are sometimes unavoidable, and I suppose
 sometimes loops can be faster when the non-loop version e.g. breaks your

Hm. I am not sure what it has to do with memory budget.

 memory budget, or performs tons of needless computations. But I think
 avoiding for loops whenever you can is a good rule of thumb in R coding.

I constantly use loops when I create pictures of data frame values.

This

library(ggplot2)
pdf("konc.pdf", 8, 8, useDingbats=F)
for (i in columns) {
p <- ggplot(df.name, aes(x=x.value, y=df.name[,i], colour=other.column))
print(p+geom_smooth(method=lm)+geom_point(aes(shape=some.factor, 
size=5))+scale_y_continuous(names(df.name)[i]))
}
dev.off()

can easily create a pdf file with many plots of selected columns against one
column. I know that I could use lapply, but the loop seems clean and
efficient to me, even when I plot several hundred graphs.

Regards
Petr

 


Re: [R] storing output of a loop in a matrix

2012-05-23 Thread Petr PIKAL
Hi

What is your intention?

Basically, one output column can be made by

cumsum(bd)

Then you can shuffle bd by, let's say,

bd <- sample(bd, 64)

and repeat cumsum for the new bd.

bd <- scan("blap.txt")
output <- matrix(0,64,10)

for (i in 1:10) {
bd <- sample(bd, 64)
cs <- cumsum(bd)
output[,i] <- cs
}

If you insist, you can reorder the resulting output.
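
The shuffle-and-cumsum loop can also be written without an explicit index. A sketch, assuming a made-up vector in place of blap.txt (the values are invented and set.seed is only there for reproducibility):

```r
set.seed(1)                      # only to make the sketch reproducible
bd <- sample(1:100, 64)          # made-up stand-in for scan("blap.txt")

# Each call shuffles bd and returns its running sum; replicate() binds
# the 10 results column by column into a 64 x 10 matrix
output <- replicate(10, cumsum(sample(bd)))

dim(output)
```

Every column ends in sum(bd), because each one is the running sum of the same 64 numbers in a different order.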

Regards
Petr

 
 blap.txt is a numeric vector of length 64.
 
 I am using the following code:
 
 
 bd <- scan("blap.txt")
 output <- matrix(0,64,10)
 s <- sum(bd)
 for (i in 10){
 
 while (s > 0)
 {x1 <- sample(bd,(length(bd)-1), replace=F)
 s <- sum(x1)
 bd <- x1
 output[i] <- s
 
 }
 
 
 }
 write.table(output, file="res.txt")
 
 This code is not doing what I'd like it to do:
 
 1. It is not running through the while loop 10 times, just once.
 2. I can't work out how to get the results into the output matrix. The
 matrix should be 10 columns with the decreasing s values for each of the
 10 runs through the for loop.
 
 Sorry for any obvious mistakes. I'm very new to R and teaching myself.
 Where am I going wrong here, please?
 
 Thank you.
 
 


Re: [R] “For” calculation is so slow

2012-05-22 Thread Petr PIKAL
Hi

 
 Dear All,
 
 The function I wrote can run well with the small data, but with the 
large
 data, the function runs very very slowly. How can I correct it? Thank 
you

Your function does not run slowly, it does not run at all.

> l(10,20)
Error in l(10, 20) : object 's' not found

No s object is found. I presume that gx and gy will not be found either.

Besides, I am pretty sure that you do not need any loop at
all.

 very much. My function as below:
 
 a <- c(1:240)
 b <- c(1:240)
 l=function(a,b){
 v=0
 u=0
 uv=0
 v[1]=0

Why?

> v <- 0
> v
[1] 0
> v[1] <- 0
> v
[1] 0

So there is no difference between the first v and the second v.


 u[1]=0
 uv[1]=0
 for (i in 1:(length(s)-1)){
 
 v[i] <- ((gx[[i]][b,(gx[[i]][a,1]+1)])-(gx[[i]][a,gx[[i]][a,1]+1]))/(gx[[i]]
 [a,gx[[i]][a,1]+1])
 
 u[i] <- ((gy[[i]][a,(gy[[i]][a,1]+1)])-(gy[[i]][b,gy[[i]][a,1]+1]))/(gy[[i]]
 [a,gy[[i]][a,1]+1])
 uv[i] <- v[i]+u[i]
 }
 w=0
 w=mean(uv)

Your function does not visibly return anything: its last statement is an
assignment, so the value comes back invisibly. You can test it yourself
on a shortened version

l=function(a,b){
v=0
u=0
uv=0
v <- 10
u <- 20
uv <- v+u
w=0
w=mean(uv)
}
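
What happens with the shortened version can be checked directly. A sketch; l2 is a hypothetical variant added here for contrast and is not part of the original function:

```r
l <- function(a, b) {
  v  <- 10
  u  <- 20
  uv <- v + u
  w = mean(uv)      # last statement is an assignment: value returned invisibly
}

l(1, 2)                       # prints nothing at the console
res <- l(1, 2)                # but the value can still be captured
res
withVisible(l(1, 2))$visible  # FALSE

l2 <- function(a, b) {
  uv <- 10 + 20
  mean(uv)          # last statement is a plain expression: value is visible
}
l2(1, 2)
```

Ending the function with the expression itself (or with an explicit return(w)) makes the result show up when the function is called at the prompt.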

Regards
Petr





 }
 kk <- data.frame()
 for (a in 1:240){
 for (b in 1:240){
 if (ab)
 kk[a,b]=l(a,b)
 else kk[a,b]=0
 }}
 max(kk)
 
 


Re: [R] “For” calculation is so slow

2012-05-22 Thread Petr PIKAL
Hi

 
 For loops are really, really slow in R. In general, you want to avoid 
them

I strongly disagree. ***Proper*** use of looping is quite convenient and 
reasonably fast.

Consider

> system.time( {
+ a=0
+ for (i in 1:1000) {
+ a <- a+i
+ }
+ a
+ })
   user  system elapsed 
  10.22    0.02   10.28 
> 
> system.time(b <- sum(as.numeric(1:1000)))
   user  system elapsed 
   0.09    0.01    0.11 
> identical(a,b)
[1] TRUE

It is usually carrying C habits over into R code that makes looping slow.
The slowest part of a program is usually the programming, and that is
influenced mainly by the programmer.

Regards
Petr

 like the plague. If you absolutely must insist on using them in large,
 computationally intense and complex code, consider implementing the 
relevant
 parts in C, say, and calling that from R.
 
 Staying within R, you can probably considerably speed up that code by
 storing gx and gy as a multi-dimensional arrays. (e.g. for sample data,
 something like
 
 rawGy = sample( 1:240, 240^2* 241, replace = T)
 rawGx = sample( 1:240, 240^2 *241, replace = T)
 gx = array(rawGx, dim = c(length(s) - 1, 240,  max(rawGx)+1  ) )
 gy = array(rawGy, dim = c(length(s) - 1, 240,  max(rawGy)+1 ) )
 
  ), in which case, you can easily do the computation without loops by
 
 gxa = (gx[ ,a,1]+ 1)
 gya =(gy[ ,a, 1] +1)
 uv = gx[cbind(1:(length(s) - 1) , b, gxa)] / gx[cbind(1:(length(s) - 1) 
, a,
 gxa)]  - gy[cbind(1:(length(s) - 1) ,b, gya)]/gy[cbind(1:(length(s) - 1) 
,a,
 gya)]
 
 or similar, which will be enormously faster (on my computer, there's an 
over
 30x speed up). With a bit of thought, I'm sure you can also figure out 
how
 to let it vectorise in a, as well...
 
 Zhou
 


Re: [R] “For” calculation is so slow

2012-05-22 Thread Petr PIKAL
And as a follow-up

> system.time(d <- 1000*1001/2)
   user  system elapsed 
   0.02    0.00    0.02 
> identical(a,b,d)
[1] TRUE
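
The three ways of getting the same sum can be put side by side in one small script. This is a sketch with n = 1000 chosen for illustration; the timings are left out because they depend on the machine. Note that identical() compares two objects, and a third positional argument is taken as its num.eq option, so the comparison below is done pairwise.

```r
n <- 1000

# 1. Explicit loop
a <- 0
for (i in 1:n) {
  a <- a + i
}

# 2. Vectorised sum
b <- sum(as.numeric(1:n))

# 3. Closed form n(n+1)/2
d <- n * (n + 1) / 2

identical(a, b) && identical(b, d)
```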


Regards
Petr

 


Re: [R] ReName

2012-05-22 Thread Petr PIKAL
Hi

 
 On May 22, 2012, at 4:08 AM, HAOLONG HOU wrote:
 
  Dear list,
 
  The name of R-language is too short and is not friendly to search 
  engines.

Did you try it? Even with plain Google,

"something R" will usually lead to quite good hits. Try e.g.

linear model R

Regards
Petr

  Do you think it can be renamed to something like Rsio or Radio ?
  Thank you so much for this useful software!
 
 The notion of renaming R to 'radio' seems at best humorous. Try 
 searching with:
 
 r-project
 CRAN
 
 Or use:Rseek.org
 
 Or use:allinurl:r-project in a google search
 
 Or: http://code.google.com/hosting/search?q=label:R+
 
 At one point I used `language:r` but I can find no support in the 
 google search documentation that currently supports that strategy.
 
 -- 
 
 David Winsemius, MD
 West Hartford, CT
 


Re: [R] for loop, error in model frame.default ... variable lengths differ

2012-05-21 Thread Petr PIKAL
Hi

You did not provide data but I can see some problems in your code. See 
inline.
 
 I'm failing to get a for loop working.  I'm sure it's something simple, 
and I
 have found some posts relating to it, but I'm just not understanding why
 this isn't working. 
 
 I have a data frame and would like to loop through specific column 
names,
 using aggregate() within a for loop.  There are NA's scattered 
throughout
 the data frame and I'm thinking it has something to do with that, but I
 haven't been able to fix it.
 
 vars <- colnames(df)[c(10,12,16,18,20,21,24:29,45)]
  for(i in 1:length(vars)) {

So i takes the values from 1 to the length of vars.

 aggregate(colnames(df)[i] ~ x1 + x2 + x3, df, mean,

and you select the variables from df[,1] to df[, length(vars)], which is
probably not what you want.
What are x1 to x3? Are they variables in df?

 na.action=na.exclude)

For mean, the correct argument is na.rm=TRUE.

 }
 
 I get this error: 
 Error in model.frame.default(formula = colnames(df)[i] ~ x1 + x2 +   : 
   variable lengths differ (found for 'x1')

Maybe x1 has a different length than df. What do length(x1) and dim(df)
tell you?
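
One more thing that may be worth checking: colnames(df)[i] on the left of ~ is a single character string rather than the column it names, which would also give lengths that differ. A sketch of building the formula from the column name instead, on made-up data (df, vars and the grouping columns x1, x2 below are invented for illustration):

```r
# Made-up data: two grouping columns and two numeric columns, one with an NA
df <- data.frame(x1 = rep(c("a", "b"), each = 4),
                 x2 = rep(c("u", "v"), times = 4),
                 y1 = c(1:7, NA),
                 y2 = 8:1)
vars <- c("y1", "y2")

res <- list()
for (v in vars) {
  # Build the formula  y1 ~ x1 + x2  from the column name held in v
  f <- reformulate(c("x1", "x2"), response = v)
  # The formula method drops rows with NA by default (na.action = na.omit)
  res[[v]] <- aggregate(f, data = df, FUN = mean)
}
res$y1
```

Looping over the names in vars rather than over 1:length(vars) also avoids picking the first few columns of df by mistake.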

Regards
Petr

 
 There are probably much better ways to do this, and I would be happy to 
get
 suggestions, but mostly I would like to know why the code isn't working.
 
 Thanks-
 Peter
 


Re: [R] Replace a variable by its value

2012-05-21 Thread Petr PIKAL
Hi

 
 I have a dataset called raw_data. I am trying to use the following code:
 
 
 col_name <- names(raw_data)
 for (i in 1:(length(names(raw_data))-2))
 {
   tbl=table(raw_data$Pay.Late.Dummy, raw_data$col_name[i])
 
   chisqtest <- chisq.test(tbl)
 }
 
 
 Say the 1st column of my raw_data is Column1. The idea is when i=1 then
 raw_data$col_name[i] will automatically become raw_data$Column1 , which 
is

Why do you think so? raw_data$col_name[i] is most probably one value.

Maybe you want

tbl=table(raw_data$Pay.Late.Dummy, raw_data[,i])
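
Selecting the column by its name works as well, with [[ ]] instead of $. A sketch on made-up data (the columns Column1 and Column2 and their values are invented); it also keeps every test result in a list instead of overwriting chisqtest on each pass:

```r
set.seed(1)   # only to make the made-up data reproducible
raw_data <- data.frame(
  Column1        = sample(c("a", "b"), 200, replace = TRUE),
  Column2        = sample(c("x", "y", "z"), 200, replace = TRUE),
  Pay.Late.Dummy = sample(0:1, 200, replace = TRUE)
)

# raw_data$col_name looks for a column literally called "col_name"
is.null(raw_data$col_name)

col_name <- setdiff(names(raw_data), "Pay.Late.Dummy")
tests <- list()
for (nm in col_name) {
  # [[nm]] uses the value stored in nm as the column name
  tbl <- table(raw_data$Pay.Late.Dummy, raw_data[[nm]])
  tests[[nm]] <- chisq.test(tbl)
}
names(tests)
```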

Regards
Petr


 not happening. Kindly help?
 


Re: [R] removing only rows/columns with na value from square ( symmetrical ) matrix.

2012-05-21 Thread Petr PIKAL
Hi

You can do it by hand and remove the row/column with the maximum number of
NA values.

rem <- which.max(colSums(is.na(M)))
M1 <- M[-rem, -rem]
rem <- which.max(colSums(is.na(M1)))
M2 <- M1[-rem, -rem]
M2
     1   2  3   4  5   7   8  10  11  12
1    0 143 92 134 42 123  40 107  49  93
2  143   0 77   6 99  46  47 114 138  82
3   92  77  0   2 89  24  62  59  97  52
4  134   6  2   0 71  23  43  80  35  86
5   42  99 89  71  0  68  95  27  55  14
7  123  46 24  23 68   0 124  18  53 101
8   40  47 62  43 95 124   0 126  11 129
10 107 114 59  80 27  18 126   0  31  13
11  49 138 97  35 55  53  11  31   0  75
12  93  82 52  86 14 101 129  13  75   0

I believe this can be turned into a loop that stops as soon as no NA
values are left, and is not entered at all if there are none to begin
with.
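
Such a loop could look like the sketch below. The function name drop_na_pairs is made up for illustration, and the test matrix follows the construction used in the question; set.seed is only there for reproducibility.

```r
# Repeatedly drop the row/column pair with the most NA values
drop_na_pairs <- function(M) {
  while (anyNA(M)) {                       # not entered if there is no NA
    rem <- which.max(colSums(is.na(M)))    # ties go to the first column
    M <- M[-rem, -rem, drop = FALSE]
  }
  M
}

set.seed(42)
M <- matrix(sample(144), 12, 12)
M[10, 9] <- NA
M <- as.matrix(as.dist(M))                 # symmetric, NA at [10,9] and [9,10]
M[6, ] <- NA
M[, 6] <- NA

res <- drop_na_pairs(M)
rownames(res)
```

Because which.max() returns the first maximum, the tie between row/column 9 and 10 is resolved in favour of dropping 9. That is an arbitrary choice, as was pointed out earlier in the thread.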

Regards
Petr

 Yes, the matrix is symmetric.
 Gabor provided a partial solution:
 Try this:
 
 ix <- na.action(na.omit(replace(M, upper.tri(M), 0)))
 M[-ix, -ix]
 
 However, this removes all rows containing an NA in the lower half of the
 matrix, even if the corresponding column has also been removed.
 
 I have revised the example to show this.
 
 Thanks all for your help.
 
 In the case below I would like to retain rows and columns
 [c(1:5,7,8,10:12),c(1:5,7,8,10:12)]
 M <- matrix(sample(144),12,12)
 M[10,9] <- NA
 M <- as.matrix(as.dist(M))
 N=M
 #the above rows are to create the symmetric matrix M and a copy N
 M[6,] <- NA
 M[,6] <- NA
 #above two rows - make corresponding row and column NA
 print (M)
 ix <- na.action(na.omit(replace(M, upper.tri(M), 0)))
 M <- M[-ix, -ix]
 print (M)
 
 print ("however what I would like to retain is the maximum amount of data while removing rows or columns containing NA  ie:")
 print(N [c(1:5,7,8,10:12),c(1:5,7,8,10:12)])
 
 -- 
 Statistics  Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 
 thanks to all
 On 21/05/2012, at 1:10 AM, peter dalgaard wrote:
 
  
  On May 20, 2012, at 16:37 , Bert Gunter wrote:
  
  Your problem is not well-defined. In your example below, why not
  remove rows 1,2,6, and 10, all of which contain NA's? Is the matrix
  supposed to be symmetric?
 YES
 
  Do NA's always occur symmetrically?
 YES
  
  ...and even if they do, how do you decide whether to remove row/col 9 
or
 row/col 10 in the example? (Or, for that matter, between (1 and 2) and 
6. 
 In that case you might chose to remove the smallest no. of row/cols but 
in
 9 vs. 10, the situation is completely symmetric.) 
  
  
  You either need to rethink what you want to do or clarify your 
statement of it.
  
  -- Bert
  
  On Sun, May 20, 2012 at 7:17 AM, Nevil Amos nevil.a...@monash.edu 
wrote:
  I have some square matrices with na values in corresponding rows and
  columns.
  
  M <- matrix(1:2,10,10)
  M[6,1:2] <- NA
  M[10,9] <- NA
  M <- as.matrix(as.dist(M))
  print (M)
  
      1  2 3 4 5  6 7 8  9 10
  1   0  2 1 2 1 NA 1 2  1  2
  2   2  0 1 2 1 NA 1 2  1  2
  3   1  1 0 2 1  2 1 2  1  2
  4   2  2 2 0 1  2 1 2  1  2
  5   1  1 1 1 0  2 1 2  1  2
  6  NA NA 2 2 2  0 1 2  1  2
  7   1  1 1 1 1  1 0 2  1  2
  8   2  2 2 2 2  2 2 0  1  2
  9   1  1 1 1 1  1 1 1  0 NA
  10  2  2 2 2 2  2 2 2 NA  0
  
  
  How do I remove just the row/column pair( in this trivial example 
row 6 and
  10 and column 6 and 10) containing the NA values?
  
  so that I end up with all rows/ columns that are not NA - e.g.
  
  1 2 3 4 5 7 8 9
  1 0 2 1 2 1 1 2 1
  2 2 0 1 2 1 1 2 1
  3 1 1 0 2 1 1 2 1
  4 2 2 2 0 1 1 2 1
  5 1 1 1 1 0 1 2 1
  7 1 1 1 1 1 0 2 1
  8 2 2 2 2 2 2 0 1
  9 1 1 1 1 1 1 1 0
  
  
  if i use na omit I lose rows 1,2,6, and 9
  which is not what I want.
  
  thanks
  --
  Nevil Amos
  Molecular Ecology Research Group
  Australian Centre for Biodiversity
  Monash University
  CLAYTON VIC 3800
  Australia
  
  
  
  
  -- 
  
  Bert Gunter
  Genentech Nonclinical Biostatistics
  
  Internal Contact Info:
  Phone: 467-7374
  Website:
  
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-
 biostatistics/pdb-ncb-home.htm
  
  
  -- 
  Peter Dalgaard, Professor,
  Center for Statistics, Copenhagen Business School
  Solbjerg Plads 3, 2000 Frederiksberg, Denmark
  Phone: (+45)38153501
  Email: pd@cbs.dk  Priv: pda...@gmail.com
  
  
  
  
  
  
  
  
 
 

Re: [R] Add column from other columns data.

2012-05-17 Thread Petr PIKAL
 Something along the lines of 
 
 dat2 <- ifelse(dat1 == 1, "yes", "no")

Another option in this case is

dat2 <- c("no", "yes")[dat1 + 1]
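
Both options can be tried on the column from the question. A sketch; the data frame dat and the names by_ifelse and by_index are made up here, and the labels are written as Yes/No as the original poster asked:

```r
Fulfilled <- c(1, 1, 0, 1, 1, 1, 1, 0, 0, 1)

# Option 1: test each value
by_ifelse <- ifelse(Fulfilled == 1, "Yes", "No")

# Option 2: use the 0/1 values (plus one) as an index into the labels
by_index <- c("No", "Yes")[Fulfilled + 1]

identical(by_ifelse, by_index)

dat <- data.frame(Fulfilled, Finished = by_index)
head(dat, 3)
```

The indexing version relies on the column holding only 0 and 1; with any other coding, ifelse() is the safer of the two.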

Regards
Petr

 
 should do it.
 
 John Kane
 Kingston ON Canada
 
 
  -Original Message-
  From: s1010...@student.hsleiden.nl
  Sent: Mon, 14 May 2012 05:45:38 -0700 (PDT)
  To: r-help@r-project.org
  Subject: [R] Add column from other columns data.
  
  Hi everyone,
  
  I am having some problems with making a new column with data in it.
  I have this one column named: Fulfilled
  
  Fulfilled
  1
  1
  0
  1
  1
  1
  1
  0
  0
  1
  
  And now I would like to add another column to my .csv file (Finished)
  
  In this Finished column I would like to have Yes or No.
  Where column Fulfilled has a 1, Finished should have a Yes.
  Like this:
  
  Fulfilled Finished
  1 Yes
  1 Yes
  0 No
  etc
  
  Now I know how to grab the data out of a column, and also know how to
  save data inside a .csv file.
  That is no problem.
  But how do I get the right Yes or No in the right place in the other
  column?
  
  # Get all values: 1
  Fulfilled_1 = Fulfilled[Fulfilled == 1]
  
  I was thinking about subset.
  But I don't really know if that would be it.
  
  Maybe somebody here can push me a little in the right direction?
  
  


Re: [R] max value

2012-05-17 Thread Petr PIKAL
Hi

 
 On 2012-05-15 08:36, Melissa Rosenkranz wrote:
  Here is an R problem I am struggling with:
  My dataset is organized like this...
 
  subject   session   variable_x   variable_y
  01        1         1            integer values
  01        1         2
  01        1         3
  01        2         1
  01        2         2
  01        2         3
  02        1         1
  02        1         2
  02        1         3
  02        2         1
  02        2         2
  02        2         3
  03        1         1
  03        1         2
  03        1         3
  03        2         1
  03        2         2
  03        2         3
  ...
 
  I need to find the level of variable x at which variable y has the 
maximum
  value for each individual for each session. Then, I need to create 
another
  variable, say variable z that labels that row in the dataset as the 
max
  for that individual at that time. I have searched the archives and the 
web
  for ideas, but am having trouble finding appropriate search terms for 
what
  I need to do. Any advice? Thank you!!
 
 
 This is one way:

Here is another

d$z <- ave(d$y, d$subject, d$session, FUN=function(x) x==max(x))
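
A minimal self-contained sketch of the ave() approach, assuming the column layout from the question. The toy values for y are made up for illustration, and ties within a group are all flagged.

```r
# Toy data in the same layout as the thread (y values are invented).
d <- data.frame(
  subject = rep(c("01", "02"), each = 6),
  session = rep(rep(1:2, each = 3), times = 2),
  x       = rep(1:3, times = 4),
  y       = c(12, 15, 11, 14, 13, 14, 11, 11, 13, 15, 12, 10)
)

# Flag rows where y equals the maximum of its subject/session group.
d$z <- ave(d$y, d$subject, d$session,
           FUN = function(v) as.numeric(v == max(v)))

# Level(s) of x at which y peaks, per group (tied rows are all kept).
peaks <- d[d$z == 1, c("subject", "session", "x", "y")]
print(peaks)
```

If only one row per group is wanted, the ties need an explicit rule, for example keeping the first flagged row.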

Regards
Petr

 
set.seed(123)
d <- data.frame(
 subject = gl(3,6,labels=c("01","02","03")),
 session = gl(2,3,18),
 x = gl(3,1,18),
 y = sample(11:15, 18, replace=TRUE))
 
library(plyr)
ddply(d, .(subject, session), transform,
  z = ifelse(y == max(y), 1, 0))
 
 Peter Ehlers
 


Re: [R] replacing with NA

2012-05-17 Thread Petr PIKAL
 
 x[is.na(z)] <- NA
 
 This might send you a nasty bug if x and z are different lengths
 though -- just a head's up.

Another option
x*!is.na(z)*z
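
A short sketch of the replacement using the x and z from the question, with a length check added as a guard against the mismatch mentioned above. The names x1 and x2 are only for illustration.

```r
x <- 1:20
z <- c(11, 15, 17, 2, 18, 6, 7, NA, 12, 10,
       21, 25, 27, 12, 28, 16, 17, NA, 12, 10)

# Logical indexing silently misbehaves if the lengths differ, so check first.
stopifnot(length(x) == length(z))

# Option 1: assign NA by logical index (on a copy, to keep x intact).
x1 <- x
x1[is.na(z)] <- NA

# Option 2: build the result in one expression.
x2 <- ifelse(is.na(z), NA, x)

which(is.na(x1))   # positions that were set to NA
```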

Regards
Petr


 
 Michael
 
 On Wed, May 16, 2012 at 12:55 PM, Mintewab Bezabih
 mintewab.beza...@economics.gu.se wrote:
  Dear R users,
 
  I was wondering how I can replace the values of a vector with the
  values from another vector in the same row.
 
  For example, how can I replace the value of x below with NA when the
  value of z in the same row is NA?
  x <- 1:20
  z <- c(11, 15, 17, 2, 18, 6, 7, NA, 12, 10, 21, 25, 27, 12, 28, 16, 17,
         NA, 12, 10)
 
 
  Many thanks
  Mintewab
 
  
  From: Mintewab Bezabih
  Sent: 15 May 2012 15:53
  To: r-help@r-project.org
  Cc: r-help@r-project.org
  Subject: missing observations
 
  Dear R users,
 
  I have missing observations in my data that I remove in my analysis. I
  am able to run my code all right, but I want the non-missing values to
  be correctly identified and therefore want to carry my id vector along
  in my results. Since the vector of ids has no role in the analysis, I
  don't know how to include it.
 
  Here is my reproducible example; id is the vector I want to add to the
  analysis somehow so that my missing values are identified. I cannot
  use the na.action function, and that is why I have to drop my missing
  observations beforehand.
 
 
  library(fields)
  x <- 1:20
  y <- runif(20)
  z <- c(11, 15, 17, 2, 18, 6, 7, NA, 12, 10, 21, 25, 27, 12, 28, 16,
         17, NA, 12, 10)
  id <- 1:20
 
  mydataset <- data.frame(x, y, z)
  temperature <- mydataset[complete.cases(mydataset),]
 
  x <- temperature[, c(1)]
  y <- temperature[, c(2)]
  z <- temperature[, c(3)]
 
  tpsfit <- Tps(cbind(x, y), z, scale.type="unscaled")
 
 
 
 
  Many thanks as always.
  Regards,
  Mintewab
 


Re: [R] error code trying to extract second column from coeftest output

2012-05-17 Thread Petr PIKAL
Hi

 
 I want to use the standard error values in the summary that is produced 
using
 coeftest, but I am getting an error code- any ideas?

Check the structure of the coeftest object with

str(coeftest(lmodT_WBHO))

and from this you can deduce how to select the second column.
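
A hedged sketch of what str() reveals, using the built-in cars data instead of the poster's model (lmodT_WBHO is not available here). The coefficient table is matrix-like, so columns are selected with [ , ] rather than $. The lmtest branch only runs if that package is installed.

```r
fit <- lm(dist ~ speed, data = cars)

# summary()'s coefficient table is a plain numeric matrix ...
ct <- coef(summary(fit))

# ... and coeftest() returns a matrix-like object with the same columns.
if (requireNamespace("lmtest", quietly = TRUE)) {
  ct <- lmtest::coeftest(fit)
}

str(unclass(ct))          # a numeric matrix with dimnames, so `$` does not apply
se <- ct[, "Std. Error"]  # equivalent to ct[, 2]
se
```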

Regards
Petr


 
  library(lmtest)
  coeftest(lmodT_WBHO)
 
 t test of coefficients:
 
      Estimate Std. Error t value  Pr(>|t|)
 t1W  5.94819    0.17072 34.8410 < 2.2e-16 ***
 t2W  6.56216    0.17438 37.6322 < 2.2e-16 ***
 t3W  6.08252    0.16525 36.8082 < 2.2e-16 ***
 t4W  6.18041    0.17028 36.2949 < 2.2e-16 ***
 t1B  5.5        0.50566 10.8768 < 2.2e-16 ***
 t2B  5.65000    0.53034 10.6535 < 2.2e-16 ***
 t3B  4.52381    0.51756  8.7406 < 2.2e-16 ***
 t4B  4.38095    0.51756  8.4646 < 2.2e-16 ***
 t1H  5.05000    0.53034  9.5221 < 2.2e-16 ***
 t2H  4.8        0.55903  8.5465 < 2.2e-16 ***
 t3H  5.52632    0.54412 10.1564 < 2.2e-16 ***
 t4H  4.71429    0.63388  7.4372 2.236e-13 ***
 t1O  5.17647    0.57524  8.9988 < 2.2e-16 ***
 t2O  5.81818    0.50566 11.5060 < 2.2e-16 ***
 t3O  6.5        0.63388 10.2543 < 2.2e-16 ***
 t4O  5.71429    0.63388  9.0147 < 2.2e-16 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
 
  se1 <- coeftest(lmodT_WBHO)$coef[,2]
 Error in coeftest(lmodT_WBHO)$coef : 
   $ operator is invalid for atomic vectors
  
 
 


Re: [R] how to find outliers from the list of values

2012-05-17 Thread Petr PIKAL
Hi

I have not seen any answer yet; maybe nobody wants to touch the elusive 
subject of outliers. Neither do I, but here are some ideas on how one 
can proceed.

First of all, it is always up to you what is considered an outlier and 
how you will deal with it. 

I usually call an outlier any item which does not fit the pattern, and 
the pattern is usually best observed with some plotting function. You can 
identify outlying points, inspect the data source, and correct typing 
mistakes; only if the value was really measured and you cannot find any 
reason why it has such a value is it a real outlier. Then ***you*** need 
to decide what to do with it: discard it, consider that it may come from 
some long-tailed distribution, ...

So here are my $0.02 regarding the outlier theme.
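
As a rough sketch of "look at the pattern first, then decide", here are two common univariate screening rules from base R applied to the eight values in the quoted question. The cutoff of 3 for the robust z-score is a convention, not something from this thread, and neither rule replaces judgement about the data.

```r
x <- c(11489, 11008, 11873, 8000, 9558, 8645, 8024, 8371)

# Look at the data first; a pattern is easier to judge from a plot.
# boxplot(x); stripchart(x)

# Tukey's boxplot rule: points beyond 1.5 * IQR from the hinges.
flagged <- boxplot.stats(x)$out

# Robust z-scores based on median and MAD; 3 is a conventional cutoff.
rz <- abs(x - median(x)) / mad(x)
suspect <- x[rz > 3]

flagged
suspect
```

With the values exactly as they appear in the post, neither rule flags anything, which fits the point above: whether a value counts as an outlier depends on the pattern you expect, not on a function call.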

Regards
Petr

 
 Hi,
 I am new to R and I would like to get your help in finding
 'outliers'.
 I have the mvoutlier package installed on my system and have loaded it,
 but I am not able to find a function in the 'mvoutlier' package which
 will identify 'outliers'.
 This is the sample list of data I have got, which has one outlier:
 11489  11008  11873  8000  9558  8645  8024  8371
 It would be of great help if somebody has an example script for this.
 
 Thanks & Regards,
 Thomas
 


Re: [R] Matrix heatmap

2012-05-11 Thread Petr PIKAL
Hi

 
 how do I plot only the data below 10? everything is white for the 0-10 
and
 10-90 is black ..

What data below 10? I do not see any. You posted some mails before, but I 
do not keep all mails from R-help, only those which helped me somehow.

Basically

x <- sample(1:100, 100, replace=TRUE)
x[x<10]

gives you only values of x below 10.

When you put it into another object you can do anything with it.
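
A small sketch of the same idea for a matrix, assuming the goal is to colour only the cells below 10. The matrix m is random toy data; the image() call is left commented out as one possible way to draw the result.

```r
set.seed(1)
m <- matrix(sample(0:90, 25, replace = TRUE), nrow = 5)

# Values below 10 as a plain vector.
small <- m[m < 10]

# Keep the matrix shape: blank out everything that is 10 or more.
m_small <- m
m_small[m_small >= 10] <- NA

# image() leaves NA cells empty, so only the 0-9 range would be coloured.
# image(m_small, col = gray.colors(10), zlim = c(0, 9))
```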

 those functions which do this?
 was bad for such basic questions, but I started tinkering with R is 6 
days

Maybe it is time to go through your R installation and find the R intro 
document. You can save time by reading at least some chapters of it; 
it is not so long.

Regards
Petr

 


[R] Odp: arguments must have same length

2012-05-11 Thread Petr PIKAL
Hi


 
  score <- read.csv("http://users.stat.umn.edu/~chen2285/hw/ACT.csv")


 score <- read.csv("http://users.stat.umn.edu/~chen2285/hw/ACT.csv")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : unable to connect to 'users.stat.umn.edu' on port 80.


Unable to read data.

 
  interaction.plot(sex, rep(1,861), score, fun=mean, legend=F,
                   main="profile of sex")
 Error: tapply(response, list(x.factor, trace.factor), fun) : 
   arguments must have same length

However, you are not telling us the whole story. score is probably a data 
frame, and length() of a data frame gives its number of columns.

From the help page:

x.factor: a factor whose levels will form the x axis.
trace.factor: another factor whose levels will form the traces.
response: a numeric variable giving the response.
 
Your second argument to interaction.plot is numeric; it should be a factor.
Your third argument is probably a data frame; it should be numeric.

It is advisable to give functions the values they expect. Some functions 
can crunch unexpected data, but the results can be misleading if not wrong.
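
A minimal sketch of a call with the argument types the help page asks for. The data are simulated, since the original CSV could not be read; sex, type and score here are stand-ins, and which column of the real data frame holds the response is not known.

```r
set.seed(42)
n <- 60
sex   <- factor(sample(c("F", "M"), n, replace = TRUE))
type  <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
score <- round(rnorm(n, mean = 20, sd = 4))   # numeric vector, not a data frame

# All three arguments need one value per observation.
stopifnot(length(sex) == n, length(type) == n, length(score) == n)

# Cell means that interaction.plot() will draw.
cell_means <- tapply(score, list(type, sex), mean)

# x axis: type; one trace per sex; response: score.
interaction.plot(x.factor = type, trace.factor = sex, response = score,
                 fun = mean, legend = TRUE,
                 main = "Mean score by type and sex")
```

If the response sits inside a data frame, pass the single numeric column rather than the whole data frame.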

Regards
Petr

 
  length(sex)
 [1] 861
  length(type)
 [1] 861
  length(score)
 [1] 3
 
 How can I alter the length of the variable score here to make this 
function
 work?
 or any other way to perform interaction plot? 
 
 


Re: [R] averaging two tables (rows with columns)

2012-05-10 Thread Petr PIKAL
Hi

as already mentioned, your data cannot be deciphered. Use

dput(table1) to send usable data.

From what you describe, probably

?aggregate can be used.
But without suitable data you will hardly get any advice.
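
For what it is worth, here is one possible sketch based on the two tables as they can be read from the quoted question below, typed in by hand as matrices. It uses a matrix product rather than aggregate, and it assumes that "average tolerance" means the mean over the species present in a plot that also appear in table 2.

```r
# Table 1 from the question: species occurrence per plot (1 = present).
occ <- matrix(c(1, 0, 1, 0,
                0, 1, 1, 0,
                0, 0, 0, 1,
                1, 0, 1, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("Plot", 1:4),
                              c("speciesX", "speciesY", "speciesZ", "speciesXX")))

# Table 2: tolerance per species and environment (speciesZ is absent).
tol <- matrix(c(0.21, 0.40, 0.17,
                0.10, 0.15, 0.18,
                0.14, 0.16, 0.19),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("speciesX", "speciesY", "speciesXX"),
                              c("EnviA", "EnviB", "EnviC")))

# Use only species that occur in both tables.
common <- intersect(colnames(occ), rownames(tol))
occ_c  <- occ[, common, drop = FALSE]

# Sum of tolerances of the species present, divided by how many are present.
avg_tol <- (occ_c %*% tol[common, , drop = FALSE]) / rowSums(occ_c)
avg_tol
```

This follows table 1 as written, so Plot1 averages only speciesX. The poster's own result table also counts speciesXX for Plot1 (0.175 for EnviA), which does not match the occurrence data as posted.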

Regards
Petr

 
 
 Hi R users, I am struggling to figure out how I can calculate the average
 from the two tables in R. Can anyone help me? I would really be grateful
 for your help; I am spending so much time trying to figure it out. It
 should not be so hard, I think.
 I have very big data, but I have created hypothetical data for
 simplification.
 For example, I have table 1:
 
 Table 1: species occurrence data
 
         speciesX  speciesY  speciesZ  speciesXX
 Plot1   1         0         1         0
 Plot2   0         1         1         0
 Plot3   0         0         0         1
 Plot4   1         0         1         0
 
 
 
 Table 2: species tolerance data
 
             EnviA  EnviB  EnviC
 speciesX    0.21   0.4    0.17
 speciesY    0.1    0.15   0.18
 speciesXX   0.14   0.16   0.19
 
 
 You may have noticed that table 2 does not have species Z, which was in
 table 1.
 
 Now I want to get the average value of species tolerance in each plot
 based on each environmental value (EnviA or EnviB etc.). An example of
 the outcome (the final table I am looking for):
 
 Result table 1a (Table 3): average species tolerance in each plot based
 on EnviA
 
         speciesX  speciesY  speciesZ  speciesXX  Average
 Plot1   0.21      NA        Nodata    0.14       0.175
 Plot2   NA        0.1       Nodata    NA         0.1
 Plot3   NA        NA        Nodata    0.14       0.14
 Plot4   0.21      NA        Nodata    NA         0.21
 
 
 
 Result table 1b (Table 4): average species tolerance in each plot based
 on EnviB
 
         speciesX  speciesY  speciesZ  speciesXX  Average
 Plot1   0.4       NA        Nodata    0.16       0.28
 Plot2   NA        0.15      Nodata    NA         0.15
 Plot3   NA        NA        Nodata    0.16       0.16
 Plot4   0.4       NA        Nodata    NA         0.4
 
 
 
 Would any one help me how I can calculate these?Thanks
 Kristi Golver==
 


Re: [R] Matrix heatmap

2012-05-10 Thread Petr PIKAL
Hi

what is wrong with

 heatmap(as.matrix(test), col=my.colors(25))

with test from your dput

Regards
Petr

 The heat map generated the correct result:
 
 library(gplots)
 arq <- read.table(l)
 matrix_l <- data.matrix(arq)
 my.colors <-
 colorRampPalette(c("gray0","gray10","gray20","gray30","gray40",
   "gray50","gray60","gray80","gray90","gray100"))
 heatmap.2(matrix_l, dendrogram="none", Rowv=NA, Colv=NA,
   col=my.colors(256))
 
 --
 
 Now I have the following file with 5 data, similar to the above:
 
RF2   RF00013   RF00100   RF00381   RF00434   RF00453   RF00165 
 RF00496   RF00497
 RF00014   RF00048   RF00234   RF00163   RF8   RF00094   RF00032 
 RF00028   RF00216
 RF00487   RF00209   RF00465   RF00485   RF00363   RF00366
 RF2   63   7   5   7   17   12   14   5   23   3   56   14   72   84 
 
 15   64   20   0   1   8   6   65   3   4
 RF00013   45   7   4   6   17   12   14   5   23   3   56   12   60   84 
 
 15   64   20   0   0   2   2   65   3   4
 RF00100   22   1   5   3   2   9   0   0   0   0   5   0   16   8   1 0 
 0   0   0   0   0   26   2   3
 RF00381   63   7   5   13   17   11   3   5   18   3   56   14   33   12 
 
 2   15   4   18   12   25   11   69   3   4
 RF00434   2   0   0   3   17   11   14   5   23   3   55   12   59   84  

 15   64   20   0   0   0   0   40   1   3
 RF00453   3   1   0   2   16   12   13   3   7   0   45   12   42   78 
 15   53   20   0   0   0   0   33   2   0
 RF00165   0   0   0   2   10   1   14   1   7   0   44   12   38   68 13
 48   20   0   0   0   0   18   0   0
 RF00496   0   0   0   0   0   0   1   5   6   0   0   0   4   2   0   0  

 0   0   0   0   0   0   0   0
 RF00497   0   0   0   3   10   0   12   5   23   3   40   8   37   77 15
 64   20   0   0   0   0   20   0   0
 RF00014   0   0   0   0   0   0   0   0   8   3   6   0   0   0   0   0  

 0   0   0   0   0   0   0   0
 RF00048   3   1   0   3   17   10   14   5   23   3   56   12   59   83  

 15   64   20   0   0   0   0   52   3   3
 RF00234   62   7   5   6   17   12   14   5   23   3   56   14   70   84 
 
 15   64   20   0   0   0   1   65   3   4
 RF00163   63   7   5   7   17   12   14   5   23   3   56   14   75   84 
 
 15   64   21   6   1   10   9   65   3   4
 RF8   3   1   0   3   17   12   14   5   23   3   56   12   58   84  

 15   64   20   0   0   0   0   52   3   2
 RF00094   0   0   0   0   0   1   11   0   1   0   0   0   34   73   15  

 49   20   0   0   0   0   12   0   0
 RF00032   0   0   0   3   10   1   14   5   23   3   56   12   43   80 
 15   64   20   0   0   0   0   21   0   0
 RF00028   63   7   5   13   17   12   14   5   23   3   56   14   75 84 
 15   64   30   23   14   25   20   85   3   4
 RF00216   63   7   5   13   17   12   14   5   23   3   56   14   75 84 
 15   64   28   23   14   25   20   85   3   4
 RF00487   63   7   5   13   17   12   14   5   23   3   56   14   75 84 
 15   64   28   20   14   25   16   83   3   4
 RF00209   50   7   5   3   2   2   0   0   0   0   1   2   26   4   0 0 
 1   0   8   25   5   28   3   3
 RF00465   59   7   5   10   7   11   0   0   10   3   11   2   32   9 1 
 3   6   15   5   14   20   63   3   4
 RF00485   63   7   5   13   17   12   14   5   23   3   56   14   75 84 
 15   64   26   17   14   25   19   85   3   4
 RF00363   5   3   0   3   10   1   1   5   20   3   50   12   44   24 5 
 5   0   0   0   0   0   42   3   3
 RF00366   8   2   1   4   14   9   13   5   23   3   52   12   51   68 
 12   8   0   0   0   0   0   48   3   4
 
 Now I have the following file with 5 data, similar to the above:
 Is represented by an array of 25x25 and 10x10 not like the previous
 
 when I give the command dput (arch) it returns me the following output:
 
 structure(list(RF2 = c(63L, 45L, 22L, 63L, 2L, 3L, 0L, 0L, 
 0L, 0L, 3L, 62L, 63L, 3L, 0L, 0L, 63L, 63L, 63L, 50L, 59L, 63L, 
 5L, 8L), RF00013 = c(7L, 7L, 1L, 7L, 0L, 1L, 0L, 0L, 0L, 0L, 
 1L, 7L, 7L, 1L, 0L, 0L, 7L, 7L, 7L, 7L, 7L, 7L, 3L, 2L), RF00100 = c(5L, 

 4L, 5L, 5L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 5L, 5L, 0L, 0L, 0L, 5L, 
 5L, 5L, 5L, 5L, 5L, 0L, 1L), RF00381 = c(7L, 6L, 3L, 13L, 3L, 
 2L, 2L, 0L, 3L, 0L, 3L, 6L, 7L, 3L, 0L, 3L, 13L, 13L, 13L, 3L, 
 10L, 13L, 3L, 4L), RF00434 = c(17L, 17L, 2L, 17L, 17L, 16L, 10L, 
 0L, 10L, 0L, 17L, 17L, 17L, 17L, 0L, 10L, 17L, 17L, 17L, 2L, 
 7L, 17L, 10L, 14L), RF00453 = c(12L, 12L, 9L, 11L, 11L, 12L, 
 1L, 0L, 0L, 0L, 10L, 12L, 12L, 12L, 1L, 1L, 12L, 12L, 12L, 2L, 
 11L, 12L, 1L, 9L), RF00165 = c(14L, 14L, 0L, 3L, 14L, 13L, 14L, 
 1L, 12L, 0L, 14L, 14L, 14L, 14L, 11L, 14L, 14L, 14L, 14L, 0L, 
 0L, 14L, 1L, 13L), RF00496 = c(5L, 5L, 0L, 5L, 5L, 3L, 1L, 5L, 
 5L, 0L, 5L, 5L, 5L, 5L, 0L, 5L, 5L, 5L, 5L, 0L, 0L, 5L, 5L, 5L
 ), RF00497 = c(23L, 23L, 0L, 18L, 23L, 7L, 7L, 6L, 23L, 8L, 23L, 
 23L, 23L, 23L, 1L, 23L, 23L, 23L, 23L, 0L, 10L, 23L, 20L, 23L
 ), RF00014 = c(3L, 3L, 0L, 3L, 3L, 0L, 0L, 0L, 3L, 3L, 3L, 3L, 
 3L, 3L, 0L, 3L, 3L, 3L, 3L, 0L, 3L, 3L, 3L, 3L), RF00048 = c(56L, 
 56L, 5L, 56L, 

Re: [R] Problem with Median

2012-05-09 Thread Petr PIKAL
Hi

 
 
 I might be silly but if I was going to type in dput() then how should I 
 send the data over here? 

 dput(zdrz20)

outputs to your console

structure(list(sklon = c(95, 95, 40, 40, 40, 40, 20, 20, 20, 
20, 20, 20, 20), ot = c(15, 4, 10, 15, 4, 1.5, 1.5, 4, 10, 15, 
4, 10, 15), doba = c(5.88333, 15.75, 12.5, 9.16667, 27, 
65.1667, 88, 38.25, 17., 12.5, 38.2, 17.3, 12.5)), .Names = c("sklon", 
"ot", "doba"), row.names = c(NA, 13L), class = "data.frame")

You can copy it into your mail, and anybody can just paste it and assign 
it to an object.

a.AC <- subset(data, class == "A-C", select = a)
This probably results in a data frame (you can check with str(a.AC)), and 
as such you cannot pass it directly to the median function.
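
A short sketch of the fix, with the six rows from the quoted question typed in by hand (the poster reads them from data.csv). The names m1 and m2 are only for illustration.

```r
data <- data.frame(
  a     = c(12, 3, 78, NA, 8, 12),
  b     = c(0, 97, NA, NA, 33, NA),
  c     = c(90, 11, 123, 12, 2, 0),
  class = c("A-B", "A-B", "A-C", "A-C", "A-B", "A-D")
)

a.AC <- subset(data, class == "A-C", select = a)
str(a.AC)   # a one-column data frame, not a numeric vector

# Pull the column out as a vector before calling median().
m1 <- median(a.AC$a, na.rm = TRUE)

# Or skip the intermediate data frame entirely.
m2 <- median(data$a[data$class == "A-C"], na.rm = TRUE)
```

Without na.rm = TRUE the result would be NA here, because one of the two A-C rows has a missing value in column a.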

Regards
Petr


 Instead, I've just uploaded the image online, you can access it via the 
link below.
 http://i1165.photobucket.com/albums/q585/halfpirate/data.jpg 
 
  Date: Mon, 7 May 2012 14:55:24 -0400
  Subject: Re: [R] Problem with Median
  From: sarah.gos...@gmail.com
  To: bell_beaut...@hotmail.com
  CC: r-help@r-project.org
  
  Please use dput() to give us your data (eg dput(data) ) rather than
  simply pasting it in.
  
  Sarah
  
  On Mon, May 7, 2012 at 2:52 PM, Suhaila Haji Mohd Hussin
  bell_beaut...@hotmail.com wrote:
  
   Hello.
   I'm trying to compute median for a filtered column based on other
   column but there was something wrong. I'll show how I did step by step.
   Here's the data:
 
         a    b    c  class
   1    12    0   90  A-B
   2     3   97   11  A-B
   3    78   NA  123  A-C
   4    NA   NA   12  A-C
   5     8   33    2  A-B
   6    12   NA    0  A-D
 
   On the command I typed:
   1) data = read.csv("data.csv")
   2) a.AC <- subset(data, class == "A-C", select = a)
   3) median(a.AC)
      Error in median.default(a.AC) : need numeric data
   4) is.numeric(a.AC)
      FALSE
   5) as.numeric(a.AC)
      Error: (list) object cannot be coerced to type 'double'
   How can I fix this? Please help.
   Cheers,
   Suhaila
  
  
  -- 
  Sarah Goslee
  http://www.functionaldiversity.org


Re: [R] Matrix heatmap

2012-05-09 Thread Petr PIKAL
Hi

 
 arq <- read.table(file) 
  arq_matrix <- data.matrix(arq) 

Are you sure that arq_matrix is numeric? Did you check it somehow?

 dput(arq)

You forgot to include the dput(arq) result. Without that, only you know 
what arq is.

  arq_heatmap <- heatmap(arq_matrix, Rowv = NA, Colv = NA,
    col = heat.colors(256), scale = "column", margins = c(5,10)) 
 
 dput done with this command, but still gave the same ..
 
 I do it before generating the heatmap?
 
 would be this way?

Which way?

Regards
Petr

 


Re: [R] Problems Exporting R Output to an xls file need help

2012-05-07 Thread Petr PIKAL
Hi

Or you can use base function

write.table(varxls, file="some output file.xls", sep = "\t", row.names = F)
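
A minimal sketch of the base-R route with a small stand-in data frame, written to a temporary directory. The column names are made up; with the forecast package one could try as.data.frame(varxls) to get the real table, which is an assumption not run here.

```r
# Stand-in for the forecast table (first two rows of the posted output).
out <- data.frame(
  point = c(4074831, 4091730),
  lo80  = c(3388438, 3357988),
  hi80  = c(4761224, 4825473)
)

tab_file <- file.path(tempdir(), "forecast.txt")
csv_file <- file.path(tempdir(), "forecast.csv")

# Tab-delimited text; Excel can open it, but it is not a native .xls file.
write.table(out, file = tab_file, sep = "\t", row.names = FALSE, quote = FALSE)

# Comma-separated alternative.
write.csv(out, file = csv_file, row.names = FALSE)

# Read the tab-delimited file back to check the round trip.
back <- read.delim(tab_file)
```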

Regards
Petr

 
 As it says, you need to supply a file name to be outputted to.
 
 write.xls(, file = "abc.xls")
 
 Michael
 
 On Fri, May 4, 2012 at 11:31 AM, PaulJr paulberna...@gmail.com wrote:
  Hello R users,
 
  I want to export to an xls or .csv some predictions I produced with 
the
  auto.arima and forecast functions.
 
  A detail of all my work is presented below. I loaded a package called
  dataframes2xls and tried to use the  function write.xls without any 
success.
 
  Can anybody help me figure this out? How could I get R to export the 
output
  to an xls file?
 
  Any help will be greatly appreciated.
 
  dryfit <- auto.arima(testing1$pcumsdry)
  dryfit
  Series: testing1$pcumsdry
  ARIMA(2,1,1) with drift
 
  Coefficients:
           ar1    ar2      ma1       drift
        0.2684  0.109  -0.8906  -15265.776
  s.e.  0.1145  0.102   0.0798    8047.169
 
  sigma^2 estimated as 2.869e+11:  log likelihood=-2265.02
  AIC=4540.03   AICc=4540.43   BIC=4555.25
  forecast(dryfit, h=60)
      Point Forecast   Lo 80   Hi 80   Lo 95   Hi 95
  157        4074831 3388438 4761224 3025083 5124579
  158        4091730 3357988 4825473 2969569 5213892
  159        4082508 3316633 4848383 2911204 5253812
  160        4072372 3289497 4855247 2875069 5269675
  161        4059142 3263395 4854890 2842151 5276133
  162        4044982 3238520 4851444 2811605 5278359
  163        4030235 3214025 4846446 2781949 5278522
  164        4015229 3189787 4840672 2752824 5277635
  165            490 3165706 4834474 2724010 5276170
  166        3984886 3141748 4828025 2695417 5274355
  167        3969651 3117892 4821410 2666998 5272304
  168        3954400 3094129 4814672 2638728 5270072
  169        3939142 3070452 4807832 2610595 5267689
  170        3923880 3046856 4800903 2582588 5265171
  171        3908616 3023340 4793891 2554704 5262527
  172        3893351     201 4786800 2526937 5259764
  173        3878085 2976536 4779635 2499284 5256886
  174        3862820 2953243 4772397 2471742 5253898
  175        3847554 2930020 4765088 2444307 5250802
  176        3832288 2906866 4757711 2416976 5247600
  177        3817023 2883778 4750267 2389748 5244297
  178        3801757 2860755 4742758 2362619 5240895
  179        3786491 2837796 4735186 2335587 5237395
  180        3771225 2814899 4727552 2308650 5233801
  181        3755960 2792062 4719857 2281805 5230114
  182        3740694 2769284 4712104 2255051 5226337
  183        3725428 2746564 4704292 2228384 5222472
  184        3710162 2723900 4696425 2201804 5218521
  185        3694896 2701291 4688502 2175308 5214485
  186        3679631 2678736 4680525 2148894 5210367
  187        3664365 2656234 4672496 2122561 5206169
  188        3649099 2633783 4664415 2096307 5201891
  189        3633833 2611383 4656284 2070130 5197536
  190        3618568 2589032 4648103 2044029 5193106
  191        3603302 2566730 4639874 2018002 5188602
  192        3588036 2544475 4631597 1992047 5184025
  193        3572770 2522266 4623274 1966163 5179377
  194        3557504 2500104 4614905 1940349 5174659
  195        3542239 2477986 4606492 1914604 5169873
  196        3526973 2455911 4598035 1888925 5165020
  197        3511707 2433880 4589534 1863313 5160101
  198        3496441 2411891 4580992 1837765 5155118
  199        3481176 2389943 4572408 1812280 5150071
  200        3465910 2368036 4563783 1786857 5144962
  201        3450644 2346169 4555119 1761496 5139792
  202        3435378 2324341 4546415 1736194 5134562
  203        3420112 2302552 4537673 1710951 5129273
  204        3404847 2280801 4528893 1685767 5123927
  205        3389581 2259087 4520075 1660639 5118523
  206        3374315 2237409 4511221 1635567 5113063
  207        3359049 2215767 4502332 1610550 5107549
  208        3343784 2194161 4493406 1585587 5101980
  209        3328518 2172589 4484446 1560678 5096358
  210        3313252 2151052 4475452 1535820 5090684
  211        3297986 2129548 4466424 1511015 5084958
  212        3282720 2108078 4457363 1486259 5079181
  213        3267455 2086640 4448270 1461554 5073355
  214        3252189 2065234 4439144 1436898 5067480
  215        3236923 2043860 4429987 1412290 5061556
  216        3221657 2022517 4420798 1387730 5055585
  utils:::menuInstallLocal()
  package 'dataframes2xls' successfully unpacked and MD5 sums checked
  varxls <- forecast(dryfit, h=60)
  write.xls(varxls)
  Error: could not find function "write.xls"
  library(dataframes2xls)
  write.xls(varxls)
  [1] Does 'python' exist, and is it in the path?
  Error in paste(-o , file,  , sep = ) :
   argument file is missing, with no default
 
 

Re: [R] how to do the concentration-time profiles in R?

2012-05-07 Thread Petr PIKAL
 
 Hi, Dear all,
 
 Could you please tell me how to select specified column in dataset and 
how
 to do

my.data[, x]
my.data[, "aa"]
my.data$aa
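
A tiny sketch of the column-selection forms listed above, on a made-up data frame. The column names time and aa and the values are purely illustrative.

```r
my.data <- data.frame(time = c(0, 1, 2, 4), aa = c(0, 8.2, 6.1, 2.9))

by_pos  <- my.data[, 2]      # by position, returns a vector
by_name <- my.data[, "aa"]   # by name, returns a vector
by_dlr  <- my.data$aa        # same vector via $
as_df   <- my.data["aa"]     # one-column data frame
```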


 the concentration-time profiles in R?

What is that?

Besides, you could help yourself a lot by reading an intro to R, which I 
believe is in the doc directory of your R installation.

Regards
Petr


 
 Thank you!
 
 xiaoc
 


Re: [R] Problems Exporting R Output to an xls file need help

2012-05-07 Thread Petr PIKAL
Hi

 
 Petr, your code does not create a native Excel file, and it is 
misleading 
 to name it with an xls extension.

Yes, you are right. It saves a tab-delimited file without row names, which 
can be opened directly in Excel by double-clicking, at least on my 
computers. I agree that it is not a proper Excel file, but it behaves like 
one, so for me it is an Excel file.

Regards
Petr

 
 ---
 Jeff Newmiller
 DCN: jdnew...@dcn.davis.ca.us
 Research Engineer (Solar/Batteries/Software/Embedded Controllers)
 ---
 Sent from my phone. Please excuse my brevity.
 
 Petr PIKAL petr.pi...@precheza.cz wrote:
 
 Hi
 
 Or you can use base function
 
 write.table(varxls, file="some output file.xls", sep = "\t", row.names = F)
 
 Regards
 Petr
 
  

[R] Odp: how to deduplicate records, e.g. using melt() and cast()

2012-05-07 Thread Petr PIKAL
Hi

I would vote for aggregate():

> aggregate(my.df[,-1], list(pathway=my.df$pathway), mean, na.rm=TRUE)
  pathway cond.one cond.two cond.three
1    pw.A      0.5      0.6        NaN
2    pw.B      0.4      0.9        0.1
3    pw.C      NaN      0.2        NaN
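
Note that mean() over a group with only NA values gives NaN, while the wanted.df in the question has NA in those cells. A minimal sketch of one way around this, assuming each pathway has at most one non-missing value per condition (the helper name first_non_na is made up for illustration):

```r
# Example data as posted in the question
my.df <- data.frame(
  pathway    = c(rep("pw.A", 2), rep("pw.B", 3), "pw.C"),
  cond.one   = c(0.5, NA, 0.4, NA, NA, NA),
  cond.two   = c(NA, 0.6, NA, 0.9, NA, 0.2),
  cond.three = c(NA, NA, NA, NA, 0.1, NA)
)

# Hypothetical helper: first non-missing value of a group, NA if there is none
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else x[1]
}

# One row per pathway; all-NA groups stay NA instead of becoming NaN
dedup <- aggregate(my.df[, -1], list(pathway = my.df$pathway), first_non_na)
dedup
```

If a pathway can carry several different non-missing values for one condition, a summary such as mean(x, na.rm = TRUE) is probably the safer choice.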


Regards
Petr


 
 Esteemed UseRs,
 
 This must be embarrassingly trivial to achieve with e.g., melt() and 
 cast(): deduplicating records (pw.X in example) for a given set of 
 responses (cond.Y in example).
 
 Hopefully the runnable example shows clearly what I have and what I'm 
 trying to convert it to. But I'm just not getting it, ?cast that is! So 
 I'd really appreciate someone's patience to clarify this, using the 
 reshape package, or any other approach.
 
 With sincere thanks in advance,
 
 Karl
 
 
 ## Runnable example
 ## The data.frame I have:
 library(reshape)
 my.df <- data.frame(pathway = c(rep("pw.A", 2), rep("pw.B", 3), 
 rep("pw.C", 1)),
 cond.one = c(0.5, NA, 0.4, NA, NA, NA),
 cond.two = c(NA, 0.6, NA, 0.9, NA, 0.2),
 cond.three = c(NA, NA, NA, NA, 0.1, NA))
 my.df
 ## The data frame I want:
 wanted.df <- data.frame(pathway = c("pw.A", "pw.B", "pw.C"),
 cond.one = c(0.5, 0.4, NA),
 cond.two = c(0.6, 0.9, 0.2),
 cond.three = c(NA, 0.1, NA))
 wanted.df
 
 
 -- 
 Karl Brand
 Dept of Cardiology and Dept of Bioinformatics
 Erasmus MC
 Dr Molewaterplein 50
 3015 GE Rotterdam
 T +31 (0)10 703 2460 |M +31 (0)642 777 268 |F +31 (0)10 704 4161
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] colours in a pdf

2012-05-04 Thread Petr PIKAL
Hi

One option for a substantial range of distinguishable colours is jet.colors() 
from the matlab package.
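
jet.colors() needs the matlab package to be installed. As a hedged sketch with base R only, rainbow() from grDevices can also produce one colour per file; n_files and the plotted lines below are invented for illustration, and with 42 colours neighbouring hues will still be hard to tell apart:

```r
# Assumed number of input files (42 is the largest group mentioned in the thread)
n_files <- 42

# One colour per file instead of the 8 colours of the default palette
colsVec <- rainbow(n_files)

# Dummy plot: one horizontal line per file, coloured by its index
plot(NULL, xlim = c(0, 1), ylim = c(0, n_files),
     xlab = "x", ylab = "file index")
for (file in seq_len(n_files)) {
  lines(c(0, 1), c(file, file), col = colsVec[file])
}
```

The same vector can then be passed to legend() as col = colsVec so that plot and legend stay in step.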

Regards
Petr
 

 Hi,
 
 Thanks for the help. Extending the palette to 16 or 20 would be a big 
 help. The largest number of files I've had to handle in a single group is 
 42 and I wouldn't expect it to get much bigger than that.
 
 I'll take a look at RColorBrewer.
 
 Cheers,
 
 Gavin.
 
 -Original Message-
 From: R. Michael Weylandt [mailto:michael.weyla...@gmail.com] 
 Sent: 04 May 2012 12:43
 To: Gavin Blackburn
 Cc: r-help@r-project.org
 Subject: Re: [R] colours in a pdf
 
 How many colors are you looking for? There are limits to how many the
 eye can make out, but perhaps the RColorBrewer package would be a
 place to start. Also check out: http://colorbrewer2.org/
 
 To see all the builtin colors, you can simply use the colors()
 function, but your viewer won't be able to distinguish most of them.
 
 Michael
 
 On Fri, May 4, 2012 at 7:39 AM, Gavin Blackburn
 gavin.blackb...@strath.ac.uk wrote:
  Hi,
 
  I'm plotting PDFs and have a problem. If I have more than 8 sources of
  data the colours are repeated. These plots are used to remove poor data
  from the sets so it would be helpful if I could expand the colour range.
  Is there any way to do this?
 
  The plots are coloured by defining a vector colsVec <- 1:(length(mzXMLfiles))
 
  And defined in the loop for (file in 1:length(mzXMLfiles))
 
  as col=file
 
  The legend is then coloured using the same vector using col=colsVec
 
  Any help would be greatly appreciated. At the moment I have to break up
  data into sets of 8 and hope that a large number of those are good data
  sets so I have an aid to identify bad sets.
 
  Thanks,
 
  Gavin.
 
  Dr. Gavin Blackburn
  SULSA Technologist
 
  Strathclyde institute of Pharmacy and Biomedical Science
  161 Cathedral Street,
  Glasgow.
  G4 0RE
 
  Tel: +44 (0)1415483828
 
  ScotMet: The Scottish Metabolomics Facility
  http://www.metabolomics.strath.ac.uk
 
 
 [[alternative HTML version deleted]]
 


Re: [R] R help!

2012-05-03 Thread Petr PIKAL
Hi

I would convert it to a proper date-time format, and then you can extract 
anything.

dat <- strptime("12/31/11 23:45", format="%m/%d/%y %H:%M")
as.Date(dat)
[1] "2011-12-31"
format(dat, "%H:%M")
[1] "23:45"
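
Applied to a whole column, the same idea might look like the sketch below. The data frame z and its Date column follow the question; the two sample values and the new column name Day are assumptions. The last lines show the plain split on the space, which needs the vector first and the separator second in strsplit():

```r
# Made-up sample data; the second entry has no leading zeros
z <- data.frame(Date = c("12/31/11 23:45", "1/5/12 7:05"),
                stringsAsFactors = FALSE)

# Parse once, then extract the date and the time parts
dt <- strptime(z$Date, format = "%m/%d/%y %H:%M", tz = "UTC")
z$Day  <- as.Date(dt)
z$Time <- format(dt, "%H:%M")

# Alternative: split the raw strings on the space
parts <- do.call(rbind, strsplit(z$Date, " ", fixed = TRUE))
z
parts
```

Parsing keeps proper Date values that sort and subtract correctly, while the strsplit() route only yields character pieces.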

Regards
Petr
 

 
 Hello there, I was wondering if you could help me with a quick R issue.
 
 I have a data set where one of the columns has both date and time in
 it, e.g. 12/31/11 23:45 in one cell. I want to use R to split this
 column into two new columns: date and time.
 
 One of the problems with splitting here is that when the dates go into
 single digits there are no 0's in front of months January-September
 (e.g., January is represented by 1 as opposed to 01), so every entry
 is a different length. Therefore, splitting by the space is the only
 option, I think.
 
 Here's the coding I've developed thus far:
 
 z$dt <- z$Date  # time and date is all under z$Date
 foo <- strsplit(" ", z$dt)  # attempted split based on the space
 
 And then if that were to work, I would proceed to use the coding:
 
 foo2 <- matrix(unlist(foo), ncol = 2, byrow=TRUE)
 z$Date <- foo[ ,1]
 z$Time <- foo[ ,2]
 
 However, foo <- strsplit(" ", z$dt) isn't working. Do you know what
 the problem is? If you could respond soon, that would be greatly
 appreciated!
 
 Thanks so much!
 Alex
 


Re: [R] list objects calculation

2012-05-02 Thread Petr PIKAL
Hi

Maybe you can use mapply.

If you have 2 lists

x <- 1:5  # x as implied by the printed output below
xl <- list(x, x+5)
> xl
[[1]]
[1] 1 2 3 4 5

[[2]]
[1]  6  7  8  9 10

> yl <- list(9,10)
> yl
[[1]]
[1] 9

[[2]]
[1] 10

and this function

fff <- function(xl,yl) (xl-yl)/yl

mapply(fff, xl, yl)
       [,1] [,2]
[1,] -0.889 -0.4
[2,] -0.778 -0.3
[3,] -0.667 -0.2
[4,] -0.556 -0.1
[5,] -0.444  0.0

it probably gives you the desired result.
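
If each element of the first list is itself a two-column matrix, as t[[1]] in the question seems to be, a sketch along the following lines may be closer to the original layout. The names t_list, k_list and rel_diff and the second list elements are invented for illustration; SIMPLIFY = FALSE keeps the result as a list with one matrix per element:

```r
# Two-element stand-ins for the 731-element lists t and k in the question
t_list <- list(cbind(1:5, 6:10), cbind(2:6, 7:11))
k_list <- list(c(9, 10), c(4, 5))

# Column-wise (m - k) / k: column j of m is paired with k[j]
rel_diff <- function(m, k) {
  sweep(sweep(m, 2, k, "-"), 2, k, "/")
}

# One result matrix per pair of list elements
x_list <- mapply(rel_diff, t_list, k_list, SIMPLIFY = FALSE)
x_list[[1]]
```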

Regards
Petr


 
 I have two kinds of list,
 for example,  one is like
 t[[1]]=
 1 6
 2 7
 3 8
 4 9
 5 10
 ...
 t[[731]]
 the other is
 k[[1]]= 9 10
 ...
 k[[731]]
 I want to have a new list,like x
 x[[1]]=
 (1-9)/9  (6-10)/10
 (2-9)/9  (7-10)/10
 (3-9)/9  (8-10)/10
 (4-9)/9  (9-10)/10
 (5-9)/9  (10-10)/10
 ...
 x[[731]]
 How should I do?
 Thank you.
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/list-objects-calculation-tp4602437.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] Different varable lengths

2012-04-30 Thread Petr PIKAL
 Hi!
 
 I'm trying to do a lm() test on three objects. My problem is that R protests
 and says that the variable lengths differ for one of the objects
 (Sweden.GDP.gap). But I have double checked that the number of observations

You should check it again. BTW, how did you check it?

 are the same. All three objects should contain 9 observations but R only
 accepts 9 observations in two of the objects. The third must have 10! Very

No, it has only 8. Why do you think it should have 10?

 confusing because there is no 10th observation!

You are right, there is no 10th and also no 9th observation.

 
  # Adjusted Real rate - P
  Sweden.p.adjust <- c(4.70243, 1.3776655, 1.117755, 1.6695175, 1.59282,
  1.1017625, -0.04295, 2.2552875, 0.0552875)
  
  # Adjusted Inflation deviation
  Sweden.infl.dev.adjust <- c(0.110382497, -0.261612509, 0.040847515,
  -0.195062497, -0.234362485, -0.023408728, 0.206421261, -0.079401261,
  0.071828752)
  
  # Adjusted GDP-gap
  Sweden.GDP.gap.adjust <- c(0.673792123, 1.196706756, 1.196131539,
  0.646944002, -0.312886525, -1.180620213 -0.525964648, -0.369401194,
  -0.003280389)
  
  # OLS regression using ADJUSTED data.#
  Sweden.Taylor.real.adjust <- lm(Sweden.p.adjust ~ Sweden.infl.dev.adjust +
  Sweden.GDP.gap.adjust)
 Error in model.frame.default(formula = Sweden.p.adjust ~
 Sweden.infl.dev.adjust +  : 
   variable lengths differ (found for 'Sweden.GDP.gap.adjust')
 
 Why is this happening?

Because your third variable has only 8 values.
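
The reason is visible in the posted code: there is no comma between -1.180620213 and -0.525964648, so R subtracts one from the other and stores a single value. A small sketch of how length() exposes this; the shortened names gap and gap_fixed are made up here:

```r
# As posted: the missing comma turns two values into one subtraction
gap <- c(0.673792123, 1.196706756, 1.196131539,
         0.646944002, -0.312886525, -1.180620213 -0.525964648, -0.369401194,
         -0.003280389)
length(gap)

# With the comma restored there are 9 observations again
gap_fixed <- c(0.673792123, 1.196706756, 1.196131539,
               0.646944002, -0.312886525, -1.180620213, -0.525964648,
               -0.369401194, -0.003280389)
length(gap_fixed)
```

Calling length() on every variable before lm() is a cheap way to check this kind of problem instead of counting by eye.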

Regards
Petr

 
 / Saint
 
 
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Different-varable-lengths-tp4597768.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] Subsetting dataframe with missing values

2012-04-27 Thread Petr PIKAL
Hi


 
 Dear R-community, 
 
 I am using R (V 2.14.1) on Windows 7. I have a dataset which consists of 19
 variables for 91 individuals or rows. Two of my variables are Age
 (adult/chick, with no NA values) and Sex (0 for females/1 for males, with
 quite a few NA values). The sex of many adult birds is unknown (entered as
 NA in dataframe). At some point of my analyses, I happen to need to
 work with only male adults, so I tried subsetting the dataframe as follows
 (see code below) but I get a new dataframe containing all the males but also
 a lot of unneeded information such as data in rows 1-7 (NAs), 13, 14, 19 and
 21-30. I suspect this is caused by NAs in the variable Sex because
 everything goes fine (I get a dataframe containing adults) if I run the same
 code but without the "& Data$Sex == 1" part. 
 
 How can I fix this problem? Is there a straightforward way of subsetting
 efficiently when NAs are present in the original dataset? 
 Thank you so much!

I usually do it in 2 lines:

selection <- which(Data$Category == "Adult" & Data$Sex == 1)
Data[selection, ]

could be what you want.

Or you can do

adult.males <- adult.males[!is.na(adult.males$Sex),]
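
Why the extra rows appear: a logical index that contains NA returns an all-NA row for each NA, whereas which() keeps only the positions that are TRUE. A minimal sketch on a made-up four-row data frame (the column names follow the question, the values are invented):

```r
Data <- data.frame(ID       = c("LAA01", "LAA02", "LAA03", "LAA04"),
                   Category = c("Adult", "Adult", "Chick", "Adult"),
                   Sex      = c(1, NA, 1, 0),
                   stringsAsFactors = FALSE)

# The condition is NA wherever an adult's Sex is NA
keep <- Data$Category == "Adult" & Data$Sex == 1

with_na_rows <- Data[keep, ]         # contains an all-NA row for the NA
adult.males  <- Data[which(keep), ]  # which() drops the NA positions

# subset() also treats NA in the condition as FALSE
same <- subset(Data, Category == "Adult" & Sex == 1)
```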

Regards
Petr


 
 Luciano 
 
 adult.males <- Data[Data$Category == "Adult" & Data$Sex == 1,] 
 adult.males
 
          ID Category Sex  Beak   Head
 NA     <NA>     <NA>  NA    NA     NA
 NA.1   <NA>     <NA>  NA    NA     NA
 NA.2   <NA>     <NA>  NA    NA     NA
 NA.3   <NA>     <NA>  NA    NA     NA
 NA.4   <NA>     <NA>  NA    NA     NA
 NA.5   <NA>     <NA>  NA    NA     NA
 NA.6   <NA>     <NA>  NA    NA     NA
 9     LAA10    Adult   1 57.40 121.95
 10    LAA11    Adult   1 56.40 113.00
 11    LAA12    Adult   1 52.00 111.85
 13    LAA14    Adult   1 56.55 124.85
 15    LAA16    Adult   1 57.15 120.10
 NA.7   <NA>     <NA>  NA    NA     NA
 NA.8   <NA>     <NA>  NA    NA     NA
 21    LAA22    Adult   1 56.85 117.35
 22    LAA23    Adult   1 54.80 117.45
 27    LAA28    Adult   1 59.00 116.75
 28    LAA29    Adult   1 55.95 124.25
 NA.9   <NA>     <NA>  NA    NA     NA
 30    LAA31    Adult   1 57.70 112.80
 NA.10  <NA>     <NA>  NA    NA     NA
 NA.11  <NA>     <NA>  NA    NA     NA
 NA.12  <NA>     <NA>  NA    NA     NA
 NA.13  <NA>     <NA>  NA    NA     NA
 NA.14  <NA>     <NA>  NA    NA     NA
 NA.15  <NA>     <NA>  NA    NA     NA
 NA.16  <NA>     <NA>  NA    NA     NA
 NA.17  <NA>     <NA>  NA    NA     NA
 NA.18  <NA>     <NA>  NA    NA     NA
 NA.19  <NA>     <NA>  NA    NA     NA
 


Re: [R] Rconsole file fails to remember GUI settings, and script

2012-04-27 Thread Petr PIKAL
Hi

Isn't it possible just to change the values in the etc/Rconsole file?

## Colours for console and pager(s)
# (see rw/etc/rgb.txt for the known colours).
background = White
normaltext = NavyBlue
usertext = Red
highlight = DarkRed

You can customise it and keep your settings somewhere. With each new 
installation, just copy the appropriate part of the Rconsole file and save it.
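
Keeping a copy could be scripted roughly as follows. This is only a sketch: the helper name backup_settings and the backup folder are assumptions, and the Rconsole file exists only in Windows installations, so the function returns FALSE elsewhere:

```r
# Hypothetical helper: copy a settings file into a backup folder.
# Returns TRUE on success, FALSE if the source file does not exist.
backup_settings <- function(src, dest_dir) {
  if (!file.exists(src)) return(FALSE)
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  file.copy(src, file.path(dest_dir, basename(src)), overwrite = TRUE)
}

# The etc directory of the running R installation
rconsole <- file.path(R.home("etc"), "Rconsole")

# Assumed backup location; any folder that survives a reinstall would do
saved <- backup_settings(rconsole, file.path(tempdir(), "r-settings"))
saved
```

After a new installation the saved file can be copied back the same way with source and destination swapped.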

Regards
Petr

 
 Sorry, I don't use Windows... can't help there.
 
 It's customary to keep cc'ing the list on these sorts of things so
 someone with a more similar platform/ more expertise can take the
 issue up if needed. Maybe someone else will be able to help you out.
 
 Michael
 
 On Fri, Apr 27, 2012 at 9:05 AM, geo theory geotheo...@gmail.com wrote:
  Hi there
 
  Windows 7. Sorry should've mentioned that.
 
  On Fri Apr 27, 2012 at some time, Michael Weylandt wrote:
  The GUI's are OS specific -- which is yours?
 
  You might also want to check out RStudio -- also open source -- which
  provides a very nice cross-platform IDE: http://rstudio.org/
 
  There are nice coloration options (both customizable and by default)
  -- I quite like the cobalt theme
 
  Michael
 
  On Fri, Apr 27, 2012 at 7:04 AM, geotheory geotheo...@gmail.com wrote:
  Am encountering two related problems since the 2.15 release. Apologies in
  advance for a mundane non code-related post, but you know how it is. I'm
  using the basic R GUI.
 
  1: I use a black environment for no glare, so pre-version 2.15 I've had
  black backgrounds and white text. Since the 2.15 release my script window
  text has colour turned itself black (so is now invisible) and now I cannot
  find any setting in the options list that changes it. Anyone know which one
  its supposed to be?
 
  2: I initially changed the script window to mid grey (so I could see the
  text) and saving to the standard 'Rconsole' file, but now R fails to
  remember that setting when it reloads. So I've tried resetting the GUI
  settings by deleting all 'Rconsole' files (found 1 on C drive and the one in
  my personal R folder) but to no avail. It just doesn't remember new
  settings, and its unclear where R is getting its current GUI settings from.
  R!
 
  More generally, can't someone behind this otherwise great project please
  update the GUI settings to make them more comprehensive. And user-friendly
  - the 'console and pager colours' section is abysmal.
 
  --
  View this message in context: 
  http://r.789695.n4.nabble.com/Rconsole-file-fails-to-remember-GUI-settings-and-script-window-text-colour-option-is-missing-tp4592349p4592349.html
  Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] Merge function - Return NON matches

2012-04-27 Thread Petr PIKAL
Hi

If you used shorter names for your objects, you would probably get more 
readable advice.

Is this what you wanted?

truncated_dataframe[truncated_dataframe$CLAIM_NO %in% 
setdiff(truncated_dataframe$CLAIM_NO, truncated_list$CLAIM_NO),]
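
The same selection can be written with a negated %in%. A minimal sketch on invented data, where the names claims_df and exclude and all values are made up; only the CLAIM_NO column name comes from the thread:

```r
# Stand-in for the multi-column data.frame
claims_df <- data.frame(CLAIM_NO = c(20L, 83L, 500L, 1440L, 9999L),
                        AMOUNT   = c(10, 20, 30, 40, 50))

# Stand-in for the one-column "list" of claim numbers
exclude <- data.frame(CLAIM_NO = c(20L, 83L, 1440L))

# Rows of claims_df whose CLAIM_NO does not occur in exclude
non_matches <- claims_df[!(claims_df$CLAIM_NO %in% exclude$CLAIM_NO), ]

# The setdiff() form from the reply above selects the same rows
same <- claims_df[claims_df$CLAIM_NO %in%
                    setdiff(claims_df$CLAIM_NO, exclude$CLAIM_NO), ]
non_matches
```

Both forms compare values, so CLAIM_NO should have the same type (for example integer) in both objects; converting one side with format(as.character()) can pad the strings and break the match.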

Regards
Petr

 
 Hi there,
 I've tried the noted solutions:
 
 If you do `no <- unlist(hrc_78_clm_no)`, do you get a character vector 
 of claim numbers you want to exclude? If so, then `subset(whatever, 
 !CLAIM_NO %in% no)` should work.
 
 I converted the CLAIM_NO list to a character, with
 
  hrc78_clmno_char <- format(as.character(hrc78_clm_no))
  is.character(hrc78_clmno_char)
 [1] TRUE
 
 Then I applied your code (above), which didn't work.  Thanks though!
 
 Thanks for the dput() help.  Here is truncated output of the list (its class
 is data.frame, I call it a list for communication sake) & data.frame. 
 Again, your help is most appreciated!
 
 Goal: merge the list & data.frame together.  Output the data.frame, but with
 rows where the CLAIM_NO variable between the list & data.frame *do not
 match*.
 
 *The List*
 truncated_list <- hrc78_clm_no[1:100,] #So you can see consistency in previously-mentioned variables
 truncated_list <- structure(list(CLAIM_NO = c(20L, 83L, 1440L, 4439L, 7002L,
 9562L, 10463L, 12503L, 16195L, 
 22987L, 30760L, 32108L, 32640L, 33045L, 36241L, 37091L, 37934L, 
 38663L, 39456L, 40544L, 40630L, 40679L, 40734L, 43054L, 53483L, 
 54155L, 56151L, 58113L, 61050L, 62056L, 63014L, 68486L, 68541L, 
 69298L, 69983L, 73379L, 76810L, 79975L, 91124L, 97697L, 100524L, 
 105808L, 112659L, 112955L, 113422L, 114522L, 124159L, 133566L, 
 135167L, 137387L, 137954L, 138186L, 144574L, 148573L, 150013L, 
 152193L, 154680L, 155414L, 165954L, 171223L, 175077L, 176359L, 
 177656L, 178155L, 182250L, 182393L, 182832L, 184245L, 185542L, 
 186038L, 186087L, 186098L, 186294L, 186550L, 186897L, 187025L, 
 190180L, 191472L, 192593L, 196207L, 196689L, 197372L, 197537L, 
 197590L, 197730L, 197874L, 198294L, 198750L, 198823L, 199076L, 
 199233L, 199284L, 199468L, 199661L, 199913L, 200150L, 200279L, 
 200473L, 200927L, 202407L), .Names = c("CLAIM_NO"), class = "data.frame"))
 
 *The (multi-column) data.frame, but greatly truncated*
 truncated_dataframe <- bestPartAreadmin[1:25, 1:4]
 truncated_dataframe <- structure(list(DESY_SORT_KEY = c(10193L,
 10193L, 10193L, 
 10574L, 10574L, 19213L, 19213L, 19213L, 100026636L, 
 100040718L, 100055111L, 100060558L, 100060558L, 100060558L, 100072978L, 
 100096346L, 100130451L, 100168782L, 100168782L, 100168782L, 100168782L, 
 100168782L, 100168782L, 100174887L, 100177905L), PRVDR_NUM =
 structure(c(1368L, 
 1353L, 1406L, 149L, 149L, 1362L, 1393L, 1367L, 1557L, 1370L, 
 1360L, 1362L, 1362L, 1362L, 1372L, 1358L, 193L, 196L, 196L, 61L, 
 166L, 196L, 196L, 311L, 1363L), .Label = c(010001, 010006, 
 010015, 010016, 010029, 010033, 010034, 010035, 010039, 
 010040, 010046, 010049, 010083, 010092, 010108, 010131, 
 010149, 01S001, 01S033, 01S046, 01S145, 020001, 020006, 
 020012, 020017, 021306, 021311, 030002, 030006, 030007, 
 030010, 030011, 030012, 030013, 030014, 030016, 030023, 
 030024, 030030, 030033, 030036, 030037, 030038, 030043, 
 030055, 030061, 030062, 030064, 030065, 030067, 030069, 
 030078, 030083, 030085, 030087, 030088, 030089, 030092, 
 030093, 030100, 030101, 030102, 030103, 030105, 030108, 
 030110, 030111, 030114, 030115, 030117, 030118, 030119, 
 030120, 030121, 030122, 030123, 030126, 030128, 031300, 
 031305, 031311, 032000, 032001, 032002, 032006, 033025, 
 033028, 033029, 033032, 033034, 033036, 034004, 034013, 
 034020, 034024, 03S002, 03S006, 03S007, 03S016, 03S022, 
 03S023, 03S089, 03T002, 03T055, 03T061, 03T069, 03T093, 
 03T103, 03T114, 03T117, 03T126, 040004, 040007, 040010, 
 040011, 040016, 040022, 040026, 040027, 040029, 040036, 
 040041, 040047, 040055, 040062, 040072, 040080, 040084, 
 040088, 040091, 040114, 040118, 040119, 043028, 044005, 
 04S027, 04S084, 04T041, 04T062, 04T119, 050002, 050006, 
 050007, 050008, 050009, 050013, 050014, 050016, 050017, 
 050018, 050022, 050024, 050025, 050026, 050030, 050036, 
 050038, 050039, 050040, 050042, 050043, 050045, 050046, 
 050047, 050055, 050056, 050057, 050058, 050060, 050063, 
 050069, 050070, 050071, 050073, 050075, 050076, 050077, 
 050078, 050079, 050082, 050084, 050089, 050090, 050091, 
 050093, 050099, 050100, 050101, 050102, 050103, 050104, 
 050107, 050108, 050110, 050111, 050112, 050113, 050115, 
 050116, 050118, 050121, 050122, 050124, 050125, 050126, 
 050128, 050129, 050131, 050132, 050133, 050135, 050136, 
 050137, 050138, 050139, 050140, 050145, 050146, 050149, 
 050150, 050152, 050153, 050158, 050159, 050168, 050169, 
 050174, 050179, 050180, 050188, 050191, 050193, 050195, 
 050196, 050197, 050204, 050211, 050219, 050222, 050224, 
 050225, 050226, 050228, 050230, 050231, 050232, 050234, 
 050235, 050236, 050238, 050239, 050242, 050243, 050245, 
 050248, 050254, 050257, 050261, 050262, 050264, 050272, 
 050276, 

Re: [R] repeat matrix rows as a whole

2012-04-26 Thread Petr PIKAL
Hi

what about

> rbind(a,a)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    1    2    3    4
[4,]    5    6    7    8
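
For many repetitions, typing rbind(a, a, ...) by hand gets tedious. A short sketch of two ways to repeat the whole block a given number of times; the name n_rep and the value 3 are chosen only for illustration:

```r
a <- matrix(1:8, 2, 4, byrow = TRUE)
n_rep <- 3  # assumed number of repetitions

# Repeat the row indices 1, 2, 1, 2, ... and index once
stacked <- a[rep(seq_len(nrow(a)), times = n_rep), , drop = FALSE]

# Equivalent: build a list of copies and bind them together
same <- do.call(rbind, replicate(n_rep, a, simplify = FALSE))

stacked
```

The indexing version avoids creating the intermediate list, which may matter for large matrices.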

Regards
Petr


 
  a <- matrix(1:8, 2, 4, byrow=TRUE)
  a
      [,1] [,2] [,3] [,4]
 [1,]    1    2    3    4
 [2,]    5    6    7    8
  a[c(1,2,1,2),]
      [,1] [,2] [,3] [,4]
 [1,]    1    2    3    4
 [2,]    5    6    7    8
 [3,]    1    2    3    4
 [4,]    5    6    7    8
 
 
 On Wed, Apr 25, 2012 at 8:32 PM, Rebecca rebecca0...@yahoo.cn wrote:
 
  Hi,
  If I have a matrix like
  1  2  3  4
  5  6  7  8,
  how can I repeat two rows as whole, to be like
  1  2  3  4
  5  6  7  8
  1  2  3  4
  5  6  7  8?
 
  Since I have more than two rows in a matrix and I need to repeat many times, I
  wonder whether there is a convenient command to do so.
 
  Thanks!
 [[alternative HTML version deleted]]
 
 
 
 
 
[[alternative HTML version deleted]]
 


Re: [R] Trouble with [sv]apply

2012-04-20 Thread Petr PIKAL
Hi
 
 On Fri, Apr 20, 2012 at 4:42 PM, Jeff Newmiller
 jdnew...@dcn.davis.ca.us wrote:
  If you read the help, it talks about compiling vectors into matrices, or
  scalars into vectors. It does not say anything about combining matrices.
 
  For the error about 14 elements, you should keep in mind that matrices
  are just vectors with dim attributes that indicate how the linear memory
  is to be folded.
 
  As far as I know, the standard way to handle combining matrices as you
  want to would involve storing them in a list and using Reduce and rbind.
  If you can vectorize the whole process instead of segmenting it by groups
  of rows then you can speed things up considerably.
 
 Thank you.  That was helpful.  I did read the help on [vsl]apply.  But
 the idea that matrices are folded vectors was, of course not there.

Here is the Details section of the help page for array:

An array in R can have one, two or more dimensions. It is simply a vector 
which is stored with additional attributes giving the dimensions 
(attribute "dim") and optionally names for those dimensions (attribute 
"dimnames"). 

and chapter 2.8 of R-intro:

2.8 Other types of objects
Vectors are the most important type of object in R, but there are several 
others which we will meet more formally in later sections.
 - matrices or more generally arrays are multi-dimensional generalizations 
   of vectors. In fact, they are vectors that can be indexed by two or more 
   indices and will be printed in special ways. See Chapter 5 [Arrays and 
   matrices], page 18.

Quite a short and useful chapter.
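
Both points can be tried out in a few lines. The sketch below first shows that a vector becomes a matrix once it gets a dim attribute, and then assembles matrices with varying row counts by collecting them in a list and binding once, as suggested earlier in the thread. A.f is a simplified version of the function in the question, without the cat() call:

```r
# A matrix is a vector with a dim attribute
v <- 1:6
dim(v) <- c(2, 3)
is.matrix(v)

# Simplified A.f: i rows (possibly zero), always 7 columns
A.f <- function(i) matrix("a", i, 7)

# Collect the pieces in a list, then bind them row-wise in one step
pieces   <- lapply(c(0, 1, 2), A.f)
combined <- do.call(rbind, pieces)
dim(combined)
```

vapply() cannot do this because it requires every result to have the same length, which is what the "result is length 14" error reports.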

Regards
Petr



 
 Worik
 
  
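  ---------------------------------------------------------------------------
  Jeff Newmiller                        The     .....       .....  Go Live...
  DCN:jdnew...@dcn.davis.ca.us          Basics: ##.#.       ##.#.  Live Go...
                                        Live:   OO#.. Dead: OO#..  Playing
  Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
  /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
  ---------------------------------------------------------------------------
  Sent from my phone. Please excuse my brevity.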
 
  Worik R wor...@gmail.com wrote:
 
 Friends
 
 I clearly donot understand how sapply and vapply work.
 
 What I have is a function that returns a matrix with an indeterminate
 number of rows (some times zero) but a constant number of columns.  I
 cannot reliably use an apply function to assemble the matrices into a
 matrix.  I am not sure it is possible.
 
 I can demonstrate the core of my confusion with this simple code.
 
 A.f <- function(i){
   ret <- matrix("a", i, 7)
   cat(i, class(ret), dim(ret), "\n")
   return(ret)
 }
 V.f <- function(){
   SS <- vapply(c(1,2),
                A.f,
                rep('a', 7))
   return(SS)
 }
 S.f <- function(){
   SS <- sapply(c(1,2),
                A.f)
   cat(SS, class(SS), dim(SS), "\n")
   return(SS)
 }
 
 
 Calling V.f() fails:
 
  V.f()
 1 matrix 1 7
 2 matrix 2 7
 Error in vapply(c(1, 2), A.f, rep("a", 7)) :
   values must be length 7,
  but FUN(X[[2]]) result is length 14
 
 
 
 Calling S.f() returns a list.
 
 
 Do I have to accept I am going to be getting a list and I have to
 assemble a matrix in a loop?
 
 cheers
 Worik
 


Re: [R] Re : Sort out number on value

2012-04-20 Thread Petr PIKAL
Hi

Without data it is only a guess.
 
 I found out something strange when I used the same thing on another data
 file. 
 
 In an Excel file I have the same data too, and there I asked in a certain
 column which values were above 7.5. Result: 206. 
 Now I have done the same thing in R and I get as result: 400. 

I was tempted to assign it to FAQ 7.31 about finite precision of decimals. 
However such a big difference is rather strange.

 
 My code: 
 # Find values above 7.5. 
 C = x[x >= 7.5]
 
 # Calculate length. 
 number = length[C]

Shouldn't it be length(C)? 

 
 # print outcome
 number 

I personally suspect NA values, which are retained after subsetting. Output 
from

str(your object)

could help.

Maybe you want only the number of non-missing values. For that you should use

sum(!is.na(C))
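
A small made-up vector shows how missing values inflate the count. The numbers below are invented; only the threshold 7.5 comes from the question:

```r
x <- c(8.1, 7.5, NA, 3.2, NA, 9.0)

# Subsetting with a condition keeps one NA for every NA in x
C <- x[x >= 7.5]
length(C)        # counts the NA entries too

# Count only the real values at or above the threshold
sum(!is.na(C))
sum(x >= 7.5, na.rm = TRUE)
```

If the first count is larger than the other two, the difference is the number of missing values in the column.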

But maybe I am on a wrong track.

Regards
Petr


 
 Or could it have something to do with the fact that I calculated the log2
 value from a value that was like: 7.84394 -01? However, I don't think that
 was the problem since I checked the first 10 log2 values, and they were
 correct. 
 
 I just don't understand why the answer is now higher, 400 and not 206. 
 
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Sort-out-number-on-value-tp4573467p4573764.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] Help in using unique count by match function

2012-04-19 Thread Petr PIKAL
Hi

Your question is rather cryptic. Why should the output be 3? What does a 
unique count have to do with the match function?

Maybe you want something that is described in the help for switch.

See

?switch
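
If the goal is to select the operation by name, one possible reading of the question is sketched below: match.fun() also finds user-defined functions, so a unique count can be requested in the same way as max. The names n_unique and apply_by_name are invented for illustration:

```r
# Hypothetical helper: number of distinct values in x
n_unique <- function(x) length(unique(x))

# Look the function up by name (or take it as is) and apply it to x
apply_by_name <- function(x, par1) {
  FUN <- match.fun(par1)
  FUN(x)
}

apply_by_name(c("a", "b", "a", "c"), "n_unique")
apply_by_name(c(4, 9, 2), "max")
```

Passing the function object itself, as in apply_by_name(x, n_unique), works too and avoids the lookup by name.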

Regards
Petr

 
 Hi
 
 My code looks like this
 
 I have two parameters x and par1. x contains values and par1 contains the
 function which I need to use
 
 if par1 is max then the output should be max(x). 
 
   FUN <- match.fun(par1)
   result=FUN(x)
 
 Is it possible to incorporate the unique count of x within this code
 
 eg
 
 x=c("a","b","a","c")  . The output should be 3 
 
 -
 Thanks in Advance
 Arun
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Help-in-using-unique-count-by-match-function-tp4569859p4569859.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] loss of information in pdf plots

2012-04-17 Thread Petr PIKAL
Hi

 
 Hi there
 
 is it possible that pdfs generated using the pdf() function with default
 settings leads to loss of information? I was plotting copy number changes
 from Agilent 180k data in form of rectangles (rect()) while each rectangle
 represents one region of copy number change. When plotting into a pdf I
 noticed that some very small rectangles do not appear (even after
 extensive zooming) in the pdf using the pdf() function. But they do when
 writing the screen output into a pdf using the GUI. Does anyone have some
 advice on this how I can plot pdfs without losing information?

I am not sure if it has something to do with dingbats, but you could try 
it.

Set

useDingbats=FALSE

in your pdf() call.
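
A sketch of such a call follows. The file name and the two rectangles are invented, and drawing the border in the fill colour is an extra assumption that may help very narrow rectangles stay visible; it is not confirmed to cure the problem described above:

```r
# Assumed output location
out <- file.path(tempdir(), "regions.pdf")

pdf(out, useDingbats = FALSE)
plot(c(0, 100), c(0, 1), type = "n", xlab = "position", ylab = "")

# One wide and one very narrow region; the border has the fill colour
rect(xleft = c(10, 50.00), ybottom = 0.2,
     xright = c(30, 50.05), ytop = 0.8,
     col = "red", border = "red")
dev.off()
```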

Regards
Petr

 
 Best wishes
 
 Kristian
 
 R version 2.14.0 (2011-10-31)
 Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
 
 locale:
 [1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base
 
 other attached packages:
 [1] CGHregions_1.12.0 CGHcall_2.14.0CGHbase_1.12.0marray_1.32.0
 [5] limma_3.10.3  Biobase_2.14.0DNAcopy_1.28.0impute_1.28.0
 
 loaded via a namespace (and not attached):
 [1] tools_2.14.0
 
 
 
 Arbeitsgruppenleiter Integrative Biologie / Head of Integrative Biology Group
 Abteilung für Strahlenzytogenetik / Research Unit of Radiation Cytogenetics
 
 Tel.: +49-89-3187-3515
 
 
 
 Helmholtz Zentrum München
 Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH)
 Ingolstädter Landstr. 1
 85764 Neuherberg
 www.helmholtz-muenchen.de
 Aufsichtsratsvorsitzende: MinDir´in Bärbel Brumme-Bothe
 Geschäftsführer: Prof. Dr. Günther Wess und Dr. Nikolaus Blum
 Registergericht: Amtsgericht München HRB 6466
 USt-IdNr: DE 129521671
 
[[alternative HTML version deleted]]
 


Re: [R] How to change color of bar based on y value of y axis?

2012-04-17 Thread Petr PIKAL
Hi

 
 Hi, 
 
 
 I m working on bar chart. 
 
 *Input file:*
 index   -5   1
 index   -4   3
 index   -3   2
 index   -2   10
 index   -1   7
 index   0   2
 index   1   1
 
 barplot(t(as.matrix(i[3])), ylab = "value", main = "testdata", beside = TRUE,
         col = c("burlywood1"), horiz = TRUE, cex.names = 0.8,
         names.arg = t(as.matrix(i[, 2])))

You are still failing to provide a reproducible example.

Maybe you want this.

x <- sample(1:10, 5)
barplot(x, col = "burlywood1")
set.seed(333)
y <- sample(-3:0, 5, replace = TRUE)
barplot(x, col = c("burlywood1", "red")[(y == -3) + 1])
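
The same indexing trick applied to the numbers from the question could look roughly like this. The vector names idx, val and bar_col are made up here, and the data are typed in by hand rather than read from the poster's file.

```r
# Data from the question: an index from -5 to 1 and one value per index.
idx <- -5:1
val <- c(1, 3, 2, 10, 7, 2, 1)
# (idx == -3) is TRUE/FALSE; adding 1 gives 2 or 1, which picks the colour.
bar_col <- c("burlywood1", "red")[(idx == -3) + 1]
barplot(val, names.arg = idx, col = bar_col, horiz = TRUE, cex.names = 0.8)
```

Because the colour vector is computed from idx, it follows the data if the input changes.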

Regards
Petr

 
 But i need to change the color of bar where value of  y = -3 
dynamically.
 How can i implement it. 
 
 Regards http://r.789695.n4.nabble.com/file/n4566636/Screenshot.png 
 


Re: [R] How to keep spacing in column name while reading data from data frame?

2012-04-17 Thread Petr PIKAL
Hi

 
 Hi, 
 
  I am working on dataframe and column names are multiwords but when i 
read
 it it become one word with space relplaced by .  How can i keep 
normall
 spacing reading file. 
 
 for e.g. 
 
 input file
 
 Data Test   Data Out
 35
 54
 
 But when i read this data in table it becomes
 Data.TestData.Out
 35
 54
 
 Pls help me out. Thanks

You need to set the check.names argument to FALSE when reading in your file.
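
A small sketch of the difference, assuming a tab-separated file. The content below is typed in as text and is only a stand-in for the real file.

```r
# Two tab-separated columns whose names contain a space (made-up content).
txt <- "Data Test\tData Out\n3\t5\n5\t4\n"
d1 <- read.table(text = txt, header = TRUE, sep = "\t")
d2 <- read.table(text = txt, header = TRUE, sep = "\t", check.names = FALSE)
names(d1)   # names passed through make.names(): the space becomes a dot
names(d2)   # names kept as they are in the file
d2[["Data Test"]]   # such columns are accessed with [[ ]] or backticks
```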

Regards
Petr


 
 
 


Re: [R] row.names in dunes and dunes.env?

2012-04-12 Thread Petr PIKAL
Hi

see inline

 
 Hello,
 
 I've got a small dataset on box turtle shell measurements that I would 
 like to perform a detrended correspondence analysis on. I thought that 
it 
 would be interesting to examine the morphometrics for each species in 
the 
 area of overlap and in areas where neither species occurs. 
 
 I've taken a look at the dune and dune.env datasets in vegan. Using the 
 str() command gives me 
 
  str(dune)
 'data.frame':   20 obs. of  30 variables:
  $ Belper: num  3 0 2 0 0 0 0 2 0 0 ...
  $ Empnig: num  0 0 0 0 0 0 0 0 0 0 ...
  $ Junbuf: num  0 3 0 0 0 0 0 0 0 0 ...
  $ Junart: num  0 0 0 3 0 0 4 0 0 3 ...
  ...
 
 However, when I try looking directly at the data frame using the edit 
 command I see that there is a column called row.names to the left of 
Belper.
 
 Likewise, when I use the str() command on dune.env I get
 
  str(dune.env)
 'data.frame':   20 obs. of  5 variables:
  $ A1        : num  3.5 6 4.2 5.7 4.3 2.8 4.2 6.3 4 11.5 ...
  $ Moisture  : Ord.factor w/ 4 levels "1"<"2"<"4"<"5": 1 4 2 4 1 1 4 1 2 4 ...
  $ Management: Factor w/ 4 levels "BF","HF","NM",..: 1 4 4 4 2 4 2 2 3 3 ...
  $ Use       : Ord.factor w/ 3 levels "Hayfield"<"Haypastu"<..: 2 2 2 3 2 2 3 1 1 2 ...
  $ Manure    : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 3 4 5 4 3 5 4 3 1 1 ...
 
 but using the edit() command shows a column named row.names.

No. This is not a column; it is what it says, the row names.

str(rosin)
'data.frame':   10 obs. of  5 variables:
 $ pytel: int  1 2 3 4 5 6 7 8 9 10
 $ rstr : num  1.022 0.981 0.992 1.01 0.976 ...
 $ gama : num  1.4 1.44 1.41 1.43 1.39 ...
 $ cas  : int  0 3 6 9 12 15 18 21 24 27
 $ typ  : chr  "anatas" "anatas" "anatas" "anatas" ...


head(rosin)
  pytel      rstr     gama cas    typ
1     1 1.0216621 1.397885   0 anatas
2     2 0.9809663 1.442439   3 anatas
3     3 0.9916211 1.411767   6 anatas
^^ these are row names

 
 I assume that the the row.names column is used to link the two files 
together.

If you are in doubt, the recommended way is to consult the documentation.

?row.names
"All data frames have a row names attribute, a character vector of length
the number of rows with no duplicates nor missing values."

 
 My turtle data is saved as a *.csv, and I've added a column called 
 row.names, so that it looks like this
 
 row.names,CL,CCL,CW,CCW,CH,CCH
 1,104.4,131.8,89.887,137.4,43.391,89.7
 2,108.79,135.9,87.78,118.1,50.72,71.2
 3,114.12,126.1,89.33,132.8,142.39,78.3
 4,102.87,128.2,84.2,125,45.42,72.4
 5,84.6,104.8,72.61,111.8,41.1,57.3
 
 I've called this file turtles_dca.csv. I've also created a file called 

 turtles_dca_env.csv that looks like this
 
 row.names,Species,Sex,Distribution,Concatenated,Species_overlap
 1,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
 2,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
 3,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
 4,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
 5,Terrapene_ornata,Female,overlap,TO_F_Overlap,TO_Overlap
 
 However, when I read the data into R using this command
 
 turtles.env = read.csv("turtles_dca_env.csv", header = TRUE)
 
 
 and then using the str() command I get 
 
 
  str(turtles)
 'data.frame':   67 obs. of  7 variables:
  $ row.names: int  1 2 3 4 5 6 7 8 9 10 ...
  $ CL   : num  104.4 108.8 114.1 102.9 84.6 ...
  $ CCL  : num  132 136 126 128 105 ...
  $ CW   : num  89.9 87.8 89.3 84.2 72.6 ...
  $ CCW  : num  137 118 133 125 112 ...
  $ CH   : num  43.4 50.7 142.4 45.4 41.1 ...
  $ CCH  : num  89.7 71.2 78.3 72.4 57.3 73.4 67 57 68.8 68 ...
 
 When I run decorana() on this dataset, it appears that the column 
 row.names is included in the analysis, which isn't what I'm looking 
for. 

Then why did you add this column to your data?

 
 If I go ahead and delete the column row.names from my data frames 
(i.e. 
 removing it from turtles and turtles.env), I don't believe that the 
 analysis is performed correctly. The two species differ significantly in 

 most of their measurements, but the ordihull() and ordispider() commands 

 show them overlapping almost completely.
 
 I think that I'm missing something pretty basic about inputting and 
 formatting this data for this analysis. Can anyone offer a suggestion on 

 where I'm going astray? I can send a copy of the data if anyone wants to 
look at it.

I am not familiar with the functions you use. However, you probably want to
link those 2 files together. If they are both in the same order you can
just do

turtles.complet <- cbind(turtles, turtles.env)

Or, if they are in a different order, you need to find some common column(s)
and

?merge

those two files.
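
A rough sketch of both routes. The two text snippets are made-up stand-ins for the CSV files, the column name id is hypothetical, and reading the first column with row.names = 1 is my suggestion rather than anything vegan requires.

```r
# Made-up stand-ins for the two CSV files; the first column holds the row id.
meas_txt <- "id,CL,CW\n1,104.4,89.9\n2,108.8,87.8\n3,114.1,89.3\n"
env_txt  <- "id,Species,Sex\n3,Terrapene_ornata,Female\n1,Terrapene_ornata,Female\n2,Terrapene_ornata,Male\n"

# row.names = 1 turns the first column into row names, so it is no longer
# a variable that would end up in the analysis.
turtles     <- read.csv(text = meas_txt, row.names = 1)
turtles.env <- read.csv(text = env_txt,  row.names = 1)

# Put the second table into the order of the first, then bind the columns.
turtles.env <- turtles.env[rownames(turtles), ]
turtles.complete <- cbind(turtles, turtles.env)

# The same link with merge(), matching on the row names.
merged <- merge(turtles, turtles.env, by = "row.names")
```

With real files the text = argument would be replaced by the file name.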

Regards 
Petr


 
 Best wishes,
 Chris
 University of Central Oklahoma

Re: [R] number of warnings

2012-04-12 Thread Petr PIKAL
Hi
 
 Any help ?

Call 999

Regards
Petr

 


Re: [R] selective labels display on histogram

2012-04-12 Thread Petr PIKAL
Hi

 
 Hello,
 Is it possible to selectively display labels on a histogram?

What labels?

Like that?

x <- rnorm(1)
hist(x)
hist(x, axes = FALSE, xlab = "bla", ylab = "ble", main = "bleble")
axis(1, at = c(-4, -1, 1, 4))

Regards
Petr

 
 Thanks
 
 Carol
 


Re: [R] selective labels display on histogram

2012-04-12 Thread Petr PIKAL
Hi

 
 No, data labels on the histogram bars. labels = T in hist displays all 
data labels.

You could probably find it quicker in the documentation.

A plotting command usually creates (invisibly) an object which can be saved
and changed.

h <- hist(x)

h is a list which you can use or modify,

for instance like this

h.lab <- h$counts
h.lab[seq(2, 16, 2)] <- NA
hist(x, labels = as.character(h.lab))

which prints every second label.
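
A variant that does not depend on the number of bins, as a sketch only. The seed and the sample size of 1000 are my assumptions to make it reproducible, and blanking a label with an empty string instead of NA is a choice of mine.

```r
set.seed(1)                        # made-up data for illustration
x <- rnorm(1000)
h <- hist(x, plot = FALSE)         # compute the bins without drawing
h.lab <- as.character(h$counts)
# blank out every second label, whatever the number of bins is
h.lab[seq(2, length(h.lab), by = 2)] <- ""
hist(x, labels = h.lab)
```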

Regards
Petr


 
 thanks
 
 From: Petr PIKAL petr.pi...@precheza.cz
 To: carol white wht_...@yahoo.com 
 Cc: r-h...@stat.math.ethz.ch r-h...@stat.math.ethz.ch 
 Sent: Thursday, April 12, 2012 4:03 PM
 Subject: Re: [R] selective labels display on histogram
 
 Hi
 
  
  Hello,
  Is it possible to selectively display labels on a histogram?
 
 What labels?
 
 Like that?
 
 x <- rnorm(1)
 hist(x)
 hist(x, axes = FALSE, xlab = "bla", ylab = "ble", main = "bleble")
 axis(1, at = c(-4, -1, 1, 4))
 
 Regards
 Petr
 
  
  Thanks
  
  Carol
  


Re: [R] Help Using Spreadsheets

2012-04-10 Thread Petr PIKAL
Hi

 
 You might want to re-read the Intro to R and the section on
 dataframes.  Your spreadsheet is read into R as a dataframe which is
 very similar to an Excel spreadsheet.  Exactly what problem are you
 having with it?  Is it trying to access the data?
 
 2012/4/6 Pedro Henrique lama...@superig.com.br:
  Hi, Petr,
  Thanks for answering.
  Yes, I do read the file with the read.xls command but I do not know 
how to
  read it into an object.

What was the result of reading the file with read.xls? AFAIK any read.*
function reads some data either to the console (and prints them) or into
some object for further use.

mydata <- read.xls(something)

  I read the R-into document chapter of objects, but I is still not 
clear for
  me how to transform this kind of data into an object.

My mind reading ability is rather undeveloped so if you

PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

e.g. **how** did you read your file to R.

Regards
Petr

 
  Regards,
 
  Lämarao
 
 
  - Original Message - From: Petr PIKAL 
petr.pi...@precheza.cz
  To: Pedro Henrique lama...@superig.com.br
  Cc: r-help@r-project.org
  Sent: Friday, April 06, 2012 6:27 AM
  Subject: Hi: [R] Help Using Spreadsheets
 
 
  Hi
 
  Hello,
 
  I am a new user of R and I am trying to use the data I am reading 
from a
 
 
  spreadsheet.
  I installed the xlsReadWrite package and I am able to read data from
 
  this
 
  files, but how can I assign the colums into values?
  E.g:
  as I read a spreadsheet like this one:
 
 
  Maybe with read.xls? Did you read it into an object?
 
  A B
  1 2
  4 9
 
  I manually assign the values:
  A <- c(1,4)
  B <- c(2,9)
 
 
  Why? If you read in to an object (e.g. mydata)
 
 
 
  to plot it on a graph:
  plot(A,B)
 
 
  plot(mydata$A, mydata$B)
 
 
 
  or make histograms:
  hist(A)
 
 
  hist(mydata$A)
 
 
  But actualy I am using very large colums, does exist any other way to 
do
 
 
  it automatically?
 
 
  Yes. But before that you shall automatically read some introduction
  documentation like R-intro)
 
  Regards
  Petr
 
 
  Best Regards,
 
  Lämarăo
 
 
 
 -- 
 Jim Holtman
 Data Munger Guru
 
 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.



[R] Odp: how to plot 2d matrix as coloured squares?

2012-04-10 Thread Petr PIKAL
Hi

 
 i have a matrix like 
 
 x1  x2  x3
 y1  2   34  5656
 y2  34  434 342
 y3  234 43  34
 
 i want to plot these values like here
 http://www.almob.org/content/2/1/12/figure/F5?highres=y
 
 The rainbow function could calculate a colour for each value.
 But how van i generate the square pattern?

?image, ?filled.contour
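
A minimal sketch with image(), using the matrix from the question. The object names m and z and the choice of 20 rainbow colours are mine. image() draws z[i, j] at x = i, y = j, so the matrix is transposed and its rows reversed to get the printed layout with y1 at the top.

```r
m <- matrix(c(  2,  34, 5656,
               34, 434,  342,
              234,  43,   34),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("y1", "y2", "y3"), c("x1", "x2", "x3")))
# reverse the rows, then transpose, so that row y1 ends up at the top
z <- t(m[nrow(m):1, ])
image(seq_len(ncol(m)), seq_len(nrow(m)), z, col = rainbow(20),
      axes = FALSE, xlab = "", ylab = "")
axis(1, at = seq_len(ncol(m)), labels = colnames(m))
axis(2, at = seq_len(nrow(m)), labels = rev(rownames(m)))
```

Since one value (5656) is much larger than the rest, plotting log(z) instead of z may give a more informative colour spread.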

Petr

 
 Kind regards,
 
 -- 
 Jonas Stein n...@jonasstein.de
 


Re: [R] how to save multiple work space

2012-04-10 Thread Petr PIKAL
Hi

 
 You'll need to save them manually to avoid name conflicts -- 
save.image() 
 is the function to do so but you need to give a file name. 

Or it is necessary to have a separate folder (working directory) for each R
session.
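
A sketch of the manual route. The file name analysis1.RData is hypothetical, and the temporary directory is used only so that the example does not touch real files.

```r
x <- 1:10                                        # some object in this session
ws <- file.path(tempdir(), "analysis1.RData")    # hypothetical file name
save.image(file = ws)        # one file per session, so nothing is overwritten
# later, in a fresh session: load(ws)
```

Each of the three sessions would use its own file name.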

Regards
Petr 

 
 Michael
 
 On Apr 10, 2012, at 7:41 AM, ya xinxi...@163.com wrote:
 
  Hi guys,
  
  I have a question. I am running 3 R sessions simultaneously for 
 different analysis. I found out that when R quit, only objects in one of 

 these sessions was saved in the work space. How can I save objects of 
all 
 3 R sessions?
  
  Thank you very much.
  
  YA
  


Re: [R] Legend based on levels of a variable

2012-04-06 Thread Petr PIKAL
Hi

 
 I have a bivariate plot of axis2 against axis1 (data below). I would 
like
 to use different size, type and color for points in the plot for the 
point
 coming from different region. For some reasons, I cannot get it done. 
Below
 is my code.
 
 col <- rep(c("blue", "red", "darkgreen"), c(16, 16, 16))
 ## Choose different size of points
 cex <- rep(c(1, 1.2, 1), c(16, 16, 16))
 ## Choose the form of the points (square, circle, triangle and
 ## diamond-shaped)
 pch <- rep(c(15, 16, 17), c(16, 16, 16))
 
 plot(axis1, axis2, main = "My plot", xlab = "Axis 1", ylab = "Axis 2",
      col = c(Category, col), pch = pch, cex = cex)
 legend(4, 12.5, c("NorthAmerica", "SouthAmerica", "Asia"), col = col,
        pch = pch, pt.cex = cex, title = "Region")
 
 I also prefer a control on what kind of point I want to use for 
different
 levels of Region. Something like this:
 legend(4,12.5, col(levels(Category), Asia=red, NorthAmerica=blue,
 SouthAmerica=green))

So why do you not use Region and/or Category for automatic point
colouring/size/type?

Without data I can only use a built-in data set.

with(iris, plot(Sepal.Length, Sepal.Width, col = as.numeric(Species)))
legend("topright", legend = levels(iris$Species), pch = 19, col = 1:3)

Regards
Petr

 
 Thanks,
 Kumar
 
 Region        axis1  axis2
 NorthAmerica      5     14
 NorthAmerica      8     13
 NorthAmerica      8     11
 NorthAmerica      6     11
 NorthAmerica      5     13
 SouthAmerica      8     17
 SouthAmerica      7     16
 SouthAmerica      7     13
 SouthAmerica      8     14
 SouthAmerica      6     17
 Asia              7     13
 Asia              6     15
 Asia              7     14
 Asia              5     13
 Asia              4     16
 


[R] Hi: Help Using Spreadsheets

2012-04-06 Thread Petr PIKAL
Hi
 Hello,
 
 I am a new user of R and I am trying to use the data I am reading from a 

 spreadsheet.
 I installed the xlsReadWrite package and I am able to read data from 
this 
 files, but how can I assign the colums into values?
 E.g:
 as I read a spreadsheet like this one:

Maybe with read.xls? Did you read it into an object?

 A B
 1 2
 4 9
 
 I manually assign the values:
 A <- c(1,4)
 B <- c(2,9)

Why? If you read it into an object (e.g. mydata)


 
 to plot it on a graph:
 plot(A,B)

plot(mydata$A, mydata$B)


 
 or make histograms:
 hist(A)

hist(mydata$A)

 
 But actualy I am using very large colums, does exist any other way to do 

 it automatically?

Yes. But before that you should automatically read some introductory
documentation (like R-intro).
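
As a sketch of what "automatically" can look like: the data frame below is a hand-made stand-in for the spreadsheet, and num_cols is a name I made up.

```r
mydata <- data.frame(A = c(1, 4), B = c(2, 9))   # stand-in for the spreadsheet
# the columns are used by name; no manual copying into separate vectors
plot(B ~ A, data = mydata)
# one histogram per numeric column, however many columns there are
num_cols <- names(mydata)[sapply(mydata, is.numeric)]
for (nm in num_cols) hist(mydata[[nm]], main = nm, xlab = nm)
```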

Regards
Petr

 
 Best Regards,
 
 Lämarăo


Re: [R] how to do piecewise linear regression in R?

2012-04-06 Thread Petr PIKAL
Hi

Your post is rather garbled.

 
 [R] how to do piecewise linear regression in R?

Maybe the segmented package?
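
If the threshold is known in advance, a continuous two-piece line can also be fitted with plain lm() and a hinge term, because alpha + beta * x + gamma * max(x - thr, 0) is continuous at thr by construction. The following is only a sketch on simulated data; the names thr, mkt and arb, the threshold value and all numbers are assumptions of mine, not taken from the question.

```r
set.seed(42)
thr <- 0.02                                   # assumed, known threshold
mkt <- rnorm(200, mean = 0.01, sd = 0.04)     # simulated excess market return
# simulated response: slope 0.5 below thr, 0.1 above, continuous at thr
arb <- 0.005 + 0.5 * mkt - 0.4 * pmax(mkt - thr, 0) + rnorm(200, sd = 0.005)

# hinge regression: the third coefficient is the change in slope above thr
fit <- lm(arb ~ mkt + pmax(mkt - thr, 0))
b <- unname(coef(fit))

alpha_low  <- b[1]
beta_low   <- b[2]
beta_high  <- b[2] + b[3]
alpha_high <- b[1] - b[3] * thr   # follows from the continuity restriction
```

The restriction alpha_low + beta_low * thr = alpha_high + beta_high * thr then holds exactly, so no constrained optimiser is needed. If the threshold itself has to be estimated, that is where segmented comes in.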

Regards
Petr

 
 
 Dear all,
 I want to do piecewise CAPM linear regression in R:
 R_RiskArb - R_f = (1 - δ) [α_MktLow + β_MktLow (R_Mkt - R_f)]
                   + δ [α_MktHigh + β_MktHigh (R_Mkt - R_f)]
 
 where δ is a dummy variable equal to one if the excess return on the
 value-weighted CRSP index is above a threshold level and zero otherwise.
 At the same time I want to add the restriction
 
 α_MktLow + β_MktLow · Threshold = α_MktHigh + β_MktHigh · Threshold
 
 to ensure continuity.
 But I do not know how to add this restriction in R, could you help me on
 this?
 Thanks a lot!
 Eunice 


Re: [R] Legend based on levels of a variable

2012-04-06 Thread Petr PIKAL
Thanks, 

anyway, using built-in R features is preferable for colours

with(data, plot(axis1, axis2,
                col = c("red", "blue", "green")[as.numeric(Region)]))
legend("topright", legend = levels(data$Region),
       fill = c("red", "blue", "green"))

although sometimes it can be preferable to take advantage of grid graphics

library(ggplot2)
p <- ggplot(data, aes(x = axis1, y = axis2, colour = Region))
p + geom_point()
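
Since the original question also asked for a different symbol and size per region, here is a base-graphics sketch using named lookup vectors. The data are retyped from the thread; the lookup names cols, pchs and cexs and the particular symbols are assumptions of mine.

```r
data <- data.frame(
  Region = factor(rep(c("NorthAmerica", "SouthAmerica", "Asia"), each = 5)),
  axis1  = c(5, 8, 8, 6, 5, 8, 7, 7, 8, 6, 7, 6, 7, 5, 4),
  axis2  = c(14, 13, 11, 11, 13, 17, 16, 13, 14, 17, 13, 15, 14, 13, 16)
)
# one named entry per level, so the mapping is explicit
cols <- c(Asia = "red", NorthAmerica = "blue", SouthAmerica = "green")
pchs <- c(Asia = 15, NorthAmerica = 16, SouthAmerica = 17)
cexs <- c(Asia = 1, NorthAmerica = 1.2, SouthAmerica = 1)

key <- as.character(data$Region)          # look up by level name
plot(data$axis1, data$axis2, col = cols[key], pch = pchs[key],
     cex = cexs[key], xlab = "Axis 1", ylab = "Axis 2", main = "My plot")
lev <- levels(data$Region)
legend("topright", legend = lev, col = cols[lev], pch = pchs[lev],
       pt.cex = cexs[lev], title = "Region")
```

Indexing by name rather than by as.numeric() keeps plot and legend in agreement even if the order of the levels changes.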

Regards
Petr

 
 He provided data, yet in an inconvenient way at the bottom of his post.
 
 Kumar, please use dput() to provide data to the list, because its much
 easier to import:
 dput(data)## name data is made up by me
 
 structure(list(Region = structure(c(2L, 2L, 2L, 2L, 2L, 3L, 3L, 
 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L), .Label = c("Asia", "NorthAmerica", 
 "SouthAmerica"), class = "factor"), axis1 = c(5L, 8L, 8L, 6L, 
 5L, 8L, 7L, 7L, 8L, 6L, 7L, 6L, 7L, 5L, 4L), axis2 = c(14L, 13L, 
 11L, 11L, 13L, 17L, 16L, 13L, 14L, 17L, 13L, 15L, 14L, 13L, 16L
 )), .Names = c("Region", "axis1", "axis2"), class = "data.frame",
 row.names = c(NA, -15L))
 
 
 


Re: [R] R generated means are different from the boxplot!

2012-04-06 Thread Petr PIKAL
Hi
 
 Hi R-listers, 
 
 1) I am having trouble understanding why the means I have calculated 
from
 Aeventexhumed (A, B, and C) are different from the means showing on the
 boxplot I generated (see attached).  I have added the script as to how 
my
 data is organized. 

Maybe the difference is that a boxplot shows medians, not means.
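
A sketch on made-up, skewed data that shows the two side by side. The data frame d and its columns are invented for illustration; the means are added to the boxplot as red crosses.

```r
set.seed(1)
d <- data.frame(event   = rep(c("A", "B", "C"), each = 30),
                success = rexp(90, rate = 5))      # skewed, made-up values
means   <- tapply(d$success, d$event, mean)
medians <- tapply(d$success, d$event, median)
bp <- boxplot(success ~ event, data = d)
# the thick line in each box is the median (third row of bp$stats)
points(seq_along(means), means, pch = 4, col = "red")   # add the means
```

For skewed data such as proportions near zero the two can differ noticeably.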

 
 2) Also when I went through the data manually the means I calculated for
 each nesting event are slightly different than what I generated through 
R
 (see below). 
 A,  B,  C
 0.2155051, 0.1288241, 0.1124618

Without data it is difficult to guess. Usually, when a manually computed
value differs, the mistake was made in the manual computation. I can hardly
believe that R would be wrong in such a simple and extensively used function.

Regards
Petr

 
 Thanks in advance,
 
 Jean
 
 ---
  require(plyr)
 Loading required package: plyr
  resp <- read.csv(file.choose())
  envir <- read.csv(file.choose())
  resp <- resp[!is.na(resp$Aeventexhumed), ]
  resp$QuadratEvent <- paste(resp$QuadratID, resp$Aeventexhumed, sep="")
  resp$QuadratEvent <- as.character(resp$QuadratEvent)
  envir <- envir[!is.na(envir$Aeventexhumed), ]
  envir$QuadratEvent <- paste(envir$QuadratID, envir$Aeventexhumed, sep="")
  envir$QuadratEvent <- as.character(envir$QuadratEvent)
  ExDate <- Sector <- Quadrat <- Aeventexhumed <- NULL
  ST1 <- ST2 <- ST3 <- ST4 <- ST0 <- NULL
  Shells <- Hatchlings <- MaxHatch <- DeadHatch <- NULL
  Oldeggs <- TotalEggs <- QuadratEvent <- NULL
  for (q in unique(as.character(resp$QuadratEvent))) {
 + s <- resp[as.character(resp$QuadratEvent) == q, ]
 + ExDate <- c(ExDate, as.character(s$ExDate[1]))
 + Sector <- c(Sector, as.character(s$Sector[1]))
 + Quadrat <- c(Quadrat, as.character(s$Quadrat[1]))
 + Aeventexhumed <- as.character(c(Aeventexhumed,
 + as.character(s$Aeventexhumed[1])))
 + QuadratEvent <- c(QuadratEvent, q)
 + ST1 <- c(ST1, sum(s$ST1, na.rm=TRUE))
 + ST2 <- c(ST2, sum(s$ST2, na.rm=TRUE))
 + ST3 <- c(ST3, sum(s$ST3, na.rm=TRUE))
 + ST4 <- c(ST4, sum(s$ST4, na.rm=TRUE))
 + ST0 <- c(ST0, sum(s$ST0, na.rm=TRUE))
 + Shells <- c(Shells, sum(s$Shells, na.rm=TRUE))
 + Hatchlings <- c(Hatchlings, sum(s$Hatchlings, na.rm=TRUE))
 + MaxHatch <- c(MaxHatch, sum(s$MaxHatch, na.rm=TRUE))
 + DeadHatch <- c(DeadHatch, sum(s$DeadHatch, na.rm=TRUE))
 + Oldeggs <- c(Oldeggs, sum(s$Oldeggs, na.rm=TRUE))
 + TotalEggs <- c(TotalEggs, sum(s$TotalEggs, na.rm=TRUE))
 + }
  responses <- data.frame(QuadratEvent, ExDate, Sector, Quadrat,
 + Aeventexhumed, ST0, ST1, ST2, ST3, ST4, Shells,
 + Hatchlings, MaxHatch, DeadHatch, Oldeggs,
 + TotalEggs, stringsAsFactors=FALSE)
  responses$QuadratEvent <- as.character(responses$QuadratEvent)
  data.to.analyze <- join(responses, envir, by="QuadratEvent")
  data.to.analyze$NotHatched <- data.to.analyze$TotalEggs -
  data.to.analyze$Shells
  data.to.analyze$Rayos <- paste("Rayos", data.to.analyze$Rayos, sep=".")
 
  Hsuccess <- Shells/TotalEggs
  tapply(Hsuccess, Aeventexhumed, mean, na.rm=TRUE)
         A         B         C 
 0.2156265 0.1288559 0.1124327 
  boxplot(HSuccess ~ Aeventexhumed, data = data.to.analyze, col = "blue",
 + main = "Hatching Success of Arribadas in 2010",
 + xlab = "Arribada Event",
 + ylab = "Hatching Success % (Shells / Total Eggs)")
 
 http://r.789695.n4.nabble.com/file/n4536926/hatch_Aeventexhumed.png 
 


[R] Odp: identify with mfcol=c(1,2)

2012-04-05 Thread Petr PIKAL
Hi

It seems to me that split.screen or layout is probably preferable if you
want to specify which graph to use for identification. But I am not an expert
in this, and after some testing, identification does not work well with a
split screen. So you are probably out of luck.

Regards
Petr


 
 Please forgive my re-sending this question. I did not see any replies 
from
 my prior post. My apologies if I missed something.
 
 I would like to have a figure with two graphs. This is easily 
accomplished
 using mfcol:
 
 oldpar <- par(mfcol=c(1,2))
 plot(x,y)
 plot(z,x)
 par(oldpar) 
 
 I run into trouble if I try to use identify with the two plots. If, 
after 
 identifying points on my first graph I hit the ESC key, or hitting stop 
 menu bar of my R session, the system stops the identification process, 
but
 fails to give me my second graph. Is there a way to allow for the 
 identification of points when one is plotting to graphs in a single 
graph 
 window? My code follows.
 
 plotter <- function(first,second) {
   # Allow for two plots in one graph window.
   oldpar <- par(mfcol=c(1,2))
 
   # Bland-Altman plot.
   plot((second+first)/2, second-first)
   abline(0,0)
   # Allow for identification of extreme values.
   BAzap <- identify((second+first)/2, second-first,
                     labels = seq_along(data$Line))
   print(BAzap)
 
   # Plot second as a function of first value.
   plot(first, second, main="Limin vs. Limin", xlab="First (cm^2)",
        ylab="Second (cm^3)")
   # Add identity line.
   abline(0,1,lty=2,col="red")
   # Allow for identification of extreme values.
   zap <- identify(first, second, labels = seq_along(data$Line))
   print(zap)
   # Add regression line.
   fit1 <- lm(first~second)
   print(summary(fit1))
   abline(fit1)
   print(summary(fit1)$sigma)
 
   # reset par to default values. 
   par(oldpar)
 
 }
 plotter(first,second)
 
 
 Thanks,
 John
 
 
 
 
 
 
 John David Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)
 
 Confidentiality Statement:
 This email message, including any attachments, is for th...{{dropped:6}}
 


Re: [R] filling small gaps of N/A

2012-04-04 Thread Petr PIKAL
 
 Michael,
 
 First of all, thank you very much for your answer.
 I've read your 2 answers, but I'm not really sure that they corresponds 
to
 my problem of NAs.

You should read the answers more carefully.

x <- rnorm(20)
x[3:4] <- NA
x[12:19] <- NA
x
 [1] -0.30754528  0.07597988          NA          NA -0.50585319 -1.60509616  0.31488672  2.16969731
 [9]  0.67755514 -1.83075111  0.72044482          NA          NA          NA          NA          NA
[17]          NA          NA          NA -0.96576934

library(zoo)

na.approx(x)
 [1] -0.30754528  0.07597988 -0.11796447 -0.31190883 -0.50585319 -1.60509616  0.31488672  2.16969731
 [9]  0.67755514 -1.83075111  0.72044482  0.53308769  0.34573056  0.15837343 -0.02898370 -0.21634083
[17] -0.40369795 -0.59105508 -0.77841221 -0.96576934
na.approx(x, maxgap=3)
 [1] -0.30754528  0.07597988 -0.11796447 -0.31190883 -0.50585319 -1.60509616  0.31488672  2.16969731
 [9]  0.67755514 -1.83075111  0.72044482          NA          NA          NA          NA          NA
[17]          NA          NA          NA -0.96576934

This does exactly what you want, as far as I understand what you described.
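
For anyone who wants to see what the maxgap idea amounts to, here is a base-R sketch that mimics it with approx() and rle(). The function name fill_small_gaps is made up, and it is not zoo's implementation; with zoo loaded, na.approx(x, maxgap = 4) is the simpler call.

```r
# Fill runs of NA that are at most `maxgap` long by linear interpolation;
# longer runs are left as NA. (Hypothetical helper, base R only.)
fill_small_gaps <- function(x, maxgap = 4) {
  pos <- seq_along(x)
  # approx() drops the NA pairs and interpolates at every position
  filled <- approx(pos, x, xout = pos)$y
  # find the NA runs that are longer than maxgap and blank them again
  runs <- rle(is.na(x))
  too_long <- rep(runs$values & runs$lengths > maxgap, runs$lengths)
  filled[too_long] <- NA
  filled
}

# values as in the two examples: a gap of 2 NAs and a gap of 6 NAs
x <- c(1.93, 3.93, NA, NA, 4.93, 5.93, NA, NA, NA, NA, NA, NA, 7.93)
res <- fill_small_gaps(x, maxgap = 4)
res
```

The gap of two is interpolated and the gap of six stays NA.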

Regards
Petr


 I'll try to detail you a bit more.
 
 This problem concerns the second part of my program. In the first part, 
I've
 already created a timeseries object with the library (timeseries). I had 
to
 delete first all the wrong values in my data and replace it with NAs. 
 So my data contains already missing data (NAs), as I have cleaned it 
before.
 
 The thing is that sometimes I have small gaps of missing data (only 2 or 
3
 following) like in example 1 below:
 
 example 1:
 
 09/01/2008 12:00  1.93 
 09/01/2008 12:15  3.93 
 09/01/2008 12:30  NA      (so here you have a small gap with only 2 NAs)
 09/01/2008 12:45  NA 
 09/01/2008 13:00  4.93 
 09/01/2008 13:15  5.93
 
 But sometimes, always in the same file, I have big gaps, such as 10 or 
more
 NAs following each other like in example 2 below:
 
 example 2:
 
 09/01/2008 16:15  2.93
 09/01/2008 16:30  2.93
 09/01/2008 16:45  NA
 09/01/2008 17:00  NA
 09/01/2008 17:15  NA
 09/01/2008 17:30  NA
 09/01/2008 17:45  NA
 09/01/2008 18:00  NA      (so here you have a big gap with more than 10
                           NAs following each other)
 09/01/2008 18:15  NA
 09/01/2008 18:30  NA
 09/01/2008 18:45  NA
 09/01/2008 19:00  NA
 09/01/2008 19:15  NA
 09/01/2008 19:30  NA
 09/01/2008 19:45  NA
 09/01/2008 20:00  NA
 09/01/2008 20:15  7.93
 09/01/2008 20:30  7.93
 
 So in the same file, I can sometimes have small gaps (2 or 3 NAs) and
 sometimes big or very big gaps (10 or 100 NAs following each other).
 
 My aim is to apply the function na.approx(x) from library(zoo) to fill NAs
 ONLY in small gaps.
 
 If I just call na.approx(x), it fills all the NAs in my data (big gaps +
 small gaps). That's exactly what I DON'T WANT.
 
 What I need is to tell R: apply na.approx to fill NAs ONLY if there are at
 most 4 NAs following each other (small gaps, like example 1). If there are
 more than 4 NAs following each other (big gaps, like example 2), keep these
 NAs and DON'T fill the gap.
 
 My question is: how can I tell R to do this? I don't know how to do it.
 I hope I've been understandable this time ^^
 Thanks a lot again for all your answers!
 
 
 
 --
 View this message in context:
 http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528907.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[R] Odp: identify time span in date vector

2012-04-04 Thread Petr PIKAL
Hi

Can you please be more specific? Based on this input, what do you want as a
result?

> set.seed(111)
> dates = as.Date(sort(rnorm(10,3000,100)), origin = "2000-1-1")
> dates
 [1] "2007-08-01" "2007-10-21" "2007-12-08" "2007-12-15" "2008-01-29" "2008-02-14" "2008-02-16" "2008-03-01"
 [9] "2008-04-02" "2008-04-11"


Regards
Petr

 
 Hello everyone,
 
 I am trying to identify the first element of a date vector for which the
 following condition holds: at least 3 more dates fall within the next 365
 days, and at least one of these is 3 to 12 months later.
 
 dates = as.Date(sort(rnorm(10,3000,100)), origin = "2000-1-1")
 
 Does anyone have an idea how to do this economically? I'll need to apply
 this to a large dataset with date vectors of various lengths, and I can
 only think of quite complicated algorithms :(
 
 Any ideas would be appreciated,
 Felix
 
 


Re: [R] identify time span in date vector

2012-04-04 Thread Petr PIKAL
Hi

 
 Dear Petr,
 
 thanks for taking the time.
 
 For this input, the first element should be selected, since there are more
 than 3 further dates within one year (basically, all other dates are within
 one year) and at least one of them is more than 3 months later.
 
 In the meantime, I came up with some code that (probably) does what I want:
 
 identify_first_date = function(dates)
 {
   within_one_year = as.matrix(dist(dates)) < 366   ### next dates in same year?
   within_one_year[upper.tri(within_one_year, diag=TRUE)]=FALSE
 
   within_one_month = as.matrix(dist(dates)) < 91   ### next dates within 90 days?
   within_one_month[upper.tri(within_one_month, diag=TRUE)]=FALSE
 
   dates[
     which(
       ### more dates in one year than within 90 days
       apply(within_one_year,2,sum) > apply(within_one_month,2,sum) &
       ### at least 3 more dates in one year
       apply(within_one_year,2,sum) >= 3
     )[1]]
 }
 
 I guess the code could be improved, though; it takes some time to run.

Your first condition can be fulfilled by

c(as.numeric(diff(dates))<365, F) & c(as.numeric(diff(dates))>91, F)

so if you put in your function

identify_first_date2 = function(dates)
{
within_one_year = as.matrix(dist(dates)) < 366
within_one_year[upper.tri(within_one_year, diag=TRUE)]=FALSE

distance <- as.numeric(diff(dates))

dates[ which( c(distance<365, F) & c(distance>91, F) &
apply(within_one_year,2,sum) >=3)[1]]
}

you should get some improvement; however, I am still struggling to evaluate
how many consecutive dates are within one year.
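
Counting the dates that follow each element within a window does not need the full distance matrix. The following is only a sketch, assuming that "3 to 12 months later" means 91 to 365 days later; the function name first_qualifying_date and its arguments are hypothetical. It uses findInterval() on the sorted dates (the left.open argument needs R 3.3.0 or later).

```r
# Hypothetical helper; the 91 and 365 day thresholds are an assumed reading
first_qualifying_date <- function(dates, min_later = 3, near = 91, far = 365) {
  sorted <- sort(dates)
  d <- as.numeric(sorted)
  i <- seq_along(d)
  # Position of the last date at most `far` days after each date
  last_in_far <- findInterval(d + far, d)
  # Number of dates strictly less than `near` days after each date
  # (counted from the start of the vector)
  last_in_near <- findInterval(d + near, d, left.open = TRUE)
  n_far <- last_in_far - i              # later dates within `far` days
  n_late <- last_in_far - last_in_near  # of those, at least `near` days later
  hit <- which(n_far >= min_later & n_late >= 1)[1]
  sorted[hit]                           # NA if no element qualifies
}

base <- as.Date("2008-01-01")
first_qualifying_date(base + c(0, 10, 20, 100, 500))
```

Because findInterval() does a binary search, this should scale roughly as n log n rather than with the square of the vector length, which matters for long date vectors.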




 
 Best,
 Felix
 
 
 -----Original Message-----
 From: Petr PIKAL [mailto:petr.pi...@precheza.cz]
 Sent: Wednesday, 4 April 2012 09:47
 To: Fischer, Felix
 Cc: r-help@r-project.org
 Subject: Odp: [R] identify time span in date vector
 
 



Re: [R] indexing in a function doesn't work?

2012-04-02 Thread Petr PIKAL
Hi

There is some mismatch in curly braces in your plotter function but when I 
try to use it I get

 
> plotter(10,3,fram=rwb,framvec=rwb$prcnt.char.depth,obj=prcnt.char.depth,form1=
+ post.f.crwn.length~shigo.av,form2=post.f.crwn.length~shigo.av-1,
+ form3=leaf.area~(1/exp(shigo.av*x))*n,type=2,xlm=70,ylm=35)
Error in sub.plotter(i, fram, framvec, obj, form1, form2, form3, type,  : 
  object 'rwb' not found

You did not mention where we can get rwb, so it is quite difficult to
offer any help.

Your function nesting also seems a little odd to me, as a result computed
inside a nested function is not known to the enclosing function.


> test <- function (a,b) {
+ g <- a/b
+ sub.test <- function (a, b, d=1) (e <- a*b*d)
+ c(g, e)
+ }
> 
> test(2,4)
Error in test(2, 4) : object 'e' not found

But when you define sub.test outside of test

> sub.test <- function (a, b, d=1) {
+ e <- a*b*d
+ e
+ }
> 
> test <- function (a,b) {
+ g <- a/b
+ e <- sub.test(a,b)
+ c(g, e)
+ }
> test(2,4)
[1] 0.5 8.0

everything works. It also seems easier to me to maintain smaller chunks of
working code and put them together like a puzzle. It's a habit I was taught
in ancient times, when GOSUB was a standard part of the Basic programming
paradigm and much preferred over GOTO.
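
Nesting by itself is not what breaks the first version: sub.test is defined there but never called, and anything it assigns stays local to it. A minimal sketch, assuming you want to keep the helper nested (the name sub_test is only illustrative), captures the helper's return value explicitly:

```r
test <- function(a, b) {
  # Nested helper: returns its result instead of relying on a side effect
  sub_test <- function(a, b, d = 1) {
    e <- a * b * d   # this e is local to sub_test
    e
  }
  g <- a / b
  e <- sub_test(a, b)  # capture the return value in the enclosing function
  c(g, e)
}

result <- test(2, 4)
print(result)
```

Whether the helper lives inside or outside the main function is then mostly a question of how you prefer to test and reuse it.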

Best regards
Petr

 



 

 
 Josh,
 
 Many thanks - here's a subset of the data and a couple examples:
 
 
plotter(10,3,fram=rwb,framvec=rwb$prcnt.char.depth,obj=prcnt.char.depth,form1=
 post.f.crwn.length~shigo.av,form2=post.f.crwn.length~shigo.av-1,
 form3=leaf.area~(1/exp(shigo.av*x))*n,type=2,xlm=70,ylm=35)
 
 plotter(10,3,fram=rwb, framvec=rwb$prcnt.char.depth, 
obj=prcnt.char.depth,
 form1= post.f.crwn.length~leaf.area, 
form2=post.f.crwn.length~leaf.area-1,
 form3=leaf.area~(1/exp(shigo.av*x))*n,type=1, xlm=1500, ylm=35,
 sx=.01,sn=25)
 
 
 
 
 plotter<-function(a,b,fram,framvec,obj,form1,form2,form3, type=1, xlm, ylm,
 sx=.01,sn=25){
 g<-ceiling(a/b)
 par(mfrow=c(b,g))
 num<-rep(0,a)
 sub.plotter<-function(i,fram,framvec,obj,form1,form2,form3,type,
 xlm,ylm,var1,var2){
 temp.i<-fram[framvec <=(i*.10),] #trees in the list that have an attribute
 #less than or equal to a progressively larger percentage
  plot(form1, data=temp.i, xlim=c(0,xlm), ylim=c(0,ylm), main=((i-1)*.10))
 if(type==1){
  mod<-lm(form2,data=temp.i)
 r2<-summary(mod)$adj.r.squared
 num[i]<-r2
  legend("bottomright", legend=signif(r2), col="black")
 abline(mod)
  num}
 else{
 if(type==2){
  try(mod<-nls(form3, data=temp.i, start=list(x=sx,n=sn),
 na.action=na.omit), silent=TRUE)
 try(x1<-summary(mod)$coefficients[1,1], silent=TRUE)
  try(n1<-summary(mod)$coefficients[2,1], silent=TRUE)
 try(lines((1/exp(c(0:70)*x1)*n1)), silent=TRUE)
  try(num[i]<-AIC(mod), silent=TRUE)
 try(legend("bottomright", legend=round(num[i],3) , col="black"),
 silent=TRUE)
  try((num), silent=TRUE)
   }
 }}
 for(i in 0:a+1){
  num<-sub.plotter(i,fram,framvec,obj,form1,form2,form3,type,xlm,ylm)
 }
 plot.cor<-function(x){
 temp<-a+1
 lengthx<-c(1:temp)
 plot(x~c(1:temp))
 m2<-lm(x~c(1:temp))
 abline(m2)
 n<-summary(m2)$adj.r.squared
 legend("bottomright", legend=signif(n), col="black")
 slope<-(coef(m2)[2])# slope
 values<-(num)#values for aic or adj r2
 r2ofr2<-(n) #r2 of r2 or AIC
 output<-data.frame(lengthx,slope,values,r2ofr2)
 }
 plot.cor(num)
 write.csv(plot.cor(num)$output,"output.csv") # can't seem to use
 # paste(substitute(form3),".csv",sep="") to name it at the moment
 par(mfrow=c(1,1))
 }
 
 
 
 
 On Sun, Apr 1, 2012 at 3:25 PM, Joshua Wiley <jwiley.ps...@gmail.com> wrote:
 
  Hi,
 
  Glancing through your code it was not immediately obvious to me why it
  does not work, but I can see a lot of things that could be simplified.
   It would really help if you could give us a reproducible example.
  Find/upload/create (in R) some data, and examples of how you would use
  the function.  Right now, I can only guess what your data etc. are
  like, based on your description plus what the code you wrote seems
  to expect to be given.  I could try to give code suggestions, but I
  have no easy way of testing them so it would be very easy to make
  typos, etc.  Then you just get back my edits to your code that still
  do not work and maybe it is because of something fundamentally wrong
  with what I have done, a simple typo, or something else still wrong in
  your code that I did not fix.
 
  Anyway, if you send some data and an example using your function
  (i.e., using the data you send, write out form1, form2, type, etc.), I
  will take a look at your function and see if I can make it run.
 
  Cheers,
 
  Josh
 
  On Sun, Apr 1, 2012 at 3:13 PM, Benjamin Caldwell
  <btcaldw...@berkeley.edu> wrote:
   Hello,
  
   I've written a small function that's supposed to save me some time, and
   it's ending up killing it. The intention is to iteratively subset a
   dataset fram on framvec, fit a model (either lm or nls depending on
   type) and return the r2 or AIC from the model, respectively. Although as
   far as I can tell in my code the plots are

Re: [R] Calling Dynamic Variables names

2012-03-29 Thread Petr PIKAL
Hi

First of all, you should change your dataset structure to a list. It will
help you in the future.


 
 Hi,
 
 I need help with calling dynamic variable names.
 
 I have 2 datasets. The first one (mdata) holds the metadata and the second
 one (dataset) holds the data.
 
 Sample:
 *mdata *
 Variable (header)
 
 attribute1
 attribute2
 attribute3
 attribute4
 
 *dataset *
 
 attribute1 attribute2 attribute3 attribute4
 
 1 334   12
 3 45  1  09
 34   22  12   12
 56   6 16  77
 ..
 so i have written a forloop 
 
 
 code :  i want to find the mean for each attribites
 
 for( i in mdata$vairable)
 {
   print (mean(dataset$i)# wants to call i value

In a loop you should use a different kind of subsetting:

print (mean(dataset[,i]))
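
The reason is that dataset$i looks for a column literally named "i", whereas dataset[, i] and dataset[[i]] use the value stored in i. A small sketch with made-up data (the values and the two-column layout are hypothetical stand-ins for mdata and dataset):

```r
# Hypothetical stand-ins for the poster's mdata and dataset
mdata <- data.frame(Variable = c("attribute1", "attribute2"),
                    stringsAsFactors = FALSE)
dataset <- data.frame(attribute1 = c(1, 3, 34, 56),
                      attribute2 = c(3, 45, 22, 6))

# Loop over the column names held in mdata and index by value
for (i in mdata$Variable) {
  print(mean(dataset[[i]]))
}

# The same means without an explicit loop
means <- sapply(dataset[mdata$Variable], mean)
print(means)
```

If all selected columns are numeric, colMeans(dataset[mdata$Variable]) would be another option.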

Regards
Petr


 }
 
 
 Please help in this regard.
 
 Thanks,
 Santosh
 
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Calling-Dynamic-Variables-names-tp4514820p4514820.html
 Sent from the R help mailing list archive at Nabble.com.
 

