date:20090531

[R] how to sort data frame order by column?

2009-05-31 Thread Угодай n/a

I have a data frame, for example

> dat <- data.frame(a=rnorm(5),b=rnorm(5),c=rnorm(5))
           a            b          c
1 -0.1731141  0.002453991  0.1180976
2  1.2142024 -0.413897606  0.7617472
3 -0.9428484 -0.609312786  0.5132441
4  0.1343336  0.178208961  0.7509650
5 -0.1402286 -0.333476839 -0.4959459

How can I make dat2 from dat, with the rows of the source data frame ordered by any one column?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] OT ; Interview with David Smith, REvolution Computing

2009-05-31 Thread Ajay ohri

Dear R community,

Here is an interview with David Smith, Director of Community at REvolution
Computing. David talks of the exciting work being done at REvolution to help
make R reach out to even more users.

http://www.decisionstats.com/2009/05/29/interview-david-smith-revolution-computing/

Best,

Ajay Ohri


Re: [R] how to sort data frame order by column?

2009-05-31 Thread Linlin Yan

e.g.
dat[ order(dat$a), ]
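
A slightly fuller sketch of the same idea, in case the column to sort by is only known at run time. The names sort_col, dat_desc and dat_multi are illustrative and not from the original post:

```r
set.seed(1)
dat <- data.frame(a = rnorm(5), b = rnorm(5), c = rnorm(5))

# order() returns the row permutation that sorts its argument
sort_col <- "b"                       # any column name of dat
dat2 <- dat[order(dat[[sort_col]]), ]

# descending order on one column
dat_desc <- dat[order(dat$a, decreasing = TRUE), ]

# several keys: sort by a, breaking ties by b
dat_multi <- dat[order(dat$a, dat$b), ]
```

The trailing comma inside dat[ ... , ] matters: it selects rows and keeps all columns.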

On Sun, May 31, 2009 at 2:34 PM, Угодай n/a <ugo...@gmail.com> wrote:
> I have a data frame, for example
>
> dat <- data.frame(a=rnorm(5),b=rnorm(5),c=rnorm(5))
>            a            b          c
> 1 -0.1731141  0.002453991  0.1180976
> 2  1.2142024 -0.413897606  0.7617472
> 3 -0.9428484 -0.609312786  0.5132441
> 4  0.1343336  0.178208961  0.7509650
> 5 -0.1402286 -0.333476839 -0.4959459
>
> How to make dat2 from dat, where the source data frame is ordered by any column?


Re: [R] logical vector as a matrix

2009-05-31 Thread Grześ


Thanks a lot! :)

Grześ wrote:
>
> I have a vector like this:
> h <- c(4, 6, NA, 12)
> and I create the second logical vector like this:
> g <- c(TRUE, TRUE, FALSE, TRUE)
>
> And my problem is that I would like to get a new vector m as the result
> of h and g (like a dot-matrix printer), but with the NA value dropped,
> for example:
>
> m = (4,6,12)
> Do you have any idea?
>

-- 
View this message in context: 
http://www.nabble.com/logical-vector-as-a--matrix-tp23785253p23796738.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] logical vector as a matrix

2009-05-31 Thread Grześ


Thanks a lot! :)

Linlin Yan wrote:
>
> On Sat, May 30, 2009 at 2:48 AM, Grześ <gregori...@gmail.com> wrote:
>>
>> I have a vector like this:
>> h <- c(4, 6, NA, 12)
>> and I create the second logical vector like this:
>> g <- c(TRUE, TRUE, FALSE, TRUE)
> Why don't you create vector g like this:
> g <- ! is.na(h)
>
>>
>> And my problem is that I would like to get a new vector m from h and g,
>> but with the NA value dropped, for example:
>>
>> m = (4,6,12)
>> Do you have any idea?
> As what you tried to do:
> m <- h[g] # which gives (4,6,12)
> you can directly use:
> m <- h[ ! is.na(h) ]
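
A short check that the two forms above agree, with na.omit() shown as a further base R option. The names m1, m2 and m3 are only for illustration:

```r
h <- c(4, 6, NA, 12)
g <- !is.na(h)               # logical mask: TRUE where h is not missing

m1 <- h[g]                   # index with the stored mask
m2 <- h[!is.na(h)]           # same thing without the intermediate vector
m3 <- as.vector(na.omit(h))  # na.omit() adds attributes; as.vector() drops them
```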
 

-- 
View this message in context: 
http://www.nabble.com/logical-vector-as-a--matrix-tp23785253p23796761.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] logical vector as a matrix

2009-05-31 Thread Grześ


Thanks Jorge Ivan Velez!
-- 
View this message in context: 
http://www.nabble.com/logical-vector-as-a--matrix-tp23785253p23796796.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] IP-Address

2009-05-31 Thread Wacek Kusnierczyk

edwin Sendjaja wrote:
> Hi VQ,
>
> Thank you. It works like charm. But I think Peter's code is faster. What is
> the difference?
>

i think peter's code is more r-elegant, though less generic.  here's a
quick test, with not so surprising results.  gsubfn is implemented in r,
not c, and it is painfully slow in this test. i also added gabor's
suggestion.

library(gsubfn)
library(gtools)
library(rbenchmark)

n = 1000
df = data.frame(
   a = rnorm(n),
   b = rnorm(n),
   c = rnorm(n),
   ip = replicate(n, paste(sample(255, 4), collapse='.'), simplify=TRUE))
benchmark(columns=c('test', 'elapsed'), replications=10, order=NULL,
   peda={
      connection = textConnection(as.character(df$ip))
      o = do.call(order, read.table(connection, sep='.'))
      close(connection)
      df[o, ] },
   waku=df[order(gsubfn(perl=TRUE,
      '[0-9]+',
      ~ sprintf('%03d', as.integer(x)),
      as.character(df$ip))), ],
   gagr=df[mixedorder(df$ip), ] )
 
# peda 0.070
# waku 7.070
# gagr 4.710


vQ


Re: [R] mapply

2009-05-31 Thread Dimitris Rizopoulos

well, you do not have to compute the pairwise difference each time; for 
instance, you could use something like this (untested):


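# pairwise differences, computed once
out1 <- outer(a, c, "-")   # out1[i, k] = a[i] - c[k]
out2 <- outer(b, c, "-")   # out2[j, k] = b[j] - c[k]

u <- v <- 1:5
mat <- matrix(0, length(u), length(u))
for (i in seq_along(u)) {
    for (j in seq_along(v)) {
        # columns of c with at least one matching element of a (resp. b)
        res1 <- colSums(abs(out1 - u[i]) == 0) > 0
        res2 <- colSums(abs(out2 - v[j]) == 0) > 0
        mat[i, j] <- sum(res1 & res2)
    }
}
mat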
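
A further sketch along the same lines, offered with the same caveat of limited testing. It counts matches with multiplicity, as the mapply code quoted below does, rather than only flagging columns that have a match, and like both posted versions it does not enforce the "distinct" condition. The function name count_triples and the demo vectors are made up; integer-valued data are assumed so that exact comparison is safe.

```r
count_triples <- function(a, b, c, n) {
  da <- outer(a, c, "-")   # da[i, k] = a[i] - c[k]
  db <- outer(b, c, "-")   # db[j, k] = b[j] - c[k]
  # A[k, u] = number of i with a[i] - c[k] == u; B[k, v] likewise for b
  A <- matrix(vapply(seq_len(n), function(u) colSums(da == u),
                     numeric(length(c))), nrow = length(c))
  B <- matrix(vapply(seq_len(n), function(v) colSums(db == v),
                     numeric(length(c))), nrow = length(c))
  # entry [u, v] sums, over k, the number of (i, j) pairs matching at c[k]
  crossprod(A, B)
}

set.seed(42)
a_demo <- sample(1:10, 6, replace = TRUE)
b_demo <- sample(1:10, 7, replace = TRUE)
c_demo <- sample(1:10, 5, replace = TRUE)
res <- count_triples(a_demo, b_demo, c_demo, n = 3)
```

The two outer() calls are done once, and the n x n table then comes from a single matrix product instead of n^2 calls to outer().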


I hope it helps.

Best,
Dimitris


KARAVASILIS GEORGE wrote:

> Hello, R users.
> I would like to count the number of triples (r_i, s_j, t_k) with r_i,
> s_j, t_k distinct and abs((r_i-t_k)-u)=0 and abs((s_j-t_k)-v)=0, where
> r_i, s_j, t_k are the elements of three vectors a,b,c with different
> lengths and u,v=1:n. I have solved this problem by writing a subroutine
> in Fortran and calling .Fortran(). I would like to find another way,
> using functions like mapply. I tried this one:
>
> xx <- mapply( function(u,v)   {
>     kk <- outer(a, c, function(s,t) abs((s-t)-u)==0)
>     ll <- outer(b, c, function(s,t) abs((s-t)-v)==0)
>     sum( outer( which(kk==TRUE, TRUE)[,2], which(ll==TRUE, TRUE)[,2],
>                 function(s,t) s==t)  )  },
>   rep(1:n, each=n), rep(1:n, times=n))
>
> xx <- matrix(xx, nrow=n, ncol=n, byrow=TRUE)
>
> It works but it is rather slow. Taking into account that my vectors have
> lengths 3000, and n is from 50 to 200, can I do something to improve the
> running time of the above code?





--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014


Re: [R] Spatiotemporal correlation function

2009-05-31 Thread Bill.Venables

This is a new one on me.

Do you mean the Kronecker delta, or the Kronecker product?

The Kronecker product is more likely.  In which case try ?kronecker.  It is 
available as an operator, %x%, as well.
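
A minimal sketch of the Kronecker product reading, assuming a separable model in which the spatiotemporal correlation matrix is the Kronecker product of a spatial and a temporal correlation matrix. The matrices S and Tm below are made-up examples (2 sites, 3 time points), not the poster's data:

```r
# hypothetical spatial correlation between 2 sites
S <- matrix(c(1.0, 0.5,
              0.5, 1.0), nrow = 2)

# hypothetical temporal correlation between 3 time points
Tm <- matrix(c(1.00, 0.30, 0.09,
               0.30, 1.00, 0.30,
               0.09, 0.30, 1.00), nrow = 3)

K1 <- kronecker(S, Tm)   # function form
K2 <- S %x% Tm           # operator form, same result
```

With this ordering, each 3 x 3 block of the 6 x 6 result is Tm scaled by one entry of S. Which factor goes first depends on how the observations are ordered (site within time, or time within site).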

Bill Venables.

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
FMH [kagba2...@yahoo.com]
Sent: 30 May 2009 18:24
To: r-help@r-project.org
Subject: [R] Spatiotemporal correlation function

Hi,

I'm trying to compute the spatiotemporal correlation matrix by using Delta 
Kronecker products of the spatial and temporal correlation matrices in R, but 
I didn't find any delta Kronecker operator in R. Matrix operators such as 
multiplication, addition, eigenvalues/eigenvectors, etc. are easy to find and 
use.

Could someone help me, please?

Cheers.

Firdaus





Re: [R] strange behavior when reading csv - line wraps

2009-05-31 Thread Martin Tomko


Dear Jim,
with the help of Ted, we diagnosed that the cause is the extreme 
variability in line length in the input. As the number of table columns 
is apparently determined from the first five lines, anything that exceeds 
this length is automatically put on the next line.
I am now trying to find a way to read in the data despite this. I have 
no control over the table extent; the only thing that would make sense 
for my data would be to read in a fixed number of columns and merge all 
remaining columns into one long string in the last one. No idea how to 
do this, though.
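
One possible way to do this, sketched with readLines() and strsplit() instead of read.csv(). The function name read_fixed_plus_rest, the column name "tags" and the two shortened sample lines are made up for illustration; it assumes no quoting or embedded separators need special handling.

```r
read_fixed_plus_rest <- function(lines, n_fixed = 7, sep = ";") {
  parts <- strsplit(lines, sep, fixed = TRUE)
  rows <- lapply(parts, function(p) {
    fixed <- p[seq_len(n_fixed)]   # padded with NA if the line is short
    rest <- if (length(p) > n_fixed) {
      paste(p[-seq_len(n_fixed)], collapse = sep)   # merge the remainder
    } else {
      ""
    }
    c(fixed, rest)
  })
  out <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
  names(out) <- c(paste0("V", seq_len(n_fixed)), "tags")
  out
}

# shortened, made-up lines in the same layout as the real input
lines <- c("37;2175168475;13;8.5;47.1;id1;30;sculpture;bird;",
           "39;167757703;12;10.3;47.7;id2;36;white;building;tower;")
dataPoints <- read_fixed_plus_rest(lines)
```

For a real file, lines would come from readLines(inputFile). All columns are returned as character, so the numeric ones still need converting, for example with as.numeric().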


Thanks
Martin


jim holtman wrote:
It is still not clear to me exactly how you want to read the lines 
in.  If the lines have a variable number of fields, and some of the 
lines might be wrapped, is there some way to determine where the start 
of each line is.
 
If you are reading them in with read.csv, then the system is assuming 
that each line starts a new row.  If this is not the case, then you 
will have to state the rules that determine where the lines start.  
You can always read the data in with 'scan' to separate each line and 
then do whatever processing is required to put together the rows in a 
data frame that you want.
 
In one of your examples, you indicated that the line was split 
starting at the word kempten; if this is in the middle of the line, 
then you would have to create the break after reading the line in with 
'scan' and then creating the rows in the dataframe.  All of this can 
be done in R if you can state what the criteria is.
On Sat, May 30, 2009 at 4:32 AM, Martin Tomko <martin.to...@geo.uzh.ch> wrote:


Jim,
the two lines I put in are the actual problematic input lines.
In these examples, there are no quotes nor # signs, although I
have no means to make sure they do not occur in the inputs (any
hints how I could deal with that?).
I am trying to avoid as much pre-processing outside R as possible,
and I have to process about 500 files with up to 3000 records
each, so I need a more or less automated/batch solution. - so any
string substitution will have to occur in R. But for the moment, I
do not see a reason for substitution, and the wrapping still occurs.

Cheers
Martin



jim holtman wrote:

You need to supply the actual input line so we can see what is
happening.  Are you sure you do not have unbalanced quotes in
your input (try quote='') or do you have comment characters
(#) in your input?

On Fri, May 29, 2009 at 3:15 PM, Martin Tomko
<martin.to...@geo.uzh.ch> wrote:

   Dear All,
   I am observing a strange behavior, and searching the archives and
   help pages didn't help much.
   I have a csv with a variable number of fields in each line.

   I use
   dataPoints <- read.csv(inputFile, head=FALSE, sep=";", fill=TRUE);

   to read it in, and it works. But some lines are long and 'wrap',
   or split and continue on the next line. So when I check the dim of
   the frame, it is not correct, and I can see when I do a printout
   that the line is split into two in the frame. I checked the input
   file and all is good.

   An example of the input is:
 
 37;2175168475;13;8.522729;47.19537;16366...@n00;30;sculpture;bird;tourism;animal;statue;canon;eos;rebel;schweiz;switzerland;eagle;swiss;adler;skulptur;zug;1750;28;tamron;f28;canton;tourismus;vogel;baar;kanton;xti;tamron1750;1750mm;tamron1750mm;400d;rabbitriotnet;

   where the last value occurs on the next line in the data frame.

   It does not have to be the last value; in the following example,
   the word kempten starts the next line:
 
 39;167757703;12;10.309295;47.724545;21903...@n00;36;white;building;tower;clock;clouds;germany;bayern;deutschland;bavaria;europa;europe;eagle;adler;eu;wolke;dome;townhall;rathaus;turm;weiss;allemagne;europeanunion;bundesrepublik;gebaeude;glocke;brd;allgau;kuppel;europ;kempten;niemcy;europo;federalrepublic;europaischeunion;europaeischeunion;germanio;

   What could be the reason?

   I was thinking about solving the issue by using a different
   separator for the first 7 fields and concatenating all of the
   remaining values into a single string value, but could not figure
   out how to do such a substitution in R. Unfortunately, on my system
   I cannot specify a range for sed...

   Thanks for any help/pointers
   Martin


[R] Bug in truncgof package?

2009-05-31 Thread Carlos J. Gil Bellosta

Dear R-helpers,

I was testing the truncgof CRAN package, found something that looked
like a bug, and did my job: I contacted the maintainer. But he did not
reply, so I am resending my query here.

I installed the truncgof package and ran the example for the function
ad.test. I got the following output:

set.seed(123)
treshold <- 10
xc <- rlnorm(100, 2, 2)       # complete sample
xt <- xc[xc >= treshold]      # left truncated sample
ad.test(xt, "plnorm", list(meanlog = 2, sdlog = 2), H = 10)


Supremum Class Anderson-Darling Test

data:  xt 
AD = 3.124, p-value = 0.12
alternative hypothesis: two.sided 

treshold = 10, simulations: 100


So I cannot reject the hypothesis (at a standard significance level) that
the original sample comes from a lognormal distribution (as is the
case).

But let us try to iterate on this example:

set.seed( 123 )
treshold <- 10

foo <- function(){
  xc <- rlnorm(100, 2, 2)     # complete sample
  xt <- xc[xc >= treshold]    # left truncated sample
  ks.test(xt, "plnorm", list(meanlog = 2, sdlog = 2), H = 10)$p.value
}

results <- replicate( 100, foo() )


Then:

> table( results )
results
   0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09  0.1 0.11 0.16 0.18 0.19  0.2
  25    7    9    3    1    2    3    4    1    1    2    2    1    1    3    2
0.21 0.22 0.26 0.27 0.28  0.3 0.31 0.32 0.33 0.36 0.38  0.4 0.44 0.49 0.54 0.55
   2    2    1    3    1    2    1    1    1    2    1    2    1    1    2    1
0.56 0.57 0.62  0.7 0.76 0.78 0.96 0.98
   1    2    1    1    1    1    1    1


That is, in 45% of the cases (p-value below 0.05) you would reject the H_0
hypothesis, which happens to be true, at the standard 5% significance level.
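
For comparison, a sketch of the same kind of calibration check using stats::ks.test from base R on complete (untruncated) samples. It says nothing about truncgof itself; it only illustrates what a well-calibrated test should look like under a true null hypothesis, where the rejection rate should be close to the nominal 5%. The names p_values and rejection_rate are illustrative.

```r
set.seed(123)

# p-values of a test whose null hypothesis is true by construction
p_values <- replicate(200, {
  x <- rlnorm(100, meanlog = 2, sdlog = 2)
  stats::ks.test(x, "plnorm", meanlog = 2, sdlog = 2)$p.value
})

# share of replications rejected at the 5% level
rejection_rate <- mean(p_values < 0.05)
```

A rejection rate far above 5% in such a check, as in the table above, points to a miscalibrated test or to a mismatch between how the data were generated and what the test assumes.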

Do you think this behaviour is buggy? If so, given that the maintainer
does not seem to be contactable, what would be the next step to take?

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


[R] Bug in gmodels CrossTable()?

2009-05-31 Thread Jakson Alves de Aquino

Is the code below showing a bug in CrossTable()? My expectation was that
the values produced by xtabs were rounded instead of truncated:

library(gmodels)
abc <- c("a", "a", "b", "b", "c", "c")
def <- c("d", "e", "f", "f", "d", "e")
wgt <- c(0.8, 0.6, 0.4, 0.5, 1.4, 1.3)

xtabs(wgt ~ abc + def)

CrossTable(xtabs(wgt ~ abc + def), prop.r = F, prop.c = F,
  prop.t = F, prop.chisq = F)

-- 
Jakson Aquino
Professor of Political Science
Federal University of Ceara, Brazil


[R] grid.edit() for ggplot2

2009-05-31 Thread baptiste auguie


Dear all,


I'm trying to access and modify grobs in a ggplot2 plot. The basic  
idea for raw Grid objects I understand from Paul Murrell's R graphics  
book, or this page of examples,


http://www.stat.auckland.ac.nz/~paul/grid/copygrob/copygrobs.R

However I can't figure out how to apply this to a ggplot (basically I  
don't know how to write a syntactically correct gPath),



p <- # minimal example
qplot(0,0) + annotate("text", 0, 0, label = "test")

g <- # store the plot as a grob
ggplotGrob(p)

# structure of the grob
grid.ls(g) # rather large!

# find a particular grob in the gTree
getGrob(g, "texts", grep = TRUE)


# next step, modify, say, the colour of these grobs
grid.edit() # what do I put in here?


Thanks for any piece of advice,

baptiste


[R] convert the contents of a date.frame to a matrix

2009-05-31 Thread Hongyuan Cao

Dear R user,

I am trying to convert the contents of a data frame to a matrix. Since there
are negative values in the data frame, when I use data.matrix(x,
rownames.force = NA), the resulting matrix is not the same as the original
one. Basically I think R treats the numbers in the data frame as characters
and converts them to corresponding numerics.

Any idea on this issue?

Many Thanks,

Hongyuan

> x[1:3,]
   ALL ALL.1 ALL.2 ALL.3 ALL.4 ALL.5 ALL.6 ALL.7 ALL.8 ALL.9 ALL.10 ALL.11
2 -214  -139   -76  -135  -106  -138   -72  -413     5   -88   -165    -67
3 -153   -73   -49  -114  -125   -85  -144  -260  -127  -105   -155    -93
4  -58    -1  -307   265   -76   215   238     7   106    42    -71     84
  ALL.12 ALL.13 ALL.14 ALL.15 ALL.16 ALL.17 ALL.18 ALL.19 ALL.20 ALL.21 ALL.22
2    -92   -113   -107   -117   -476    -81    -44     17   -144   -247    -74
3   -119   -147    -72   -219   -213   -150    -51   -229   -199    -90   -321
4    -31   -118   -126    -50    -18   -119    100     79   -157   -168    -11
  ALL.23 ALL.24 ALL.25 ALL.26  AML AML.1 AML.2 AML.3 AML.4 AML.5 AML.6 AML.7
2   -120    -81   -112   -273  -20     7  -213   -25   -72    -4    15  -318
3   -263   -150   -233   -327 -207  -100  -252   -20  -139  -116  -114  -192
4   -114    -85    -78    -76  -50   -57   136   124    -1  -125     2   -95
  AML.8 AML.9 AML.10 ALL.27 ALL.28 ALL.29 ALL.30 ALL.31 ALL.32 ALL.33 ALL.34
2   -32  -124   -135   -342    -87     22   -243   -130   -256    -62     86
3   -49   -79   -186   -200   -248   -153   -218   -177   -249    -23    -36
4    49   -37    -70     41    262     17   -163    -28   -410     -7   -141
  ALL.35 ALL.36 ALL.37 ALL.38 ALL.39 ALL.40 ALL.41 ALL.42 ALL.43 ALL.44 ALL.45
2   -146   -187    -56    -55    -59   -131   -154    -79    -76    -34    -95
3    -74   -187    -43    -44   -114   -126   -136   -118    -98   -144   -118
4    170    312     43     12     23    -50     49    -30   -153    -17     59
  ALL.46 AML.11 AML.12 AML.13 AML.14 AML.15 AML.16 AML.17 AML.18 AML.19 AML.20
2    -12    -21   -202   -112   -118    -90   -137   -157   -172    -47    -62
3   -172    -13   -274   -185   -142    -87    -51   -370   -122   -442   -198
4     12      8     59     24    212    102    -82    -77     38    -21     -5
  AML.21 AML.22 AML.23 AML.24
2    -58   -161    -48   -176
3   -217   -215   -531   -284
4     63    -46   -124    -81

> help(as.matrix)
> y = data.matrix(x, rownames.force = NA)
> y[1:3,]
  ALL ALL.1 ALL.2 ALL.3 ALL.4 ALL.5 ALL.6 ALL.7 ALL.8 ALL.9 ALL.10 ALL.11
2 195    75   620    76    13    58   466   399  1967   472     97    377
3 106   533   467    37    39   433    89   278    68    12     83    432
4 499     1   315  1458   407   891  1096  1819   718  1320    476   1436
  ALL.12 ALL.13 ALL.14 ALL.15 ALL.16 ALL.17 ALL.18 ALL.19 ALL.20 ALL.21 ALL.22
2    446     25     11     30    489    428    308   1156     52    178    381
3     30     87    438    159    244     61    346    188    103    482    196
4    219     33     37    383    186     28    474   2449     64    101     23
  ALL.23 ALL.24 ALL.25 ALL.26 AML AML.1 AML.2 AML.3 AML.4 AML.5 AML.6 AML.7
2     35    485     23    278 173  1178   186   254   545   362   602   318
3    203     81    167    330 181     1   228   190    75    39    20   181
4     26    496    467    596 443   276   869   867     1    55   775   676
  AML.8 AML.9 AML.10 ALL.27 ALL.28 ALL.29 ALL.30 ALL.31 ALL.32 ALL.33 ALL.34
2   273    67    102    169    544   1043     87     63    272    294   1226
3   386   615    190     94    196     86     74    130    267    109    170
4  1569   360    611   1004   1162    897     43    241    404    308     48
  ALL.35 ALL.36 ALL.37 ALL.38 ALL.39 ALL.40 ALL.41 ALL.42 ALL.43 ALL.44 ALL.45
2     55    130    253    281    376     38     89    466    361    133    329
3    372    130    213    238     24     33     66     32    397     36     25
4    725   1231   1041    488   1028    316   1748    222     60     50   1020
  ALL.46 AML.11 AML.12 AML.13 AML.14 AML.15 AML.16 AML.17 AML.18 AML.19 AML.20
2     22    131    170     21     47    381     36     67     77    279    403
3     52     42    245    100     97    374    236    166     28    265    130
4    397   1714   1932    915   1289    409    316    293   1054    136    351
  AML.21 AML.22 AML.23 AML.24
2    218    132    324     76
3     74    209    344    140
4   1015    420     65    342

Re: [R] Bug in gmodels CrossTable()?

2009-05-31 Thread Marc Schwartz


On May 31, 2009, at 7:51 AM, Jakson Alves de Aquino wrote:

> Is the code below showing a bug in CrossTable()? My expectation was that
> the values produced by xtabs were rounded instead of truncated:
>
> library(gmodels)
> abc <- c("a", "a", "b", "b", "c", "c")
> def <- c("d", "e", "f", "f", "d", "e")
> wgt <- c(0.8, 0.6, 0.4, 0.5, 1.4, 1.3)
>
> xtabs(wgt ~ abc + def)
>
> CrossTable(xtabs(wgt ~ abc + def), prop.r = F, prop.c = F,
>   prop.t = F, prop.chisq = F)



CrossTable() is designed to take one or two vectors, which are then  
[cross-]tabulated to yield integer counts, OR a matrix of integer  
counts, not fractional values. In the latter case, it is presumed that  
the matrix is the result of an 'a priori' cross-tabulation operation  
such as the use of table().


The output of xtabs() above is:

> xtabs(wgt ~ abc + def)
   def
abc   d   e   f
  a 0.8 0.6 0.0
  b 0.0 0.0 0.9
  c 1.4 1.3 0.0



The relevant output of CrossTable() in your example above shows:


             | def
         abc |         d |         e |         f | Row Total |
-------------|-----------|-----------|-----------|-----------|
           a |         0 |         0 |         0 |         1 |
-------------|-----------|-----------|-----------|-----------|
           b |         0 |         0 |         0 |         0 |
-------------|-----------|-----------|-----------|-----------|
           c |         1 |         1 |         0 |         2 |
-------------|-----------|-----------|-----------|-----------|
Column Total |         2 |         1 |         0 |         5 |
-------------|-----------|-----------|-----------|-----------|



The internal table object that would be generated here is effectively:

> addmargins(xtabs(wgt ~ abc + def))
     def
abc     d   e   f Sum
  a   0.8 0.6 0.0 1.4
  b   0.0 0.0 0.9 0.9
  c   1.4 1.3 0.0 2.7
  Sum 2.2 1.9 0.9 5.0



The textual output of CrossTable() is internally formatted using  
formatC(..., format = "d"), which is an integer based format:


> formatC(addmargins(xtabs(wgt ~ abc + def)), format = "d")
     def
abc   d e f Sum
  a   0 0 0 1
  b   0 0 0 0
  c   1 1 0 2
  Sum 2 1 0 5



In other words, you are getting the integer coerced values of the  
individual cells and then the same for the column, row and table totals:


> matrix(as.integer(addmargins(xtabs(wgt ~ abc + def))), 4, 4)
     [,1] [,2] [,3] [,4]
[1,]    0    0    0    1
[2,]    0    0    0    0
[3,]    1    1    0    2
[4,]    2    1    0    5



If you review ?as.integer, you will note the following in the 'Value'  
section:


  Non-integral numeric values are truncated towards zero (i.e.,  
as.integer(x) equals trunc(x) there)




The output is correct, if confusing, but you are really using the  
function in a fashion that is not intended.
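
A small base R sketch of the difference between integer coercion and rounding, using the same cell values as the xtabs() result above. It does not call CrossTable(); the object names wgt_tab, truncated and rounded are illustrative.

```r
# same cell values as the weighted table above (filled column by column)
wgt_tab <- matrix(c(0.8, 0.0, 1.4,
                    0.6, 0.0, 1.3,
                    0.0, 0.9, 0.0),
                  nrow = 3,
                  dimnames = list(abc = c("a", "b", "c"),
                                  def = c("d", "e", "f")))

# as.integer() truncates towards zero: 0.8 -> 0, 1.4 -> 1
truncated <- matrix(as.integer(wgt_tab), nrow = 3,
                    dimnames = dimnames(wgt_tab))

# round() goes to the nearest integer: 0.8 -> 1, 1.4 -> 1
rounded <- round(wgt_tab)
```

Rounding the cells before tabulating changes the totals as well, since they are then sums of the rounded cells rather than rounded sums.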


HTH,

Marc Schwartz


Re: [R] Bug in gmodels CrossTable()?

2009-05-31 Thread Jakson Alves de Aquino

Dear Marc Schwartz,

You are correct: there is no bug in CrossTable(). To get what I want I
should have done:

CrossTable(round(xtabs(wgt ~ abc + def)), prop.r = F, prop.c = F,
  prop.t = F, prop.chisq = F)

Thank you for the explanation!

Jakson



Re: [R] convert the contents of a date.frame to a matrix

2009-05-31 Thread David Winsemius


Unable to reproduce:

> ?data.matrix
> x <- read.table(textConnection("  ALL ALL.1 ALL.2 ALL.3 ALL.4 ALL.5 ALL.6 ALL.7 ALL.8 ALL.9 ALL.10 ALL.11
+ 2 -214  -139   -76  -135  -106  -138   -72  -413     5   -88   -165    -67
+ 3 -153   -73   -49  -114  -125   -85  -144  -260  -127  -105   -155    -93
+ 4  -58    -1  -307   265   -76   215   238     7   106    42    -71     84"), header=TRUE)

> x
   ALL ALL.1 ALL.2 ALL.3 ALL.4 ALL.5 ALL.6 ALL.7 ALL.8 ALL.9 ALL.10 ALL.11
2 -214  -139   -76  -135  -106  -138   -72  -413     5   -88   -165    -67
3 -153   -73   -49  -114  -125   -85  -144  -260  -127  -105   -155    -93
4  -58    -1  -307   265   -76   215   238     7   106    42    -71     84

> data.matrix(x)
   ALL ALL.1 ALL.2 ALL.3 ALL.4 ALL.5 ALL.6 ALL.7 ALL.8 ALL.9 ALL.10 ALL.11
2 -214  -139   -76  -135  -106  -138   -72  -413     5   -88   -165    -67
3 -153   -73   -49  -114  -125   -85  -144  -260  -127  -105   -155    -93
4  -58    -1  -307   265   -76   215   238     7   106    42    -71     84


You may have something else in that dataframe that is not apparent on  
a simple print display. Can you instead provide the results of dput(x)?
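
One common cause of this symptom, offered only as a guess since the poster's data are not available, is that the columns are stored as factors (for example because a non-numeric entry was present when the file was read). data.matrix() then returns the internal factor codes rather than the printed values. A sketch with made-up values:

```r
# made-up columns stored as factors, so they print like numbers
x <- data.frame(ALL   = factor(c("-214", "-153", "-58")),
                ALL.1 = factor(c("-139", "-73", "-1")))

# data.matrix() replaces factors by their internal integer codes
codes <- data.matrix(x)

# converting via character recovers the printed numbers
values <- sapply(x, function(col) as.numeric(as.character(col)))
rownames(values) <- rownames(x)
```

Running str(x) or sapply(x, class) on the real data frame would show whether this is what happened.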


--
David Winsemius

On May 31, 2009, at 9:29 AM, Hongyuan Cao wrote:


> Dear R user,
>
> I am trying to convert the contents of a data frame to a matrix. Since
> there are negative values in the data frame, when I use data.matrix(x,
> rownames.force = NA), the resulting matrix is not the same as the
> original one. Basically I think R treats the numbers in the data frame
> as characters and converts them to corresponding numerics.
>
> Any idea on this issue?
>
> Many Thanks,
>
> Hongyuan


Re: [R] strange behavior when reading csv - line wraps

2009-05-31 Thread jim holtman

You can do something like this: count the number of fields in each line of
the file and use the max to determine the number of columns for read.table:

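file <- '/tempxx.txt'
maxFields <- max(count.fields(file))  # max number of fields on any line
# now set up read.table for the max number of columns
# (for a semicolon-separated file, also pass sep=';' to both calls)
input <- read.table(file, colClasses=rep(NA, maxFields), fill=TRUE,
    col.names=paste("V", seq(maxFields), sep=''))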


On Sun, May 31, 2009 at 6:06 AM, Martin Tomko <martin.to...@geo.uzh.ch> wrote:

 Dear Jim,
 with the help of Ted, we diagnosed that the cause is the extreme
 variability in line length in the input. As the number of table columns is
 apparently determined from the first five lines, anything that exceeds this
 length is automatically put on the next line.
 I am now trying to find a way to read in the data despite this. I have no
 control over the table extent; the only thing that would make sense for my
 data would be to read in a fixed number of columns and merge all remaining
 columns into one long string in the last one. No idea how to do this,
 though.

 Thanks
 Martin
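
One possible way to do what Martin describes (a fixed number of leading columns, everything else merged into one string) is sketched below. The function name split_fixed, the column name rest, and the sample lines are made up for illustration; the count of 7 leading fields is taken from Martin's original post.

```r
# Keep the first n_fixed fields of each line as separate columns and
# paste whatever is left back together into a single "rest" column.
split_fixed <- function(lines, n_fixed, sep = ";") {
  parts <- strsplit(lines, sep, fixed = TRUE)
  rows <- lapply(parts, function(p) {
    head_part <- p[seq_len(n_fixed)]   # padded with NA if the line is short
    rest <- if (length(p) > n_fixed) {
      paste(p[-seq_len(n_fixed)], collapse = sep)
    } else {
      ""
    }
    c(head_part, rest)
  })
  out <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
  names(out) <- c(paste("V", seq_len(n_fixed), sep = ""), "rest")
  out
}

# Made-up lines in the same shape as the examples in this thread
lines <- c("37;2175168475;13;8.522729;47.19537;user1;30;sculpture;bird;tourism;",
           "39;167757703;12;10.309295;47.724545;user2;36;white")
out <- split_fixed(lines, 7)
out$rest
```

In practice the lines would come from readLines(file). All columns come back as character, so numeric fields would still need as.numeric() afterwards.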


 jim holtman wrote:

 It is still not clear to me exactly how you want to read the lines in.  If
 the lines have a variable number of fields, and some of the lines might be
 wrapped, is there some way to determine where the start of each line is?
 If you are reading them in with read.csv, then the system is assuming
 that each line starts a new row.  If this is not the case, then you will
 have to state the rules that determine where the lines start.  You can
 always read the data in with 'scan' to separate each line and then do
 whatever processing is required to put together the rows in the data frame
 that you want.
 In one of your examples, you indicated that the line was split starting
 at the word kempten; if this is in the middle of the line, then you would
 have to create the break after reading the line in with 'scan' and then
 create the rows in the data frame.  All of this can be done in R if you can
 state what the criteria are.

 On Sat, May 30, 2009 at 4:32 AM, Martin Tomko martin.to...@geo.uzh.ch wrote:

Jim,
the two lines I put in are the actual problematic input lines.
In these examples, there are no quotes nor # signs, although I
have no means to make sure they do not occur in the inputs (any
hints how I could deal with that?).
I am trying to avoid as much pre-processing outside R as possible,
and I have to process about 500 files with up to 3000 records
each, so I need a more or less automated/batch solution, so any
string substitution will have to occur in R. But for the moment, I
do not see a reason for substitution, and the wrapping still occurs.

Cheers
Martin



jim holtman wrote:

You need to supply the actual input line so we can see what is
happening.  Are you sure you do not have unbalanced quotes in
your input (try quote='') or do you have comment characters
(#) in your input?

On Fri, May 29, 2009 at 3:15 PM, Martin Tomko
martin.to...@geo.uzh.ch wrote:

   Dear All,
   I am observing a strange behavior, and searching the archives and
   help pages didn't help much.
   I have a csv with a variable number of fields in each line.

   I use
   dataPoints <- read.csv(inputFile, head=FALSE, sep=";", fill=TRUE);

   to read it in, and it works. But some lines are long and 'wrap',
   or split and continue on the next line. So when I check the dim of
   the frame, it is not correct, and I can see when I do a printout
   that the line is split into two in the frame. I checked the input
   file and all is good.

   an example of the input is:
 37;2175168475;13;8.522729;47.19537;16366...@n00
 ;30;sculpture;bird;tourism;animal;statue;canon;eos;rebel;schweiz;switzerland;eagle;swiss;adler;skulptur;zug;1750;28;tamron;f28;canton;tourismus;vogel;baar;kanton;xti;tamron1750;1750mm;tamron1750mm;400d;rabbitriotnet;

   where the last value occurs on the next line in the data frame.

   It does not have to be the last value, as in the following example,
   where the word kempten starts the next line:
 39;167757703;12;10.309295;47.724545;21903...@n00
 ;36;white;building;tower;clock;clouds;germany;bayern;deutschland;bavaria;europa;europe;eagle;adler;eu;wolke;dome;townhall;rathaus;turm;weiss;allemagne;europeanunion;bundesrepublik;gebaeude;glocke;brd;allgau;kuppel;europ;kempten;niemcy;europo;federalrepublic;europaischeunion;europaeischeunion;germanio;

   What could be the reason?

   I was thinking about solving the issue by using a different
   separator, that I would use for the first 7 fields and

Re: [R] strange behavior when reading csv - line wraps

2009-05-31 Thread Ted Harding

Ah!!! It was count.fields() which we had overlooked! We discovered
a work-round which involved using

  Data0 <- readLines(file)

to create a vector of strings, one for each line of the input file,
and then using

  NF <- unlist(lapply(Data0, function(x)
    length(unlist(gregexpr(";", x, fixed=TRUE, useBytes=TRUE)))))

to count the number of occurrences of ";" (the separator) in each line.
(NF+1) produces the same result as count.fields(file, sep=";").
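
As a quick check of that equivalence, here is a small sketch on a made-up temporary file (the file contents and the names by_hand and by_count_fields are only for illustration):

```r
# Made-up file: every line contains at least one ";" separator
tmp <- tempfile(fileext = ".txt")
writeLines(c("1;a;b", "2;a;b;c;d", "3;a"), tmp)

# Work-round: count separators per line with gregexpr()
Data0 <- readLines(tmp)
NF <- unlist(lapply(Data0, function(x)
  length(unlist(gregexpr(";", x, fixed = TRUE, useBytes = TRUE)))))
by_hand <- NF + 1

# Direct route
by_count_fields <- count.fields(tmp, sep = ";")

all(by_hand == by_count_fields)
```

One caveat with the work-round: gregexpr() returns -1 when a line contains no separator at all, so such a line would be counted as having two fields rather than one. count.fields() does not have that problem.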

Thanks for pointing out the existence of count.fields()!
Ted.

On 31-May-09 15:04:23, jim holtman wrote:
 You can do something like this: count the number of fields in each line
 of the file and use the max to determine the number of columns for
 read.table:
 
 file <- '/tempxx.txt'
 maxFields <- max(count.fields(file))  # max
 # now setup read.table for max number
 input <- read.table(file, colClasses=rep(NA, maxFields), fill=TRUE,
 col.names=paste("V", seq(maxFields), sep=''))
 
 

[R] renaming column names

2009-05-31 Thread Benny Chain

I am trying to rename the column names of a data frame called data. It has
177 columns. I have used:

colnames(data) <- a

where a is a vector with 177 character names.

I don't get any error message, but the column names don't change, because
when I then type:

colnames(data)

I get the same set of names as before, so the assignment doesn't seem to
have worked.

Any ideas or suggestions gratefully received.

Benny Chain

Benjamin Chain
Division of Infection and Immunity
Windeyer Building
UCL, 46 Cleveland St.
London W1T 4JF
Fax 00 44 20 7679 9301


Re: [R] convert the contents of a date.frame to a matrix

2009-05-31 Thread David Winsemius



On May 31, 2009, at 10:49 AM, Hongyuan Cao wrote:


x = read.table("hongyuan_5_30_forsafe.txt", sep = "\t")


> x = read.table("/Users/davidwinsemius/Downloads/hongyuan_5_30_forsafe.txt", sep = "\t")
> str(x)
'data.frame':   7131 obs. of  74 variables:
 $ V1 : Factor w/ 7131 levels "","A28102_at",..: 1 934 120 122 118 126 124 130 128 134 ...
 $ V2 : Factor w/ 5528 levels "","2-Sep","6-Mar",..: 1 5528 NA NA NA NA NA NA NA NA ...
 $ V3 : Factor w/ 2375 levels "-1","-10","-100",..: 2374 2375 195 106 499 2262 287 483 1254 143 ...
 $ V4 : Factor w/ 2326 levels "-1","-10","-100",..: 2325 2326 75 533 1 1442 231 357 299 117 ...


My guess (which became a conclusion after testing) was that you have
encountered the stringsAsFactors pitfall. (It's probably in the R
Inferno someplace but my favorite quote is from Terry Terneau:

.)

Try reading the table in with stringsAsFactors=FALSE or as.is=TRUE.
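
To illustrate the pitfall on something small, here is a sketch with made-up data (the real file is not shown in this thread, so the layout below, with a header row and gene IDs in the first column, is an assumption). stringsAsFactors=TRUE is spelled out because newer R versions no longer default to it.

```r
# Toy tab-separated data: a header row, then gene IDs and numeric columns
lines <- c("gene\ts1\ts2", "g1\t-214\t88", "g2\t-153\t283", "g3\t-58\t309")

# Read without header=TRUE: the header row becomes data, so every column
# is text, and with stringsAsFactors=TRUE every column becomes a factor
x_fac <- read.table(text = lines, sep = "\t", stringsAsFactors = TRUE)
sapply(x_fac, is.factor)

# data.matrix() on factors returns the internal codes, not the numbers
codes <- data.matrix(x_fac)

# Declaring the header and row names keeps the numeric columns numeric
y <- as.matrix(read.table(text = lines, sep = "\t",
                          header = TRUE, row.names = 1))
y["g1", "s1"]
```

With stringsAsFactors=FALSE (or as.is=TRUE) the text columns at least stay character instead of turning into factor codes, which makes the problem much easier to spot.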

After doing that and applying the rest of your code, I get:

 head(y)
  X1   X2   X3   X4   X5   X6   X7   X8   X9  X10   
X11  X12  X13  X14  X15  X16  X17
AFFX-BioB-5_at  -214 -139  -76 -135 -106 -138  -72 -4135  -88  
-165  -67  -92 -113 -107 -117 -476
AFFX-BioB-M_at  -153  -73  -49 -114 -125  -85 -144 -260 -127 -105  
-155  -93 -119 -147  -72 -219 -213
AFFX-BioB-3_at   -58   -1 -307  265  -76  215  2387  106   42   
-71   84  -31 -118 -126  -50  -18
AFFX-BioC-5_at88  283  309   12  168   71   55   -2  268  219
82   25  173  243  149  257  301
AFFX-BioC-3_at  -295 -264 -376 -419 -230 -272 -399 -541 -210 -178 -163  
-179 -233 -127 -205 -218 -403
AFFX-BioDn-5_at -558 -400 -650 -585 -284 -558 -551 -790 -535 -246 -430  
-323 -227 -398 -284 -402 -394

.
...snipped the rest of the output.


You replied to me only personally, which is against R-help practice, so
I am adding the list back to the address list.


(You should also configure your mail client so that it sends plain
text to the r-help list.)


--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] renaming column names

2009-05-31 Thread David Winsemius


Cannot reproduce with a toy example:

> data <- data.frame(a=1:3, b=4:6, c=6:8)
> colnames(data) <- c("d","e","f")
> colnames(data)
[1] "d" "e" "f"


Perhaps you need to produce more detail. Surely offering the results  
of dput(a) would not tax the limits of the R-mail server.


--
David

On May 31, 2009, at 11:58 AM, Benny Chain wrote:

I am trying to rename the column names of a data frame called data. It has
177 columns. I have used:

colnames(data) <- a

where a is a vector with 177 character names.

I don't get any error message, but the column names don't change, because
when I then type:

colnames(data)

I get the same set of names as before, so the assignment doesn't seem to
have worked.

Any ideas or suggestions gratefully received.





David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] convert the contents of a date.frame to a matrix

2009-05-31 Thread David Winsemius



On May 31, 2009, at 12:03 PM, David Winsemius wrote:




My guess (which became a conclusion after testing) was that you have  
encountered the stringsAsFactors pitfall. (It's probably in the R  
Inferno someplace but my favorite quote is from Terry Terneau(sic):


The default action of turning every character string into a factor is  
a plague on the S language.

   ---Terry Therneau,  from s-news, 2004


--
David Winsemius
Heritage Laboratories
West Hartford, CT


Re: [R] How to set a filter during reading tables

2009-05-31 Thread guox

Thanks, Juliet.
It works for filtering columns.
I am also wondering if there is a way to filter rows.
Thanks again.
-james

 One can use colClasses to set which columns get read in. For the
 columns you don't want, you can set those to "NULL". For example,

 cc <- c("NULL", rep("numeric", 9))

 myData <-
 read.table("myFile.txt", header=TRUE, colClasses=cc, nrow=numRows)
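
A minimal sketch of the same idea on made-up inline data (the column names and values are only for illustration). Note that "NULL" must be the quoted string; an unquoted NULL inside c() is silently dropped.

```r
# Four columns in the input; keep only the 2nd and 4th
lines <- c("id a b c", "x 1 2 3", "y 4 5 6")
cc <- c("NULL", "numeric", "NULL", "numeric")

d <- read.table(text = lines, header = TRUE, colClasses = cc)
names(d)
```

Columns marked "NULL" are skipped while reading, so they never take up memory in the resulting data frame.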


 On Wed, May 27, 2009 at 12:27 PM,  g...@ucalgary.ca wrote:
 We are reading big tables, such as,

 Chemicals <-
 read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals', header
 = TRUE, sep = '\t', as.is = TRUE)

 I was wondering if it is possible to set a filter during loading so
 that
 we just load what we want not the whole table each time. Thanks,

 -james


Re: [R] How to set a filter during reading tables

2009-05-31 Thread Linlin Yan

I think you can use readLines(n=1) in loop to skip unwanted rows.
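
A rough sketch of that idea, reading in chunks rather than one line at a time. The function name read_filtered, its arguments, and the toy file are made up for illustration; the column names loosely follow the BLS file discussed in this thread but are assumptions here.

```r
# Read a delimited file through a connection, keeping only the lines
# for which keep() returns TRUE, then parse just those lines.
read_filtered <- function(path, keep, sep = "\t", chunk = 1000) {
  con <- file(path, open = "r")
  on.exit(close(con))
  header <- readLines(con, n = 1)
  kept <- character(0)
  repeat {
    lines <- readLines(con, n = chunk)
    if (length(lines) == 0) break
    kept <- c(kept, lines[keep(lines)])
  }
  read.table(text = c(header, kept), sep = sep, header = TRUE,
             stringsAsFactors = FALSE)
}

# Toy file (made-up values)
tmp <- tempfile(fileext = ".txt")
writeLines(c("series_id\tyear\tperiod\tvalue",
             "A\t2008\tM01\t1.5",
             "A\t2008\tM02\t1.7",
             "B\t2009\tM01\t2.1"), tmp)

# Keep only the January rows
jan <- read_filtered(tmp, keep = function(x) grepl("\tM01\t", x, fixed = TRUE))
jan
```

Every line is still read from disk as text, but only the matching lines are parsed into a data frame, which is usually the expensive part.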

On Mon, Jun 1, 2009 at 12:56 AM,  g...@ucalgary.ca wrote:
 Thanks, Juliet.
 It works for filtering columns.
 I am also wondering if there is a way to filter rows.
 Thanks again.
 -james


[R] using chron vector with boxplot

2009-05-31 Thread Kenneth Takagi

Hi,

I'm having trouble using dates (created using library(chron)) as
groupings for a boxplot.  I have 10 repeat measurements of a variable
within an individual day.  The measurements were done over 10 days.  I
would like to plot the measurements as a box-and-whisker plot (using
boxplot or something similar) where the days (as a chron object) would be
the grouping and the repeated measurements of the variable make up each
box.  I would like the spacing of the individual boxplots to reflect the
time between measurements.

Example using random numbers:

### Create date vector of measurement dates
library(chron)
time = c(39083, 39085, 39095, 39096, 39103, 39104, 39105, 39110, 39113,
39120);
orig = chron("01/01/1900")
date = orig + time - 2;

###  Data for box-and-whisker plot
data = data.frame();
mean = rnorm(10, mean=0, sd=1);
for(i in 1:10){data[1:10,i] = rnorm(10, mean=mean[i], sd=1)};

### Plot
boxplot(data, range=0);  # works, but doesn't reflect the different time
                         # intervals between measurement dates!
boxplot(data~date, range=0) # tried using formula, gives error:

Error in model.frame.default(formula = data ~ date) :
 invalid type (list) for variable 'data'

Not sure what to try next.  Any suggestions?

Thanks,
ken

kat...@psu.edu



Re: [R] IP-Address

2009-05-31 Thread Henrik Bengtsson

library(gsubfn)
library(gtools)
library(rbenchmark)

n <- 1
df <- data.frame(
  a = rnorm(n),
  b = rnorm(n),
  c = rnorm(n),
  ip = replicate(n, paste(sample(255, 4), collapse='.'), simplify=TRUE)
)

res <- benchmark(columns=c('test', 'elapsed'), replications=10, order=NULL,
  peda = {
    connection <- textConnection(as.character(df$ip))
    o <- do.call(order, read.table(connection, sep='.'))
    close(connection)
    df[o, ]
  },

  peda2 = {
    connection <- textConnection(as.character(df$ip))
    dfT <- read.table(connection, sep='.', colClasses=rep("integer", 4),
                      quote="", na.strings=NULL, blank.lines.skip=FALSE)
    close(connection)
    o <- do.call(order, dfT)
    df[o, ]
  },

  hb = {
    ip <- strsplit(as.character(df$ip), split=".", fixed=TRUE)
    ip <- unlist(ip, use.names=FALSE)
    ip <- as.integer(ip)
    dim(ip) <- c(4, nrow(df))
    ip <- 256^3*ip[1,] + 256^2*ip[2,] + 256*ip[3,] + ip[4,]
    o <- order(ip)
    df[o, ]
  },

  hb2 = {
    ip <- strsplit(as.character(df$ip), split=".", fixed=TRUE)
    ip <- unlist(ip, use.names=FALSE)
    ip <- as.integer(ip);
    dim(ip) <- c(4, nrow(df))
    o <- sort.list(ip[4,], method="radix", na.last=TRUE)
    for (kk in 3:1) {
      o <- o[sort.list(ip[kk,o], method="radix", na.last=TRUE)]
    }
    df[o, ]
  }
)

print(res)

   test elapsed
1  peda    4.12
2 peda2    4.08
3    hb    0.28
4   hb2    0.25
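
For readers who only want the idea behind the hb variant, here it is as a small stand-alone sketch. The helper name ip_order and the sample addresses are made up for illustration.

```r
# Order dotted-quad IP addresses numerically: split into four octets,
# then combine them into a single number (base 256) and order on that.
ip_order <- function(ips) {
  parts <- strsplit(as.character(ips), split = ".", fixed = TRUE)
  m <- matrix(as.integer(unlist(parts, use.names = FALSE)), nrow = 4)
  key <- 256^3 * m[1, ] + 256^2 * m[2, ] + 256 * m[3, ] + m[4, ]
  order(key)
}

ips <- c("10.0.0.2", "9.255.1.1", "10.0.0.10", "192.168.1.1")
ips[ip_order(ips)]
```

A plain sort() on the strings would put "10.0.0.10" before "10.0.0.2"; the numeric key avoids that. The key is held as a double, which represents all 32-bit values exactly.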


On Sun, May 31, 2009 at 12:42 AM, Wacek Kusnierczyk
waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
 edwin Sendjaja wrote:
 Hi VQ,

 Thank you. It works like charm. But I think Peter's code is faster. What is 
 the difference?


 i think peter's code is more r-elegant, though less generic.  here's a
 quick test, with not so surprising results.  gsubfn is implemented in r,
 not c, and it is painfully slow in this test. i also added gabor's
 suggestion.

    library(gsubfn)
    library(gtools)
    library(rbenchmark)

    n = 1000
    df = data.frame(
       a=rnorm(n),
       b = rnorm(n),
       c = rnorm(n),
       ip = replicate(n, paste(sample(255, 4), collapse='.'),
 simplify=TRUE))
    benchmark(columns=c('test', 'elapsed'), replications=10, order=NULL,
       peda={
          connection = textConnection(as.character(df$ip))
          o = do.call(order, read.table(connection, sep='.'))
          close(connection)
          df[o, ] },
       waku=df[order(gsubfn(perl=TRUE,
          '[0-9]+',
          ~ sprintf('%03d', as.integer(x)),
          as.character(df$ip))), ],
       gagr=df[mixedorder(df$ip), ] )

    # peda 0.070
    # waku 7.070
    # gagr 4.710


 vQ


Re: [R] How to set a filter during reading tables

2009-05-31 Thread Gabor Grothendieck

The sqldf package can read a subset of rows and columns into R without
reading the entire file into R.  There are a few caveats:

- It does not support ftp, so you will need to download the file to your
  computer first, as shown in the example below.
- Since value is an SQL keyword, it turns value into value__1 to avoid
  a collision.
- You will have to convert the value column to numeric yourself, as shown:

library(sqldf)
download.file("ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals",
  "Chemicals.txt", method = "wget")

# define wp as a file with indicated format
wp <- file("Chemicals.txt")
attr(wp, "file.format") <- list(sep = "\t", header = TRUE)

# use sqldf to read it in keeping only indicated rows
wp.df <- sqldf("select * from wp where footnote_codes = 'p' and period = 'M01'")

# fix up type of value__1
wp.df$value__1 <- as.numeric(as.character(wp.df$value__1))

head(wp.df)

See http://sqldf.googlecode.com



On Wed, May 27, 2009 at 12:27 PM,  g...@ucalgary.ca wrote:
 We are reading big tables, such as,

 Chemicals <-
 read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals', header
 = TRUE, sep = '\t', as.is = TRUE)

 I was wondering if it is possible to set a filter during loading so that
 we just load what we want not the whole table each time. Thanks,

 -james


Re: [R] using chron vector with boxplot

2009-05-31 Thread Gabor Grothendieck

Try this:

  boxplot(as.matrix(data) ~ as.Date(date), cex.axis = 0.5, las = 2)

or if all the dates are in the same year and month as they are here
then you might want to just display the day of the month:

  boxplot(as.matrix(data) ~ month.day.year(date)$day)
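
If the boxes should also be spaced according to the time between measurement days, one option is the at= argument of boxplot. The following is only a sketch under a few assumptions: it uses base Date objects instead of chron so that it runs without extra packages, the start date and random data are made up, and the values are first put into long format (one row per measurement) so that each value is explicitly paired with its day.

```r
set.seed(1)

# Ten measurement days with uneven gaps (offsets in days from the first)
days <- as.Date("2007-01-01") + c(0, 2, 12, 13, 20, 21, 22, 27, 30, 37)

# One column per day, ten repeat measurements per column
vals <- sapply(seq_along(days), function(i) rnorm(10, mean = i / 5))

# Long format: as.vector() runs down the columns, so each block of
# ten values belongs to one day
long <- data.frame(value = as.vector(vals),
                   day   = rep(days, each = nrow(vals)))

# Position each box at its offset in days from the first measurement
pos <- as.numeric(days - min(days))
bp <- boxplot(value ~ day, data = long, at = pos,
              range = 0, las = 2, cex.axis = 0.5)
```

Boxes for consecutive days end up close together, so the default box width may need adjusting with boxwex.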


On Sun, May 31, 2009 at 1:45 PM, Kenneth Takagi kat...@psu.edu wrote:
 Hi,



 I'm having trouble using dates (created using library(chron)) as
 groupings for a boxplot.  I have 10 repeat measurements of a variable
 within an individual day.  The measurements were done over 10 days.  I
 would like to plot the measurements as a box-and-whisker plot (using
 boxplot or something similar) where the days (as a chron object) would be
 the grouping and the repeated measurements of the variable make up each
 box.  I would like the spacing of the individual boxplots to reflect the
 time between measurements.




Re: [R] How to set a filter during reading tables

2009-05-31 Thread guox

Since there are many rows, with read.table we spend too much time reading
in rows that we do not want. We are wondering if there is a way to read
only the rows that we are interested in. Thanks,

-james
 I think you can use readLines(n=1) in loop to skip unwanted rows.


Re: [R] a simple trick to get autoclose parenthesis on windows

2009-05-31 Thread Jose Quesada

Gabor Grothendieck wrote:
 Hi, I was looking for your sendcodetoR.ahk autohotkey script
 but couldn't find it.  Would you be able to point me to it.
 Thanks.
   
Here's what I have right now; it also maps right alt to send code to R.
You should change the ahk_class to your editor of choice (here it's
vim). To get the ahk_class, you need to run AU3_Spy (the window spy that
comes with every autohotkey install).

HTH,
-Jose

; F3 and right alt will send selection to open Rgui
#WinActivateForce
F3::
IfWinExist, ahk_class Rgui
{
    Send ^c                    ; copy selection to clipboard
    WinActivate                ; R Console
    save = %clipboard%
    sendinput, {Raw}%save%
    sendinput, {enter}
    Sleep 1000
    WinActivate, ahk_class Vim
}
return                         ; stop here so F3 does not fall through to the Ralt block

Ralt::
IfWinExist, ahk_class Rgui
{
    Send ^c                    ; copy selection to clipboard
    WinActivate                ; R Console
    save = %clipboard%
    sendinput, {Raw}%save%
    sendinput, {enter}
    Sleep 1000
    WinActivate, ahk_class Vim
}
return


-- 
Jose Quesada, PhD.
Max Planck Institute,
Center for Adaptive Behavior and Cognition -ABC-, 
Lentzeallee 94, office 224, 14195 Berlin
http://www.josequesada.name/


Re: [R] renaming column names

2009-05-31 Thread Benny Chain

Dear David

Problem solved! Remarkably, I saw your previous post about
stringsAsFactors = FALSE, and that was the root of the problem. I was trying
to rename the columns according to the contents of one of the rows of the
data frame, which were being interpreted as factors (which I hadn't
realised), and that's why the assignment as new column names didn't work!
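
For anyone hitting the same thing, here is a small sketch of taking column names from a row when the columns are factors. The data are made up, and the exact failure above is not shown in this thread, so this only illustrates the safe route of converting to character explicitly.

```r
# Toy data frame whose first row holds the intended column names;
# stringsAsFactors = TRUE mimics the old read.table default
raw <- data.frame(V1 = c("gene", "g1", "g2"),
                  V2 = c("s1", "10", "20"),
                  stringsAsFactors = TRUE)

# Convert each cell of the first row to plain text before using it as a name
new_names <- unname(vapply(raw[1, ], as.character, character(1)))

dat <- raw[-1, ]
colnames(dat) <- new_names
colnames(dat)

# The remaining columns are still factors: go through as.character()
# to get the numbers, otherwise you get the internal codes
values <- as.numeric(as.character(dat$s1))
codes  <- as.numeric(dat$s1)
```

Here values holds 10 and 20, while codes holds 1 and 2, which is the same factor trap in another form.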

Thanks for your help, albeit inadvertent!

Benny



Benjamin Chain
Division of Infection and Immunity
Windeyer Building
UCL, 46 Cleveland St.
London W1T 4JF
Fax 00 44 20 7679 9301


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: 31 May 2009 17:11
To: Benny Chain
Cc: r-help@r-project.org
Subject: Re: [R] renaming column names

Cannot reproduce with a toy example:

> data <- data.frame(a=1:3, b=4:6, c=6:8)
> colnames(data) <- c("d","e","f")
> colnames(data)
[1] "d" "e" "f"


Perhaps you need to produce more detail. Surely offering the results  
of dput(a) would not tax the limits of the R-mail server.

-- 
David


Re: [R] a simple trick to get autoclose parenthesis on windows

2009-05-31 Thread Gabor Grothendieck

Thanks very much.  Regards.

On Sun, May 31, 2009 at 2:20 PM, Jose Quesada ques...@gmail.com wrote:
 Gabor Grothendieck wrote:
 Hi, I was looking for your sendcodetoR.ahk autohotkey script
 but couldn't find it.  Would you be able to point me to it.
 Thanks.

 Here's what I have right now; it also maps right Alt to send code to R.
 You should change the ahk_class to your editor of choice (here it's
 Vim). To get the ahk_class, you need to run the AutoIt3 Window Spy
 (AU3_Spy.exe, which comes with every AutoHotkey install).

 HTH,
 -Jose

 ; F3 and right alt will send selection to open Rgui
 #WinActivateForce
 F3::
 IfWinExist, ahk_class Rgui
 {
     Send ^c                    ; copy selection to clipboard
     WinActivate                ; activate the R console found above
     save = %clipboard%
     SendInput, {Raw}%save%
     SendInput, {Enter}
     Sleep 1000
     WinActivate, ahk_class Vim ; go back to the editor
 }
 return                         ; stop here so F3 does not fall through into RAlt

 RAlt::
 IfWinExist, ahk_class Rgui
 {
     Send ^c                    ; copy selection to clipboard
     WinActivate                ; activate the R console found above
     save = %clipboard%
     SendInput, {Raw}%save%
     SendInput, {Enter}
     Sleep 1000
     WinActivate, ahk_class Vim ; go back to the editor
 }
 return


 --
 Jose Quesada, PhD.
 Max Planck Institute,
 Center for Adaptive Behavior and Cognition -ABC-,
 Lentzeallee 94, office 224, 14195 Berlin
 http://www.josequesada.name/




[R] Can I Compile R with MS Access 2003 Developer Extensions?

2009-05-31 Thread Felipe Carrillo


Hi:
I have an application that uses MS Access as front end interacting with 
Excel,Word,Sigmaplot and R in the background. I use the programs to do 
different tasks but I mainly use R to create graphics on my Access forms 
without having to manually open R.
To make this short, I was wondering if R can be compiled from Access using the 
Developer Extensions. I know that the MS office programs can be compiled with 
it but don't know if it can be done including R. 


Felipe D. Carrillo  
Supervisory Fishery Biologist  
Department of the Interior  
US Fish & Wildlife Service  
California, USA


Re: [R] Can I Compile R with MS Access 2003 Developer Extensions?

2009-05-31 Thread Emmanuel Charpentier

On Sunday 31 May 2009 at 12:08 -0700, Felipe Carrillo wrote:
 Hi:
 I have an application that uses MS Access as front end interacting
 with Excel,Word,Sigmaplot and R in the background. I use the programs
 to do different tasks but I mainly use R to create graphics on my
 Access forms without having to manually open R.
 To make this short, I was wondering if R can be compiled from Access
 using the Developer Extensions. I know that the MS office programs can
 be compiled with it but don't know if it can be done including R. 

Somehow, I strongly doubt it : R compilation needs tools totally alien
to Microsoft's world vision.

However, ISTR that there exists some sort of R-WinWorld bridge. Have a
look here : http://cran.miroir-francais.fr/contrib/extra/dcom/

Note that I can't vouch for it myself : I didn't touch a Windows
computer for years...

HTH,

Emmanuel Charpentier


Re: [R] strange behavior when reading csv - line wraps

2009-05-31 Thread Martin Tomko


Big thanks to Ted and Jim for all the help.
Martin

(Ted Harding) wrote:

Ah!!! It was count.fields() which we had overlooked! We discovered
a workaround which involved using

  Data0 <- readLines(file)

to create a vector of strings, one for each line of the input file,
and then using

  NF <- unlist(lapply(Data0, function(x)
    length(unlist(gregexpr(";", x, fixed=TRUE, useBytes=TRUE)))))

to count the number of occurrences of ";" (the separator) in each line.
(NF+1) produces the same result as count.fields(file, sep=";").


Thanks for pointing out the existence of count.fields()!
Ted.
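
A small self-contained check of the equivalence described above, using a made-up three-line file in a temporary location (the file contents and variable names are illustrative only). Note that the separator-counting trick only matches count.fields() when every line contains at least one ";", because gregexpr() returns a single -1 for a line with no match:

```r
# Illustrative input: lines with 3, 5 and 2 fields
tmp <- tempfile(fileext = ".csv")
writeLines(c("1;a;b", "2;a;b;c;d", "3;a"), tmp)

# Fields per line, as reported by count.fields()
nf <- count.fields(tmp, sep = ";")

# The manual route: count separators per line, then add one
lines <- readLines(tmp)
nsep <- lengths(gregexpr(";", lines, fixed = TRUE))

# Use the widest line to size the data frame
dat <- read.table(tmp, sep = ";", header = FALSE, fill = TRUE,
                  col.names = paste0("V", seq_len(max(nf))))
dat_dim <- dim(dat)
```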

On 31-May-09 15:04:23, jim holtman wrote:
  

You can do something like this: count the number of fields in each line
of
the file and use the max to determine the number of columns for
read.table:

file <- '/tempxx.txt'
maxFields <- max(count.fields(file))  # max
# now setup read.table for max number
input <- read.table(file, colClasses=rep(NA, maxFields), fill=TRUE,
    col.names=paste("V", seq(maxFields), sep=''))


On Sun, May 31, 2009 at 6:06 AM, Martin Tomko martin.to...@geo.uzh.ch wrote:



Dear Jim,
with the help of Ted, we diagnosed that the cause is the extreme
variability in line length in the input. As the number of table columns
is apparently determined from the first five lines, whatever exceeds this
length is automatically put on the next line.
I am now trying to find a way to read in the data despite this. I have no
control over the table extent; the only thing that would make sense
according to my data would be to read in a fixed number of columns and merge
all remaining columns as a long string in the last one. No idea how to do
this, though.

Thanks
Martin
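
One possible way to do what is asked above (a fixed number of leading columns, with everything else merged into one trailing string), sketched with readLines() and strsplit(). The sample lines, the choice of k = 3 fixed columns and the column name `tags` are assumptions for illustration, not taken from the real files:

```r
# Made-up input lines with a variable number of ";"-separated fields
lines <- c("37;2175168475;13;sculpture;bird;tourism",
           "38;123;7;lake",
           "39;456;2")
k <- 3   # number of fixed leading columns (assumed)

parts <- strsplit(lines, ";", fixed = TRUE)

# First k fields of every line, one row per line
fixed <- t(sapply(parts, function(p) p[seq_len(k)]))

# Everything after field k, glued back together with the separator
rest <- sapply(parts, function(p) paste(p[-seq_len(k)], collapse = ";"))

dat <- data.frame(fixed, tags = rest, stringsAsFactors = FALSE)
```

For a real file, `lines` would come from readLines(inputFile). The fixed columns come back as character and would still need converting with as.numeric() where appropriate.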


jim holtman wrote:

  

It is still not clear to me exactly how you want to read the lines
in.  If
the lines have a variable number of fields, and some of the lines
might be
wrapped, is there some way to determine where the start of each line
is.
 If you are reading them in with read.csv, then the system is
 assuming
that each line starts a new row.  If this is not the case, then you
will
have to state the rules that determine where the lines start.  You
can
always read the data in with 'scan' to separate each line and then do
whatever processing is required to put together the rows in a data
frame
that you want.
 In one of your examples, you indicated that the line was split
 starting
at the word kempten; if this is in the middle of the line, then you
would
have to create the break after reading the line in with 'scan' and
then
creating the rows in the dataframe.  All of this can be done in R if
you can
state what the criteria is.
On Sat, May 30, 2009 at 4:32 AM, Martin Tomko martin.to...@geo.uzh.ch wrote:

   Jim,
   the two lines I put in are the actual problematic input lines.
   In these examples, there are no quotes nor # signs, although I
   have no means to make sure they do not occur in the inputs (any
   hints how I could deal with that?).
   I am trying to avoid as much pre-processing outside R as possible,
   and I have to process about 500 files with up to 3000 records
   each, so I need a more or less automated/batch solution. - so any
   string substitution will have to occur in R. But for the moment, I
   do not see a reason for substitution, and the wrapping still
   occurs.

   Cheers
   Martin



   jim holtman wrote:

   You need to supply the actual input line so we can see what is
   happening.  Are you sure you do not have unbalanced quotes in
   your input (try quote='') or do you have comment characters
   (#) in your input?

   On Fri, May 29, 2009 at 3:15 PM, Martin Tomko
   martin.to...@geo.uzh.ch wrote:

  Dear All,
  I am observing a strange behavior and searching the
   archives and
  help pages didn't help much.
  I have a csv with a variable number of fields in each line.

  I use
  dataPoints <- read.csv(inputFile, head=FALSE, sep=";", fill=TRUE);

  to read it in, and it works. But - some lines are long and
   'wrap',
  or split and continue on the next line. So when I check the
   dim of
  the frame, they are not correct and I can see when I do a
   printout
  that the lines is split into two in the frame. I checked
   the input
  file and all is good.

  an example of the input is:
37;2175168475;13;8.522729;47.19537;16366...@n00;30;sculpture;bird;tourism;animal;statue;canon;eos;rebel;schweiz;switzerland;eagle;swiss;adler;skulptur;zug;1750;28;tamron;f28;canton;tourismus;vogel;baar;kanton;xti;tamron1750;1750mm;tamron1750mm;400d;rabbitriotnet;

  where the last values occurs on the next line in the data
   frame.

  It does not have to be the last value, as in the following
   example,
  the

Re: [R] IP-Address

2009-05-31 Thread Wacek Kusnierczyk

wow! :)

vQ

Henrik Bengtsson wrote:
 library(gsubfn)
 library(gtools)
 library(rbenchmark)

 n <- 1
 df <- data.frame(
   a = rnorm(n),
   b = rnorm(n),
   c = rnorm(n),
   ip = replicate(n, paste(sample(255, 4), collapse='.'), simplify=TRUE)
 )

 res <- benchmark(columns=c('test', 'elapsed'), replications=10, order=NULL,
   peda = {
     connection <- textConnection(as.character(df$ip))
     o <- do.call(order, read.table(connection, sep='.'))
     close(connection)
     df[o, ]
   },

   peda2 = {
     connection <- textConnection(as.character(df$ip))
     dfT <- read.table(connection, sep='.', colClasses=rep("integer", 4),
                       quote="", na.strings=NULL, blank.lines.skip=FALSE)
     close(connection)
     o <- do.call(order, dfT)
     df[o, ]
   },

   hb = {
     ip <- strsplit(as.character(df$ip), split=".", fixed=TRUE)
     ip <- unlist(ip, use.names=FALSE)
     ip <- as.integer(ip)
     dim(ip) <- c(4, nrow(df))
     ip <- 256^3*ip[1,] + 256^2*ip[2,] + 256*ip[3,] + ip[4,]
     o <- order(ip)
     df[o, ]
   },

   hb2 = {
     ip <- strsplit(as.character(df$ip), split=".", fixed=TRUE)
     ip <- unlist(ip, use.names=FALSE)
     ip <- as.integer(ip);
     dim(ip) <- c(4, nrow(df))
     o <- sort.list(ip[4,], method="radix", na.last=TRUE)
     for (kk in 3:1) {
       o <- o[sort.list(ip[kk,o], method="radix", na.last=TRUE)]
     }
     df[o, ]
   }
 )

 print(res)

    test elapsed
 1  peda    4.12
 2 peda2    4.08
 3    hb    0.28
 4   hb2    0.25


[R] warning message when running quantile regression

2009-05-31 Thread jude.ryan

Hi All,

 

I am running quantile regression in a for loop starting with 1
variable and adding a variable at a time reaching a maximum of 20
variables.

I get the following warning messages after my for loop runs. Should I
be concerned about these messages? I am building predictive models and
am not interested in inference.

 

Warning messages:

1: In summary.rq(quantreg.emaff) : 3 non-positive fis   - I don't
understand this message - is this a cause for concern?

2: In summary.rq(quantreg.emaff) : 3 non-positive fis

3: In summary.rq(quantreg.emaff) : 5 non-positive fis

4: In rq.fit.br(x, y, tau = tau, ...) : Solution may be nonunique

5: In summary.rq(quantreg.emaff) : 6 non-positive fis

6: In summary.rq(quantreg.emaff) : 5 non-positive fis

7: In summary.rq(quantreg.emaff) : 5 non-positive fis

8: In summary.rq(quantreg.emaff) : 7 non-positive fis

9: In summary.rq(quantreg.emaff) : 10 non-positive fis

10: In summary.rq(quantreg.emaff) : 9 non-positive fis

11: In summary.rq(quantreg.emaff) : 8 non-positive fis

12: In summary.rq(quantreg.emaff) : 9 non-positive fis

13: In summary.rq(quantreg.emaff) : 8 non-positive fis

14: In summary.rq(quantreg.emaff) : 11 non-positive fis

 

I understand the non-unique solution message.

 

Thanks in advance,

 

Jude Ryan

 

___
Jude Ryan
Director, Client Analytical Services
Strategy & Business Development
UBS Financial Services Inc.
1200 Harbor Boulevard, 4th Floor
Weehawken, NJ 07086-6791
Tel. 201-352-1935
Fax 201-272-2914
Email: jude.r...@ubs.com




[R] Problem with reshaping of data

2009-05-31 Thread Christian Schmitt


Hi,



i have to reshape a dataset. my data have the following format:



ID; x1; x2; x3; x4; v1; ... v20

1; 0.1; 0.3; 0.4; 0.2; 2; ... 3

2; 0.3; 0.7; 0.1; 0.2; 1; ... 4
...
999; 0.9; 0.6; 0.3; 0.1; 4; ... 2
1000; 0.2; 0.6; 0.7; 0.8; 1; ... 5



ID is the number of persons (here 1000 persons)

x are descriptive variables

and v1-v20 are values of satisfaction on holidays for 20 days



and now I should reshape the data such that each record contains only one 
participant on one day



could somebody help me? i don't know how to do this?!



thank you


[R] Error:non-numeric argument in my function

2009-05-31 Thread Grześ


Hello!
I have a function:

zywnoscCalosc <- function( jedzenie, n1, n2, n3, n4, d1, d2, d3, d4 ) { 

ndf <- data.frame(nn1=n1,nn2=n2,nn3=n3,nn4=n4)
ddf <- data.frame(dd1=d1,dd2=d2,dd3=d3,dd4=d4)
for (i in 1:length(n1)){

wekt_n = ndf[i,]

wekt_n_ok = wekt_n[!is.na(wekt_n)]


dl_n = length(wekt_n_ok)
wynik = (1*wekt_n_ok)/(1*dl_n)
}
}

and I get an error like this:
Error in 1 * wekt_n_ok : non-numeric argument to binary operator

Anybody can help me?

Re: [R] Error:non-numeric argument in my function

2009-05-31 Thread jim holtman

Message is very clear:

> 1 * 'a'
Error in 1 * "a" : non-numeric argument to binary operator

wekt_n_ok must be non-numeric.  You need to provide a reproducible
script.  You also need to learn about debugging.  'str(wekt_n_ok)' when the
error occurred may have helped to pinpoint the problem.

On Sun, May 31, 2009 at 5:43 PM, Grześ gregori...@gmail.com wrote:


 Hello!
 I have a function:

 zywnoscCalosc <- function( jedzenie, n1, n2, n3, n4, d1, d2, d3, d4 ) {

 ndf <- data.frame(nn1=n1,nn2=n2,nn3=n3,nn4=n4)
 ddf <- data.frame(dd1=d1,dd2=d2,dd3=d3,dd4=d4)
 for (i in 1:length(n1)){

 wekt_n = ndf[i,]

 wekt_n_ok = wekt_n[!is.na(wekt_n)]


 dl_n = length(wekt_n_ok)
 wynik = (1*wekt_n_ok)/(1*dl_n)
 }
 }

 and I get an error like this:
 Error in 1 * wekt_n_ok : non-numeric argument to binary operator

 Anybody can help me?




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] Problem with reshaping of data

2009-05-31 Thread jim holtman

> x <- data.frame(id=1:100, x1=sample(100), x2=sample(100), s1=1, s2=2,
+     s3=3, s4=4)
> require(reshape)
> z <- melt(x, id=1:3)  # first three columns are ids
> head(z,10)  # should be what you want
   id x1 x2 variable value
1   1 97 43       s1     1
2   2 89  5       s1     1
3   3 59 97       s1     1
4   4 91 86       s1     1
5   5 11 73       s1     1
6   6 27 79       s1     1
7   7 19 85       s1     1
8   8 24 26       s1     1
9   9 10 30       s1     1
10 10 38 67       s1     1



On Sun, May 31, 2009 at 2:06 PM, Christian Schmitt 
christian_...@hotmail.com wrote:


 Hi,



 i have to reshape a dataset. my data have the following format:



 ID; x1; x2; x3; x4; v1; ... v20

 1; 0.1; 0.3; 0.4; 0.2; 2; ... 3

 2; 0.3; 0.7; 0.1; 0.2; 1; ... 4
 ...
 999; 0.9; 0.6; 0.3; 0.1; 4; ... 2
 1000; 0.2; 0.6; 0.7; 0.8; 1; ... 5



 ID is the number of persons (here 1000 persons)

 x are descriptive variables

 and v1-v20 are values of satisfaction on holidays for 20 days



 and now i should reshape the data such that each record contains only one
 participiant on one day



 could somebody help me? i don't know how to do this?!



 thank you





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] Problem with reshaping of data

2009-05-31 Thread jim holtman

a shorter example:

> n <- 4
> x <- data.frame(id=1:n, x1=sample(n), x2=sample(n), s1=1, s2=2, s3=3,
+     s4=4)
> require(reshape)
> z <- melt(x, id=1:3)  # first three columns are ids
> x
  id x1 x2 s1 s2 s3 s4
1  1  3  1  1  2  3  4
2  2  4  2  1  2  3  4
3  3  1  4  1  2  3  4
4  4  2  3  1  2  3  4
> z  # should be what you want
   id x1 x2 variable value
1   1  3  1       s1     1
2   2  4  2       s1     1
3   3  1  4       s1     1
4   4  2  3       s1     1
5   1  3  1       s2     2
6   2  4  2       s2     2
7   3  1  4       s2     2
8   4  2  3       s2     2
9   1  3  1       s3     3
10  2  4  2       s3     3
11  3  1  4       s3     3
12  4  2  3       s3     3
13  1  3  1       s4     4
14  2  4  2       s4     4
15  3  1  4       s4     4
16  4  2  3       s4     4
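
If installing the reshape package is not an option, base R's reshape() can produce the same long layout. This is only a sketch on a made-up three-person, two-day data set; the column names `day` and `satisfaction` are my own choice, not from the original question:

```r
# Made-up wide data: one row per person, v1..v2 are the daily scores
wide <- data.frame(ID = 1:3,
                   x1 = c(0.1, 0.3, 0.9),
                   v1 = c(2, 1, 4),
                   v2 = c(3, 4, 2))

# One row per person per day
long <- reshape(wide, direction = "long",
                varying = c("v1", "v2"), v.names = "satisfaction",
                timevar = "day", times = 1:2, idvar = "ID")

# Sort so that each person's days sit together
long <- long[order(long$ID, long$day), ]
```

With the real data, `varying` would be paste("v", 1:20, sep="") and `times` would be 1:20.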



On Sun, May 31, 2009 at 2:06 PM, Christian Schmitt 
christian_...@hotmail.com wrote:


 Hi,



 i have to reshape a dataset. my data have the following format:



 ID; x1; x2; x3; x4; v1; ... v20

 1; 0.1; 0.3; 0.4; 0.2; 2; ... 3

 2; 0.3; 0.7; 0.1; 0.2; 1; ... 4
 ...
 999; 0.9; 0.6; 0.3; 0.1; 4; ... 2
 1000; 0.2; 0.6; 0.7; 0.8; 1; ... 5



 ID is the number of persons (here 1000 persons)

 x are descriptive variables

 and v1-v20 are values of satisfaction on holidays for 20 days



 and now i should reshape the data such that each record contains only one
 participiant on one day



 could somebody help me? i don't know how to do this?!



 thank you





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] IP-Address

2009-05-31 Thread Henrik Bengtsson

Not really, just the old saying that any piece of code can be made
twice as fast (which often holds true recursively). /Henrik

On Sun, May 31, 2009 at 1:58 PM, Wacek Kusnierczyk
waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
 wow! :)

 vQ

 Henrik Bengtsson wrote:
 library(gsubfn)
 library(gtools)
 library(rbenchmark)

 n <- 1
 df <- data.frame(
   a = rnorm(n),
   b = rnorm(n),
   c = rnorm(n),
   ip = replicate(n, paste(sample(255, 4), collapse='.'), simplify=TRUE)
 )

 res <- benchmark(columns=c('test', 'elapsed'), replications=10, order=NULL,
   peda = {
     connection <- textConnection(as.character(df$ip))
     o <- do.call(order, read.table(connection, sep='.'))
     close(connection)
     df[o, ]
   },

   peda2 = {
     connection <- textConnection(as.character(df$ip))
     dfT <- read.table(connection, sep='.', colClasses=rep("integer", 4),
                       quote="", na.strings=NULL, blank.lines.skip=FALSE)
     close(connection)
     o <- do.call(order, dfT)
     df[o, ]
   },

   hb = {
     ip <- strsplit(as.character(df$ip), split=".", fixed=TRUE)
     ip <- unlist(ip, use.names=FALSE)
     ip <- as.integer(ip)
     dim(ip) <- c(4, nrow(df))
     ip <- 256^3*ip[1,] + 256^2*ip[2,] + 256*ip[3,] + ip[4,]
     o <- order(ip)
     df[o, ]
   },

   hb2 = {
     ip <- strsplit(as.character(df$ip), split=".", fixed=TRUE)
     ip <- unlist(ip, use.names=FALSE)
     ip <- as.integer(ip);
     dim(ip) <- c(4, nrow(df))
     o <- sort.list(ip[4,], method="radix", na.last=TRUE)
     for (kk in 3:1) {
       o <- o[sort.list(ip[kk,o], method="radix", na.last=TRUE)]
     }
     df[o, ]
   }
 )

 print(res)

    test elapsed
 1  peda    4.12
 2 peda2    4.08
 3    hb    0.28
 4   hb2    0.25



Re: [R] How to set a filter during reading tables

2009-05-31 Thread Juliet Hannah

There are several things you can tell read.table to make it faster.

First, as mentioned, setting colClasses helps. I think telling read.table how
many rows and columns there are also helps.

When this was not sufficient,  I've had to do the data processing
using Python, Perl, or awk.

If that had not been convenient I would have tried the sqldf solution that was
mentioned.

That covers all the options I'm familiar with. I'm also curious about other ways
to selectively read in rows in R. Let me know what ends up working.
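
One way to filter rows inside R without loading the whole table is to read the file through a connection in chunks and keep only the lines that match. This is only a sketch: the file contents, the chunk size of 2 and the rule "keep rows whose first field is A" are all made up for illustration.

```r
# Made-up tab-separated file
tmp <- tempfile()
writeLines(c("id\tyear\tvalue",
             "A\t2008\t1.5",
             "B\t2009\t2.5",
             "A\t2009\t3.5",
             "C\t2009\t4.0"), tmp)

con <- file(tmp, open = "r")
header <- readLines(con, n = 1)       # keep the header line
keep <- character(0)
repeat {
  chunk <- readLines(con, n = 2)      # tiny chunks here; use thousands in practice
  if (length(chunk) == 0) break       # end of file
  keep <- c(keep, chunk[grepl("^A\t", chunk)])   # the filter
}
close(con)

# Parse only the lines that survived the filter
dat <- read.table(text = c(header, keep), header = TRUE, sep = "\t")
```

Only the matching lines are ever parsed by read.table(), so memory use depends on the chunk size and the number of rows kept rather than on the size of the file.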



On Sun, May 31, 2009 at 2:17 PM,  g...@ucalgary.ca wrote:
 Since there are many rows, using read.table we spent too much on reading
 in rows that we do not want. We are wondering if there is a way to read
 only rows that we are interested in. Thanks,

 -james
 I think you can use readLines(n=1) in loop to skip unwanted rows.

 On Mon, Jun 1, 2009 at 12:56 AM,  g...@ucalgary.ca wrote:
 Thanks, Juliet.
 It works for filtering columns.
 I am also wondering if there is a way to filter rows.
 Thanks again.
 -james

 One can use colClasses to set which columns get read in. For the
 columns you don't
 want you can set those to NULL. For example,

 cc <- c("NULL",rep("numeric",9))

 myData <-
 read.table("myFile.txt",header=TRUE,colClasses=cc,nrow=numRows).


 On Wed, May 27, 2009 at 12:27 PM,  g...@ucalgary.ca wrote:
 We are reading big tables, such as,

 Chemicals <-
 read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals',header
 = TRUE, sep = '\t', as.is =T)

 I was wondering if it is possible to set a filter during loading so
 that
 we just load what we want not the whole table each time. Thanks,

 -james


Re: [R] 'options=utils::recover' not working in .Rprofile or within R

2009-05-31 Thread David Winsemius



On May 31, 2009, at 12:22 AM, Duncan Murdoch wrote:


David Winsemius wrote:

You are wiping out all of the default options with that approach.



Actually, I think it hid the options() function.


Since I was doing this at the RGui and became concerned that it  
appeared I no longer had any options, I restarted.  Could I have saved  
time by just executing rm(options)?




Try (after restarting R to get the other options back to what they   
should be):


op=options()   # so you can reset back to baseline
options(error=utils::recover)  # do not think the utils:: is needed



Not if you run it in the console, but it is needed in .Rprofile.


Because it might be executed before the loading of the default packages?

--
David

 Saving the old option could be done as

olderror - options(error=utils::recover)

Duncan Murdoch
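
A short sketch of the save-and-restore pattern mentioned above. options() returns the previous values of whatever it sets, so the result can be handed straight back to options() later. The variable names are mine:

```r
# Set the error handler and remember what was there before
olderror <- options(error = utils::recover)

# The option now holds the recover function itself
handler_is_function <- is.function(getOption("error"))

# ... debugging session ...

# Put the previous setting back
options(olderror)
restored <- identical(getOption("error"), olderror$error)
```

In .Rprofile the line would be options(error = utils::recover), with the utils:: prefix kept, as noted above.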

 my.func <- function(x){
  y <- x + 12
  nonsense
  y
  }

 my.func(14)
Error in my.func(14) : object nonsense not found

Enter a frame number, or 0 to exit

1: my.func(14)

Selection:



On May 30, 2009, at 10:24 PM, Mark Kimpel wrote:



Duncan,

I've pared down my .Rprofile so that it has just the options  
line,  started R
from terminal (instead of using ESS-emacs) and I still have the   
problem. Am
I specifying the options incorrectly? I believe I took this  
directly  from

the help page.



Not what the examples look like on my machine.



See my output of .Rprofile, the code example that doesn't
work as we think it ought, and my sessionInfo().  Thanks, Mark

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.



read.table("~/.Rprofile")


V1
1 options=utils::recover


my.func <- function(x){
+ y <- x + 12
+ nonsense
+ y
+ }


my.func(14)


Error in my.func(14) : object 'nonsense' not found


sessionInfo()


R version 2.9.0 (2009-04-17)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C


attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base



David Winsemius, MD
Heritage Laboratories
West Hartford, CT





David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Error:non-numeric argument in my function

2009-05-31 Thread Stavros Macrakis

On Sun, May 31, 2009 at 6:10 PM, jim holtman jholt...@gmail.com wrote:

 Message is very clear:

  1 * 'a'
 Error in 1 * "a" : non-numeric argument to binary operator


Though the user should have been able to figure this out, perhaps the error
message could be improved? After all, it is not the fact that the operator
is *binary* that implies that its argument must be numeric, but that it is
*arithmetic*. The binary operator %in%, for example, takes non-numeric
arguments.

Suggested replacement error message:

 non-numeric argument to arithmetic operator

   -s


Re: [R] Error:non-numeric argument in my function

2009-05-31 Thread Grześ


Thanks jholtman!
But I'm not sure what and where I should change my code... :(
wekt_n = ndf[i,]
wekt_n_ok = wekt_n[!is.na(wekt_n)]
If before this line I should change wekt_n_ok  as numeric? but if I wrote 
as.numeric(wekt_n_ok) it's also not correct  

and  I also don't understand why wekt_n_ok is not numeric?


Re: [R] grid.edit() for ggplot2

2009-05-31 Thread Gabor Grothendieck

If you enter the code in this post first:
https://stat.ethz.ch/pipermail/r-help/2009-May/198791.html

then this post shows an example of how to do it with lattice:
https://stat.ethz.ch/pipermail/r-help/2009-May/199146.html

but I think there is a bug in grid since similar code does
not seem to work with your example of grid graphics
generated by ggplot2.
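
For reference, this is how editing a named child looks on a plain grid object (not ggplot2 output, which is where the trouble seems to be). The gTree and the names "frame" and "texts" are invented for the sketch; editGrob() works on the object itself, while grid.edit() does the same for a grob already drawn on the current device:

```r
library(grid)

# A small gTree with two named children (names are illustrative)
g <- gTree(children = gList(
  rectGrob(name = "frame"),
  textGrob("test", name = "texts")
), name = "demo")

# Change the colour of the child called "texts"
g2 <- editGrob(g, gPath("texts"), gp = gpar(col = "red"))

new_col <- getGrob(g2, "texts")$gp$col
```

After grid.draw(g), the equivalent on-screen edit would be grid.edit("texts", gp = gpar(col = "red")).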

On Sun, May 31, 2009 at 9:01 AM, baptiste auguie ba...@exeter.ac.uk wrote:
 Dear all,


 I'm trying to access and modify grobs in a ggplot2 plot. The basic idea for
 raw Grid objects I understand from Paul Murrell's R graphics book, or this
 page of examples,

 http://www.stat.auckland.ac.nz/~paul/grid/copygrob/copygrobs.R

 However I can't figure out how to apply this to a ggplot (basically I don't
 know how to write a syntactically correct gPath),


 p <- # minimal example
 qplot(0,0)+ annotate("text",0,0,label="test")

 g <- # store the plot as a grob
 ggplotGrob(p)

 # structure of the grob
 grid.ls(g) # rather large!

 # find a particular grob in the gTree
 getGrob(g,"texts", grep = T)


 # next step, modify, say, the colour of these grobs
 grid.edit() # what do I put in here?


 Thanks for any piece of advice,

 baptiste


Re: [R] Error:non-numeric argument in my function

2009-05-31 Thread jim holtman

What you need to do is to see what is in 'wekt_n_ok' at that point in your
program. You should be able to see it with str(wekt_n_ok) when the error
occurs. You can also add a print statement in the loop to print out the
value before it is used with the binary operator.

It would help if you provided 'str' of the various objects you are using, or
maybe some reproducible code. Somehow you are creating a non-numeric value
in your code; you need to debug it and check all the values before you use
them.
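
A sketch of the inspection suggested above, on made-up data. The contents of ndf here are an assumption (a data frame whose columns were read in as text), chosen only because it is one common way to end up with a non-numeric wekt_n_ok; the real data may differ.

```r
# Hypothetical data: numbers stored as text, as can happen after an import.
ndf <- data.frame(a = c("1.5", "2"), b = c("3", NA), stringsAsFactors = FALSE)

i <- 1
wekt_n <- ndf[i, ]                    # a one-row data frame, not a vector
wekt_n_ok <- wekt_n[!is.na(wekt_n)]   # logical-matrix index: goes through as.matrix()

str(wekt_n)      # shows the column types of the row
str(wekt_n_ok)   # shows what the binary operator will actually receive

# Convert only after checking what the values are.
wekt_num <- as.numeric(wekt_n_ok)
str(wekt_num)
```

If the columns turn out to be factors instead, as.numeric() on a factor returns the internal level codes, so as.numeric(as.character(x)) is the usual conversion.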

On Sun, May 31, 2009 at 7:05 PM, Grześ gregori...@gmail.com wrote:

Thanks jholtman!
But I'm not sure what I should change in my code, or where... :(
wekt_n = ndf[i,]
wekt_n_ok = wekt_n[!is.na(wekt_n)]
Should I convert wekt_n_ok to numeric before this line? But if I write
as.numeric(wekt_n_ok), that is also not correct.

I also don't understand why wekt_n_ok is not numeric.


--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


[R] (no subject)

2009-05-31 Thread Roslina Zakaria

Hi R-users,

I am trying to use the sn package, but it gives me the following message:

> install.packages(repos=NULL, pkgs="c:\\Tinn-R\\sn_0.4-12.zip")
Warning: package 'sn' is in use and will not be installed
updating HTML package descriptions

I tried saving the .zip file again a few times, but it gives me the same
error message.

Thank you so much for any help given.


  

Re: [R] (no subject)

2009-05-31 Thread Ronggui Huang

use _search()_ to see if the package is on the search path. If yes,
use _detach(package:sn,unload=TRUE)_ to detach it and then try to
install it again.
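
The two steps above can be sketched as follows. The helper name is_attached is made up for this example, and the file.exists() guard is an added precaution rather than part of the advice; the zip path is the one from the original post.

```r
pkg <- "sn"
zip_path <- "c:\\Tinn-R\\sn_0.4-12.zip"   # path taken from the original post

# Hypothetical helper: TRUE when "package:<pkg>" appears on the search path.
is_attached <- function(pkg) paste0("package:", pkg) %in% search()

# Detach and unload the package only if it is currently attached.
if (is_attached(pkg)) {
  detach(paste0("package:", pkg), character.only = TRUE, unload = TRUE)
}

# Only attempt the install where the file actually exists.
if (file.exists(zip_path)) {
  install.packages(zip_path, repos = NULL)
}
```

Starting a fresh R session before installing achieves the same thing if detaching does not release the package.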

Ronggui

2009/6/1 Roslina Zakaria zrosl...@yahoo.com:
 Hi R-users,

 I am trying to use the sn package, but it gives me the following message:

 > install.packages(repos=NULL, pkgs="c:\\Tinn-R\\sn_0.4-12.zip")
 Warning: package 'sn' is in use and will not be installed
 updating HTML package descriptions

 I tried saving the .zip file again a few times, but it gives me the same
 error message.

 Thank you so much for any help given.




-- 
HUANG Ronggui, Wincent
PhD Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html


Re: [R] 'options=utils::recover' not working in .Rprofile or within R

2009-05-31 Thread Mark Kimpel

options(error=utils::recover)

Does indeed work, at least with the new install of R-devel (to be 2.10.0)
that I am running right now. I was sure I checked this with 2.9.0 last
night, but I am probably mistaken.

One point: the ?options help page is misleading, in that the example reads
"Note that these need to specified as e.g. 'options=utils::recover' in
startup files such as '.Rprofile'."

Since the use of utils:: is a new requirement, I think stemming from when
utils is loaded, this help page should be corrected as the example is
confusing/incorrect.

So, stick with what is in the first line above and, for now, ignore the help
page.

Mark

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 399-1219  Home
Skype:  mkimpel

The real problem is not whether machines think but whether men do. -- B.
F. Skinner
**


On Sat, May 30, 2009 at 10:49 PM, David Winsemius dwinsem...@comcast.net wrote:

 You are wiping out all of the default options with that approach.

 Try (after restarting R to get the other options back to what they should
 be):

 op=options()   # so you can reset back to baseline
 options(error=utils::recover)  # do not think the utils:: is needed
 my.func <- function(x){
   y <- x + 12
   nonsense
   y
 }

  my.func(14)
 Error in my.func(14) : object nonsense not found

 Enter a frame number, or 0 to exit

 1: my.func(14)

 Selection:



 On May 30, 2009, at 10:24 PM, Mark Kimpel wrote:

  Duncan,

  I've pared down my .Rprofile so that it has just the options line, started
  R from terminal (instead of using ESS-emacs) and I still have the problem.
  Am I specifying the options incorrectly? I believe I took this directly
  from the help page.

 Not what the examples look like on my machine.

  See my output of .Rprofile, the code example that doesn't work as we
  think it ought, and my sessionInfo().  Thanks, Mark

 Type 'demo()' for some demos, 'help()' for on-line help, or
 'help.start()' for an HTML browser interface to help.
 Type 'q()' to quit R.

 > read.table("~/.Rprofile")
                       V1
 1 options=utils::recover

 > my.func <- function(x){
 + y <- x + 12
 + nonsense
 + y
 + }

 > my.func(14)
 Error in my.func(14) : object 'nonsense' not found

 > sessionInfo()

 R version 2.9.0 (2009-04-17)
 x86_64-unknown-linux-gnu

 locale:

 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base




 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT




Re: [R] 'options=utils::recover' not working in .Rprofile or within R

2009-05-31 Thread Duncan Murdoch


David Winsemius wrote:
 On May 31, 2009, at 12:22 AM, Duncan Murdoch wrote:
  David Winsemius wrote:
   You are wiping out all of the default options with that approach.

  Actually, I think it hid the options() function.

 Since I was doing this at the RGui and became concerned that it
 appeared I no longer had any options, I restarted.  Could I have saved
 time by just executing rm(options)?

Yes.

   Try (after restarting R to get the other options back to what they
   should be):

   op=options()   # so you can reset back to baseline
   options(error=utils::recover)  # do not think the utils:: is needed

  Not if you run it in the console, but it is needed in .Rprofile.

 Because it might be executed before the loading of the default packages?

I think ?Startup documents it to do just that.  I didn't check the 
actual source or test it, but usually the docs are right, even if 
sometimes they aren't.
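
The difference between the two forms discussed in this thread can be sketched in a few lines. The variable names masked, old and handler_is_function are made up for the example; the rest uses only base R.

```r
# A plain assignment creates a variable called `options`; it sets no option.
options = utils::recover
masked <- exists("options", envir = environment(), inherits = FALSE)

# Removing the variable makes the base function visible again.
rm(options)

# Setting the error handler: pass it as a named argument to options().
old <- options(error = utils::recover)
handler_is_function <- is.function(getOption("error"))

# Restore the previous settings.
options(old)
```

In a startup file such as .Rprofile the line would be options(error = utils::recover), keeping the utils:: prefix for the reason given above.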


Duncan Murdoch


Re: [R] 'options=utils::recover' not working in .Rprofile or within R

2009-05-31 Thread Duncan Murdoch


Mark Kimpel wrote:

options(error=utils::recover)

Does indeed work, at least with the new install of R-devel (to be 2.10.0)
that I am running right now. I was sure I checked this with 2.9.0 last
night, but I am probably mistaken.

One point: the ?options help page is misleading, in that the example reads
"Note that these need to specified as e.g. 'options=utils::recover' in
startup files such as '.Rprofile'."



Yes, thanks, I'll fix that.

Duncan Murdoch


[R] Sweave:Figures from plot (LME output) not getting generated (pdf or eps)

2009-05-31 Thread Girish A.R.

Hi,

I seem to be facing a strange problem when I use Sweave for creating a
LaTeX document of the R lme() output --- The EPS and PDF figure files
get created, but are empty. I have attached a reproducible example
below (taken from the R lme() help example).


\documentclass[a4paper,10pt]{article}
\usepackage{Sweave}
\SweaveOpts{keep.source=TRUE}
\begin{document}

<<>>=
library(nlme)
fm1 <- lme(distance ~ age, Orthodont, random = ~ age | Subject)
@

<<fig=TRUE>>=
plot(fm1, distance ~ fitted(.) | Subject, abline = c(0,1))
@

\end{document}



I don't seem to face this problem while plotting other objects. Any
help is appreciated.
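
One thing worth checking: plot() on an lme fit returns a lattice object of class "trellis" rather than drawing directly, and such objects are only drawn when they are printed. Whether that is the cause of the empty files here is an assumption, not something confirmed in this thread. A minimal sketch outside Sweave, with a temporary PDF file as an illustrative target:

```r
library(nlme)

fm1 <- lme(distance ~ age, Orthodont, random = ~ age | Subject)

# plot() on an lme fit builds a lattice object; nothing is drawn until it is printed.
p <- plot(fm1, distance ~ fitted(.) | Subject, abline = c(0, 1))
class(p)

# Explicit print() onto a PDF device (temporary file used as an illustrative target).
out_file <- file.path(tempdir(), "fm1-fitted.pdf")
pdf(out_file)
print(p)
dev.off()
file.info(out_file)$size
```

Inside a figure chunk the equivalent would be to wrap the call as print(plot(fm1, ...)).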

Thanks,

-Girish


===
> sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] grid      splines   stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] RWinEdt_1.8-1      coda_0.13-4        verification_1.29  CircStats_0.2-3
 [5] boot_1.2-36        fields_5.02        spam_0.15-4        waveslim_1.6.1
 [9] lmtest_0.9-23      zoo_1.5-5          psychometric_2.1   multilevel_2.3
[13] MASS_7.2-46        nlme_3.1-92        languageR_0.953    lme4_0.999375-31
[17] Matrix_0.999375-26 zipfR_0.6-5        lattice_0.17-22    gplots_2.7.0
[21] caTools_1.9        bitops_1.0-4.1     gdata_2.4.2        gtools_2.5.0-1
[25] gmodels_2.14.1     ggplot2_0.8.3      reshape_0.8.3      plyr_0.1.8
[29] proto_0.3-8        doBy_3.9           foreign_0.8-34     car_1.2-14
[33] Design_2.1-2       survival_2.35-4    Hmisc_3.6-0

loaded via a namespace (and not attached):
[1] cluster_1.12.0   Formula_0.1-3    kinship_1.1.0-22 plm_1.1-2        sandwich_2.2-1
[6] tools_2.9.0


Re: [R] Bug in truncgof package?

2009-05-31 Thread Duncan Murdoch


Carlos J. Gil Bellosta wrote:

Dear R-helpers,

I was testing the truncgof CRAN package, found something that looked
like a bug, and did my job: contacted the maintainer. But he did not
reply, so I am resending my query here.

I installed package truncgof and run the example for function ad.test. I
got the following output:

set.seed(123)
treshold <- 10
xc <- rlnorm(100, 2, 2)      # complete sample
xt <- xc[xc >= treshold]     # left truncated sample
ad.test(xt, "plnorm", list(meanlog = 2, sdlog = 2), H = 10)


Supremum Class Anderson-Darling Test

data:  xt 
AD = 3.124, p-value = 0.12
alternative hypothesis: two.sided 


treshold = 10, simulations: 100


So I cannot reject the hypothesis (at a standard significance level) that
the original sample comes from a lognormal distribution (as is the
case).

But let us try to iterate on this example:

set.seed( 123 )
treshold <- 10

foo <- function(){
  xc <- rlnorm(100, 2, 2)    # complete sample
  xt <- xc[xc >= treshold]   # left truncated sample
  ks.test(xt, "plnorm", list(meanlog = 2, sdlog = 2), H = 10)$p.value
}

results <- replicate( 100, foo() )


Then:

> table( results )
results
   0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09  0.1 0.11 0.16 0.18 0.19  0.2
  25    7    9    3    1    2    3    4    1    1    2    2    1    1    3    2
0.21 0.22 0.26 0.27 0.28  0.3 0.31 0.32 0.33 0.36 0.38  0.4 0.44 0.49 0.54 0.55
   2    2    1    3    1    2    1    1    1    2    1    2    1    1    2    1
0.56 0.57 0.62  0.7 0.76 0.78 0.96 0.98
   1    2    1    1    1    1    1    1


That is, in 45% of the cases you would reject the H_0 hypothesis,
which happens to be true, at the standard 5% significance level.
  


It looks to me as though the test as implemented is not very good.  This
could be an implementation bug, but it could also be a limitation of the
test itself.  I don't know the theory underlying this particular test,
but one way to determine whether it is an implementation bug is to
implement the test carefully yourself and see whether you get the same answer.
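
As a rough reference point for what a calibrated test looks like under a true null, the same kind of replication can be run with the base-R Kolmogorov-Smirnov test on complete (untruncated) samples. This is a sketch under stated assumptions: it uses stats::ks.test, not the truncgof implementation, and the helper name one_p and the 1000 replications are choices made for the example.

```r
set.seed(123)

# p-value of the base-R KS test on one complete (untruncated) lognormal sample
one_p <- function() {
  x <- rlnorm(100, meanlog = 2, sdlog = 2)
  stats::ks.test(x, "plnorm", meanlog = 2, sdlog = 2)$p.value
}

p_values <- replicate(1000, one_p())

# Under a true null the p-values should be roughly uniform,
# so about 5% of them should fall below 0.05.
rejection_rate <- mean(p_values < 0.05)
print(rejection_rate)
```

A rejection rate far above the nominal level in the same kind of experiment, as in the table above, is what points to a problem with the test or its implementation.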


Duncan Murdoch


Do you think this behaviour is buggy? If so, given that the maintainer
does not seem to be contactable, what would be the next step to take?

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

