Re: [R] radial.plot(plotrix) - plotting multiple polygons?

2009-01-16 Thread Jim Lemon

Stefan Uhmann wrote:

Dear List, Dear Jim,

is it possible to draw multiple polygons with different line types? 
lty=c or line.lty=c do not work with radial.plot (in the matrix case) 
as well as add=TRUE.

Hi Stefan,
As this is the second request for multiple line characteristics this 
week, I suppose I should get to work on it. I'll see if I can solve it 
without breaking any code that is already written using radial.plot.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Value Lookup from File without Slurping

2009-01-16 Thread Gundala Viswanath
Dear all,

I have a repository file (let's call it repo.txt)
that contains two columns like this:

# tag  value
AAA    0.2
AAT    0.3
AAC    0.02
AAG    0.02
ATA    0.3
ATT    0.7

Given another query vector

 qr <- c("AAC", "ATT")

I would like to find the corresponding value for each query above,
yielding:

0.02
0.7

However, I want to avoid slurping the whole of repo.txt into an object (e.g. a hash).
Is there any way to do that?

The reason is that repo.txt is very, very large
(millions of lines,
with tag length  30 bp), and my PC's memory is too small to hold it.

- Gundala Viswanath
Jakarta - Indonesia



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Carlos J. Gil Bellosta
Hello,

You can always store your repo.txt into a database, say, SQLite, and
select only the values you want via an SQL query.

Thus, you will prevent loading the full file into memory.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Wacek Kusnierczyk
you might try to iteratively read a limited number of lines in a
batch using readLines:

# filename, the name of your file
# n, the maximal count of lines to read in a batch
connection = file(filename, open="rt")
while (length(lines <- readLines(con=connection, n=n))) {
   # do your stuff here
}
close(connection)

?file
?readLines
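
A minimal sketch of how the loop body might be filled in for the lookup in the original question. It assumes the file is whitespace-delimited with the tag in the first column and the value in the second, and that lines starting with # are comments. The function name lookup_values and the default batch size are made up for illustration.

```r
# Hypothetical helper: look up values for a few tags while holding
# at most `n` lines of the file in memory at any time.
lookup_values <- function(filename, query, n = 10000L) {
  found <- setNames(rep(NA_real_, length(query)), query)
  connection <- file(filename, open = "rt")
  on.exit(close(connection))
  while (length(lines <- readLines(con = connection, n = n)) > 0) {
    # drop blank lines and comment lines such as "# tag  value"
    lines <- lines[!grepl("^[[:space:]]*(#|$)", lines)]
    if (length(lines) == 0) next
    fields <- strsplit(trimws(lines), "[[:space:]]+")
    tags   <- vapply(fields, `[`, character(1), 1L)
    values <- vapply(fields, `[`, character(1), 2L)
    hit <- tags %in% query
    # assign by name; if a tag occurs more than once, the last one wins
    found[tags[hit]] <- as.numeric(values[hit])
  }
  found
}
```

Calling lookup_values("repo.txt", c("AAC", "ATT")) would return a named numeric vector, with NA for any tag that is not found. The whole file is still scanned once, so this trades time for memory.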

vQ




[R] reading data from Excel Spread sheet

2009-01-16 Thread venkata kirankumar
Hi all,

I tried to read data from an Excel spreadsheet using

read.csv(file.choose())
and
read.delim(file.choose())
but it shows ÐÏ.à.

I also tried
read.table(file.choose())

and then it shows
  V1
1 ÐÏ\021ࡱ


Can anyone suggest how to read data from an Excel spreadsheet?

Thanks and regards,

kiran



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Gabor Grothendieck
The sqldf package can read a large file into a database without going
through R and then extract just the part you need.  The package makes it
easier to use RSQLite by setting up the database for you and removing it
automatically after the portion you want has been extracted.  You can
specify all this in two lines: one to name the file and one to specify
the extraction using SQL. See Example 6 on the home page:

http://sqldf.googlecode.com#Example_6._File_Input



Re: [R] reading data from Excel Spread sheet

2009-01-16 Thread Thomas Schwander

Hi,

http://tolstoy.newcastle.edu.au/R/e2/help/06/12/6702.html

Cheers,
Thomas



Re: [R] reading data from Excel Spread sheet

2009-01-16 Thread Prof Brian Ripley
See the 'R Data Import/Export' manual (and please study the posting 
guide, which asked you to check the manuals before posting).
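
The characters shown in the question (ÐÏ, \021, ࡱ) are consistent with the first bytes of a legacy binary .xls workbook, which would explain why text readers such as read.csv cannot parse the file. A small base-R check along these lines can confirm that before choosing an import route. This is only a sketch, and the function name is_ole2_file is hypothetical.

```r
# Hypothetical helper: TRUE if the file starts with the OLE2 compound-document
# signature used by legacy binary .xls workbooks.
is_ole2_file <- function(path) {
  ole2_magic <- as.raw(c(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1))
  header <- readBin(path, what = "raw", n = length(ole2_magic))
  identical(header, ole2_magic)
}
```

If is_ole2_file(file.choose()) returns TRUE, the file is binary rather than delimited text. One simple route is to save the sheet as CSV from Excel and then use read.csv; the 'R Data Import/Export' manual describes the other options.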



--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] faster version of split()?

2009-01-16 Thread Simon Pickett

Hi all,

I want to calculate the number of unique observations of y in each level 
of x from my data frame df.


This does the job, but it is very slow for this big data frame (159503 rows, 
11 columns).


group.list <- split(df$y, df$x)
count <- function(x) length(unique(na.omit(x)))
sapply(group.list, count, USE.NAMES=TRUE)

I couldn't find the answer by searching for "slow split" and "split time" on 
the help forum.


I am running R version 2.2.1 on a machine with 4 GB of memory, and I'm using 
Windows 2000.


thanks in advance,

Simon.









Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread r...@quantide.com

Something like this should work:

library(R.utils)
out = numeric()
qr = c("AAC", "ATT")
n = countLines("test.txt")      # number of lines in the file
file = file("test.txt", "r")
for (i in 1:n){
    line = readLines(file, n = 1)
    # assumes a single space between tag and value
    A = strsplit(line, split = " ")[[1]][1]
    if(is.element(A, qr)) {
        value = as.numeric(strsplit(line, split = " ")[[1]][2])
        out = c(out, value)
    }
}
close(file)

You may want to improve execution speed by reading the data in chunks 
instead of line by line; the code would need only a little modification.







[R] function return output

2009-01-16 Thread threshold

Hi, I wrote a function which outputs a matrix 'c' and a single value 'd',
as follows (simplified example):
procedure <- function(a,b){
...
list(c,d)
}
Now I want to use 'c' and 'd' in code as follows:
d <- matrix(0,1,1)
value <- procedure(a,b)
and d[1,1] <- value[2] breaks, telling me that:
Error in d[1, 1] : incorrect number of dimensions
What did I do wrong? Best, robert
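
The exact cause of the error cannot be confirmed from the simplified post. Two things are worth checking, though: value[2] returns a list of length one rather than the number itself, and the message suggests that d was not a matrix at the point of assignment. The sketch below uses an invented function body to show the list indexing and a named return value.

```r
# Simplified stand-in for the poster's function; the body is invented
# purely for illustration.
procedure <- function(a, b) {
  c_mat <- matrix(a + b, 2, 2)   # a matrix result
  d_val <- a * b                 # a single value
  list(c = c_mat, d = d_val)     # name the elements of the returned list
}

value <- procedure(2, 3)
d <- matrix(0, 1, 1)
d[1, 1] <- value[[2]]   # `[[` extracts the number itself; value$d works too
```

With single brackets, value[2] is still a list, so assigning it into a numeric matrix does not give a plain number. Naming the elements also avoids relying on their position.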





Re: [R] faster version of split()?

2009-01-16 Thread Henrique Dallazuanna
Maybe:

with(df, tapply(y, x, count))
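
A small sketch on made-up data, with count defined as in the question. The second variant is an alternative that first drops duplicated (x, y) pairs and then tabulates x, so no function is called per group. Whether either is faster on the poster's data has not been tested here.

```r
# Made-up example data; `df`, `x` and `y` follow the names in the question.
df <- data.frame(x = c("a", "a", "a", "b", "b", "c"),
                 y = c(1, 1, 2, 3, NA, NA))

count <- function(v) length(unique(na.omit(v)))

# Variant 1: one call, no intermediate list from split()
res_tapply <- with(df, tapply(y, x, count))

# Variant 2: drop NA rows and duplicated (x, y) pairs, then tabulate x
ok <- !is.na(df$y)
pairs <- unique(df[ok, c("x", "y")])
res_table <- table(factor(pairs$x, levels = sort(unique(df$x))))
```

Both results give one count per level of x, including a zero for levels where every y is NA.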






-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



[R] Smooth periodic splines

2009-01-16 Thread cmr.p...@gmail.com
Hello group!

Is there a package for fitting smooth *periodic* splines to
data? I'm interested in a function which combines the functionality of
smooth.spline and splines::periodicSpline.

Thanks,
Andrey



Re: [R] faster version of split()?

2009-01-16 Thread r...@quantide.com
df = data.frame(x = sample(7:9, 100, replace = TRUE),
                y = sample(1:5, 100, replace = TRUE))

fun = function(x){length(unique(x))}
# number of unique values of y within each level of x
by(df$y, df$x, fun)




[R] Fitting of lognormal distribution to lower tail experimental data

2009-01-16 Thread Mattias Brännström
Hi,

I am a beginner with R and need firm guidance with my problem. I have seen
some other threads discussing the subject of right-censored data, but I am
not sure whether this problem can be regarded as such.

Data:
I have a vector with laboratory test data (strength of wood specimens,
example attached as txt-file). This data is the full sample. It is a
common view that this kind of data follows a lognormal distribution.

Background:
When fitting a distribution to the lower tail, it will usually be very
different compared to fitting the whole data. The lower tail COV is the
decisive measure in my analysis (due to resistance estimations of
buildings).

Problem:
I would like to fit a lognormal distribution to the 10%-lower tail of the
attached data.

Question:
Which function would you recommend, and how should I formulate it in R
using the attached data?


Best regards,
Mattias Brännström

PhD student
Luleå Technical University

14.8104060497551
16.0803849713769
17.4968110450967
20.1117010953086
21.6845345194074
22.1964808750696
22.6873957765245
22.7739430649247
23.9264878434242
24.0222175448625
24.2589089305143
25.0916485959638
25.6616713982665
26.42372554061
27.1110230291713
27.5506997537485
27.7723436488626
28.5526342456366
28.5937291162256
28.9858602338711
29.0613508454069
29.0909808586811
29.2550164645902
29.2592989298793
29.6618509042981
29.9425395348144
30.2713842375835
30.3578677416218
30.3691588410138
30.3828682984843
30.4226905469563
30.5067979753883
30.5374335721223
30.5377759972106
30.5388417498134
30.6197685055474
30.7562765495333
30.859344040433
31.0055304357074
31.1385745478821
31.2613609610913
31.3581765472267
31.3786215131359
31.3862882622892
31.5134036026533
31.5518777553211
31.6665459342876
31.6966977981316
31.7570612273693
31.7921492877195
31.8916639064893
32.0060685276589
32.0175653961169
32.0957893864849
32.2300707106859
32.2332967187291
32.2933752614657
32.3576534230736
32.4077953094961
32.440737957141
32.4555840754853
32.5778411557745
32.8123917532014
32.8938018752775
32.9005162471121
33.07109861048
33.2169544508162
33.2528421050785
33.2538040913417
33.2629253254612
33.5137399745455
33.646430499
33.6635730620378
33.8081149852229
33.9678615607469
34.0229870935982
34.1334960727572
34.2686698621255
34.281848082171
34.2835695816904
34.285140119695
34.3103279775051
34.3231318200957
34.3298117057378
34.3580807993554
34.375123159033
34.4358460611865
34.6628865380987
34.6882398181505
34.7138670512661
34.8319635103069
34.8543564286965
34.8740064707112
34.8831091812053
35.0783682108864
35.1589202345914
35.1940609286179
35.1963400807163
35.2225030547279
35.2948450801441
35.469980224983
35.4834158010586
35.5066986045073
35.50715377932
35.5581064872091
35.5929578196659
35.6202992874957
35.8503591228778
35.8864712453884
35.9236676029501
36.0272009858509
36.0582893169656
36.0697828249579
36.0733810205568
36.0796120864424
36.1046602505238
36.2146409925046
36.2439056981677
36.2519350403298
36.322995147923
36.3325951679294
36.3998442944923
36.4825302220394
36.5356445443208
36.542777089046
36.5472978834097
36.6042330833268
36.6859461526364
36.7276360969522
36.7559303592816
36.7634229191255
36.763717634632
36.7941371199203
36.7963035045816
37.0158687639306
37.0348436474432
37.0682601112198
37.0753839210413
37.1072512130614
37.218572715278
37.2365135026929
37.308997089496
37.4740533868667
37.5061839013662
37.5920632597573
37.5985612515559
37.6442491468094
37.7885711256696
37.8031769603556
37.8044638170924
37.8758257907878
37.9492755772546
37.9684449085749
38.1925855742105
38.3276830837628
38.36856145824
38.425438354474
38.4871152468451
38.5850141197061
38.7021862984193
38.7281733035293
38.7906053310133
38.8167873845123
38.879602998118
38.9257736530842
39.1219500124099
39.126766061353
39.1572739123381
39.1854685896754
39.2465242986911
39.2681394600736
39.3020749787304
39.3491652552564
39.3519847700339
39.4481343103317
39.4641601623938
39.495558462879
39.5276929723755
39.5743412754252
39.641904265762
39.6805873230154
39.7328651773131
39.7425791919387
39.7862127456679
39.7862858916594
39.8029935150385
39.8184701961103
39.83254032456
39.860678129295
39.9503278266256
39.9590265687971
39.9757707281345
40.0615365250434
40.1029961842667
40.1057284653796
40.2691606972348
40.2757445303632
40.3168035819963
40.3293541033457
40.5323828990285
40.5968897591303
40.6751634982263
40.6988406405023
40.7068328640941
40.7675314086004
40.7776623368738
40.7908603255974
40.7955349906672
40.8035933414247
40.9383315440048
40.9558208833268
40.9606594573753
40.9784894747295
41.0046771495095
41.0177042749474
41.1282624024752
41.1359777151937
41.1980177115138
41.2528202892208
41.4075724424903
41.4116661876018
41.4467994208966
41.4552126570036
41.4568329047393
41.6507621039357
41.7203545510097
41.7436507425138
41.7725271219265
41.8136233981782
41.8338507732199
41.9178452756984
41.9378524875993
41.9424929587518
42.0160174497851
42.0327031030869
42.0695865194719
42.1050583647999
42.154742564893
42.2913255282883
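
One possible way to approach this, offered as an assumption rather than anything confirmed in the thread, is censored maximum likelihood: observations at or below the empirical 10% quantile contribute their lognormal density, while the remaining observations contribute only the probability of exceeding that threshold. The sketch below uses base R only. The function name fit_lnorm_lower_tail is hypothetical.

```r
# Hypothetical helper: fit a lognormal to the lower tail of `x` by treating
# every observation above the `p` quantile as right-censored at that quantile.
fit_lnorm_lower_tail <- function(x, p = 0.10) {
  threshold <- quantile(x, probs = p, names = FALSE)
  tail_obs  <- x[x <= threshold]
  n_cens    <- sum(x > threshold)

  # negative log-likelihood; sdlog is optimised on the log scale to keep it > 0
  nll <- function(par) {
    meanlog <- par[1]
    sdlog   <- exp(par[2])
    -(sum(dlnorm(tail_obs, meanlog, sdlog, log = TRUE)) +
        n_cens * plnorm(threshold, meanlog, sdlog,
                        lower.tail = FALSE, log.p = TRUE))
  }

  # full-sample estimates serve as starting values
  start <- c(mean(log(x)), log(sd(log(x))))
  opt <- optim(start, nll)
  c(meanlog = opt$par[1], sdlog = exp(opt$par[2]), threshold = threshold)
}
```

With the values above read into a numeric vector, say strength (a made-up name), fit <- fit_lnorm_lower_tail(strength) gives the tail-fitted parameters. The coefficient of variation of the fitted lognormal is then sqrt(exp(fit["sdlog"]^2) - 1).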

Re: [R] snow and different R versions

2009-01-16 Thread Gábor Csárdi
Just for the record: the problem was that the R version that Rscript
starts is determined at compile time, and I had to move my R
installation to another place for technical reasons. At the
compile-time location there was another R version, and Rscript started
that one.

The solution was to create a dummy rscript file that sets the RHOME
environment variable for the real Rscript:
-
#! /bin/sh
export RHOME=/home/gabor/software/lib64/R
/home/gabor/software/lib64/R/bin/Rscript "$@"
-
and then use this file with snow, via the setDefaultClusterOptions
function or some other way.

Gabor

On Tue, Jan 13, 2009 at 1:40 AM,  l...@stat.uiowa.edu wrote:
 As far as I can tell from looking at the code and running on my system, this
 should use the one in rscript if you are starting via makeCluster or
 makeMPIcluster. You might double check by doing debug(makeMPIcluster)
 and stepping through and looking at what it uses in the call to
 mpi.comm.spawn for mpitask and args.

 luke

 On Fri, 9 Jan 2009, Gábor Csárdi wrote:

 Dear Luke and others,

 I have many R versions on my machine and want to start a particular
 one when snow builds its cluster. (The same version I start snow
 from.) It seems that everything is set up correctly in
 defaultClusterOptions:

 mget(ls(defaultClusterOptions), defaultClusterOptions)

 $homogeneous
 [1] TRUE

 $manual
 [1] FALSE

 $master
 nodename
 maya.unil.ch

 $outfile
 [1] /dev/null

 $port
 [1] 10187

 $rhome
 R_HOME
 /home/gabor/software/lib64/R

 $rlibs
R_LIBS
 /usr/lib64/R/library:/usr/share/R/library

 $rprog
 [1] /home/gabor/software/lib64/R/bin/R

 $rscript
 [1] /home/gabor/software/lib64/R/bin/Rscript

 $rshcmd
 [1] ssh

 $scriptdir
 [1] /home/gabor/.R/library/snow

 $snowlib
 [1] /home/gabor/.R/library

 $timeout
 [1] 31536000

 $type
 [1] MPI

 $user
  user
 gabor

 $useRscript
 [1] TRUE

 but snow still starts a different version, the one in /usr/bin/R. Is
 this a bug? If not, how can I tell snow to start the same version, the
 one that is listed in defaultClusterOptions?

 Thanks,
 Gabor

 sessionInfo()

 R version 2.8.0 (2008-10-20)
 x86_64-redhat-linux-gnu

 locale:

 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] snow_0.3-3




 --
 Luke Tierney
 Chair, Statistics and Actuarial Science
 Ralph E. Wareham Professor of Mathematical Sciences
 University of Iowa  Phone: 319-335-3386
 Department of Statistics andFax:   319-335-3017
   Actuarial Science
 241 Schaeffer Hall  email:  l...@stat.uiowa.edu
 Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu



-- 
Gabor Csardi gabor.csa...@unil.ch UNIL DGM



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Wacek Kusnierczyk
if the file is really large, reading it twice may add a considerable penalty:

r...@quantide.com wrote:
 Something like this should work

 library(R.utils)
 out = numeric()
 qr = c("AAC", "ATT")
 n = countLines("test.txt")

# 1st pass

 file = file("test.txt", "r")
 for (i in 1:n){

# 2nd pass

 line = readLines(file, n = 1)
 A = strsplit(line, split = " ")[[1]][1]
 if(is.element(A, qr)) {
 value = as.numeric(strsplit(line, split = " ")[[1]][2])
 out = c(out, value)
 }
 }

if this is a one-go task, counting the lines does not pay, and why
bother.  if this is a repetitive task, a database-based solution will
probably be a better idea.

vQ



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread r...@quantide.com

I agree on the database solution.
Databases are the right tool to solve this kind of problem.
Just consider the start-up cost of setting up the database. This could 
be a very time-consuming task for someone who is not familiar with database 
technology.


Calling file() does not actually read the whole file. This function 
simply opens a connection to the file without reading it.

countLines should do something like wc -l from a bash shell.

I would say that if this is a one-time job, this solution should work 
even though it is not the fastest. If the job is a repetitive one, 
then a database solution is surely better.


A.




Re: [R] faster version of split()?

2009-01-16 Thread Søren Højsgaard
Hi,

R version 2.2.1 is slightly old. You may want to upgrade to the current 
version, R 2.8.1!

You can for example do

library(doBy)
dd - data.frame(x=c(1,1,1,2,2,2), y=c(1,1,2, 1,1,1))
summaryBy(y~x, data=dd, FUN=function(x)length(unique(x)))
 
Regards
Søren


-Oprindelig meddelelse-
Fra: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] På 
vegne af Simon Pickett
Sendt: 16. januar 2009 11:10
Til: R help
Emne: [R] faster version of split()?

Hi all,

I want to calculate the number of unique observations of y in each level of 
x from my data frame df.

this does the job but it is very slow for this big data frame (159503 rows,
11 columns).

group.list <- split(df$y,df$x)
count <- function(x) length(unique(na.omit(x)))
sapply(group.list, count, USE.NAMES=TRUE)

I couldn't find the answer searching for "slow split" and "split time" on the
help forum.

I am running R version 2.2.1, on a machine with 4gb of memory and I'm using 
windows 2000.

thanks in advance,

Simon.







- Original Message -
From: Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no
To: Gundala Viswanath gunda...@gmail.com
Cc: R help r-h...@stat.math.ethz.ch
Sent: Friday, January 16, 2009 9:30 AM
Subject: Re: [R] Value Lookup from File without Slurping


 you might try to iteratively read a limited number of lines in a
 batch using readLines:

 # filename, the name of your file
 # n, the maximal count of lines to read in a batch
 connection = file(filename, open="rt")
 while (length(lines <- readLines(con=connection, n=n))) {
   # do your stuff here
 }
 close(connection)

 ?file
 ?readLines
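
 A minimal sketch of how the batch loop above could be filled in for the
 tag lookup. The function name lookup_in_chunks, the chunk size, and the
 small temporary file are assumptions made for illustration; the sketch
 also assumes whitespace-separated columns and that each tag occurs at
 most once in the file.

```r
# Look up the values for a few tags while holding only one chunk in memory.
lookup_in_chunks <- function(path, tags, chunk_size = 10000L) {
  con <- file(path, open = "rt")
  on.exit(close(con))
  found <- setNames(rep(NA_real_, length(tags)), tags)
  while (length(lines <- readLines(con, n = chunk_size)) > 0) {
    lines <- trimws(lines)
    # Drop comment lines and empty lines.
    lines <- lines[nzchar(lines) & !grepl("^#", lines)]
    if (length(lines) == 0) next
    fields <- strsplit(lines, "[[:space:]]+")
    tag <- vapply(fields, `[`, character(1), 1)
    val <- as.numeric(vapply(fields, `[`, character(1), 2))
    hit <- tag %in% tags
    found[tag[hit]] <- val[hit]
    # Stop early once every query tag has been seen.
    if (!anyNA(found)) break
  }
  found
}

# Small stand-in for repo.txt.
tmp <- tempfile(fileext = ".txt")
writeLines(c("# tag  value", "AAA 0.2", "AAT 0.3", "AAC   0.02",
             "AAG   0.02", "ATA 0.3", "ATT   0.7"), tmp)
res <- lookup_in_chunks(tmp, c("AAC", "ATT"), chunk_size = 2L)
print(res)
```

 The file is read only once, and the loop stops as soon as all query
 tags have been found, so no separate line count is needed.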

 vQ


 Gundala Viswanath wrote:
 Dear all,

 I have a repository file (let's call it repo.txt)
  that contain two columns like this:

 # tag  value
 AAA   0.2
 AAT   0.3
 AAC   0.02
 AAG   0.02
 ATA   0.3
 ATT   0.7

 Given another query vector

 qr <- c("AAC", "ATT")


 I would like to find the corresponding value for each query above,
 yielding:

 0.02
 0.7

 However, I want to avoid slurping the whole of repo.txt into an object
 (e.g. a hash).
 Is there any way to do that?

 The reason I want to do that is that repo.txt is very, very large
 (millions of lines,
 with tag length  30 bp),  and my PC memory is too small to hold it.





Re: [R] problems with extractPrediction in package caret

2009-01-16 Thread Uwe Ligges



Häring, Tim (LWF) wrote:

Hi list,

I'm working on a predictive modeling task using the caret package.
I found the best model parameters using the train() and trainControl() commands.
Now I want to evaluate my model and make predictions on a test dataset. I tried
to follow the instructions in the manual and the vignettes, but unfortunately
I'm getting an error message I can't figure out.
Here is my code:
rfControl <- trainControl(method = "oob", returnResamp = "all",
returnData=TRUE, verboseIter = TRUE)
rftrain <- train(x=train_x, y=trainclass, method="rf", tuneGrid=tuneGrid,
tr.control=rfControl)

pred <- predict(rftrain)
pred	# this works fine

expred <- extractPrediction(rftrain)

Error in models[[1]]$trainingData : 
  $ operator is invalid for atomic vectors



I cannot reproduce it (not having your data) and I doubt you are using 
the most recent version which is 3.51.

Anyway, *if* it is a bug, then please report to the package maintainer.

Best,
Uwe Ligges



My predictors are 28 numeric attributes and one factor.
I'm working with the latest version of caret and R 2.7.2 on WinXP.

Any advice is very welcome.

Thanks.
TIM


--- 
Dipl.-Geogr. Tim Häring

Sachgebiet Standort und Bodenschutz (SG 2.1)
Bayerische Landesanstalt für Wald und Forstwirtschaft
Am Hochanger 11
D-85354 Freising

Tel.: +49-(0)8161/71-4769
E-Mail: tim.haer...@lwf.bayern.de
http://www.lwf.bayern.de






Re: [R] [r] How to Solve the Error( error:cannot allocate vector of size 1.1 Gb)

2009-01-16 Thread Uwe Ligges



Kum-Hoe Hwang wrote:

Hi, Gurus

Thanks to your kind help, I have managed to start using a text
mining package called tm in R under Win XP.

However, while running the tm package, I ran into another problem, this time with memory.

What is the best way to solve this memory problem: adding
physical RAM, or some other recipe?



How can we know? We do not know anything about your problem. Maybe not 
even 64Gb are sufficient or maybe it is simplest to just use a huge 
machine with 16Gb


Uwe Ligges



###
## my R Script's Outputs ##
###


> memory.limit(size = 2000)
NULL
> corpus.ko <- Corpus(DirSource("test_konews/"),
+  readerControl = list(reader = readPlain,
+  language = "UTF-8", load = FALSE))
> corpus.ko.nowhite <- tmMap(corpus.ko, stripWhitespace)
> corpus <- tmMap(corpus.ko.nowhite, tmTolower)
> tdm <- TermDocMatrix(corpus)
> findAssocs(tdm, "city", 0.97)
error:cannot allocate vector of size 1.1 Gb
-

Thanks for your precious time,

--
Kum-Hoe Hwang, Ph.D.

Phone : 82-31-250-3516
Email : phdhw...@gmail.com



Re: [R] function return output

2009-01-16 Thread Uwe Ligges



threshold wrote:

Hi, I wrote a function which outputs a matrix 'c' and a single value 'd',
as follows (simplified example):
procedure <- function(a,b){
...
list(c,d)
}
now I want to use 'c' and 'd' in code as follows:
d <- matrix(0,1,1)
value <- procedure(a,b)
and d[1,1] <- value[2] breaks, telling me that:
Error in d[1, 1] : incorrect number of dimensions
What did I do wrong? best, robert


Probably value is a list, hence value[2] is also a list and you cannot 
assign a list to an element of a *numeric* matrix. I guess you want 
value[[2]].
Anyway, it is all a guess since we do not know what your procedure() 
returns 
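
To make the difference concrete, here is a small sketch with a made-up procedure(); the body of the real function is unknown, so the matrix and the sum computed here are assumptions for illustration only.

```r
# Hypothetical stand-in for the poster's function: returns a matrix and a number.
procedure <- function(a, b) {
  c <- matrix(a, 2, 2)
  d <- a + b
  list(c, d)
}

value <- procedure(1, 2)

class(value[2])    # single brackets keep the list wrapper
class(value[[2]])  # double brackets extract the element itself

d <- matrix(0, 1, 1)
d[1, 1] <- value[[2]]
print(d)
```

With value[[2]] the matrix d stays numeric, which is usually what is wanted.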



Uwe Ligges












Re: [R] Adressing list-elements

2009-01-16 Thread Uwe Ligges



Thomas Schwander wrote:

Dear all,
I'm using R 2.8.1 under Vista.

I programmed a Simulation with the code enclosed at the end of the eMail.

After the simulation I want to analyse the columns of the individual
simulation runs, e.g. Simulation[[1]][,1] or something like that, but I
cannot address these columns...



I guess at least the line

 Simulation<-list(input1)

can be omitted or at least replaced by some sensible list initialization?


I also guess that you intend to assign
  Simulation[[i]] <- FP
rather than
  Simulation[[i]] <- list(FP)

but this is untested without your data.
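
A minimal sketch of the suggested pattern, with a dummy matrix standing in for FP since the real data are not available; the dimensions and the number of trials are arbitrary assumptions.

```r
trials <- 3

# Pre-allocate an empty list with one slot per simulation run.
Simulation <- vector("list", trials)

for (i in seq_len(trials)) {
  # Dummy result matrix standing in for the simulated price paths.
  FP <- matrix(i, nrow = 5, ncol = 2)
  # Store the matrix itself, not a list wrapped around it.
  Simulation[[i]] <- FP
}

# Columns of a single run can now be addressed directly.
first_col <- Simulation[[1]][, 1]
print(first_col)
```

Had the loop stored list(FP), the same column would have to be reached as Simulation[[1]][[1]][, 1].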

Uwe Ligges






Can anybody please help?

Best,
Thomas

 CODE 
analyse<-read.csv2("C:\\Users\\Thomas\\Desktop\\PCA_Kohle_2007.csv",sep=";")

rownames(analyse)<-analyse[,1]
analyse<-analyse[,-1]
require(tseries)
require(fSeries)
st_rendite<-apply(apply(analyse,2,log),2,diff)*100

pca<-princomp(st_rendite)
summary(pca)
summary(pca)$loadings
screeplot(pca,type="l",main="Screeplot der Prinzipalkomponenten")


loadings_pca1<--summary(pca)$loadings[1:6]
loadings_pca2<--summary(pca)$loadings[7:12]
loadings_pca3<--summary(pca)$loadings[13:18]
loadings_pca4<--summary(pca)$loadings[19:24]
loadings_pca5<--summary(pca)$loadings[25:30]
loadings_pca6<--summary(pca)$loadings[31:36]
loadmatrix<-as.matrix(cbind(loadings_pca1,loadings_pca2,loadings_pca3,loadings_pca4,loadings_pca5,loadings_pca6),princomp,dim(analyse)[2])

eigen<-summary(pca)$sdev

#Monte-Carlo-Simulation
trials<-1
Step<-1/365
princomp<-3
input1<-matrix(0,365,dim(analyse)[2])
Simulation<-list(input1)
shocklist<-list(input1)
Initialwerte<-analyse[dim(analyse)[1],]
days<-365

shockmatrix<-matrix(0,365,princomp)
for(i in 1:trials){
if(i==1) now<-Sys.time()
FP<-matrix(0,365,dim(analyse)[2])
 for(k in 1:days){
 shocks<-rnorm(princomp,0,1)
 shockmatrix[k,]<-shocks
 #shocks<-shocks_fak
 #  for(j in 1:1){
   for(j in 1:dim(analyse)[2]){
 if(k==1) {
   FP[k,j]<-as.numeric(Initialwerte[j])*
   exp(-0.5*Step*sum((loadmatrix[j,1:princomp]*eigen[1:princomp])^2)
   +(sqrt(Step)*sum(loadmatrix[j,1:princomp]*eigen[1:princomp]*shocks)))
   } else {
   FP[k,j]<-FP[k-1,j]*
   exp(-0.5*Step*sum((loadmatrix[j,1:princomp]*eigen[1:princomp])^2)
   +(sqrt(Step)*sum(loadmatrix[j,1:princomp]*eigen[1:princomp]*shocks)))
   }
   }
 }
 Simulation[[i]]<-list(FP)
 shocklist[[i]]<-list(shockmatrix)
}
endzeit<-Sys.time()
Berechnungsdauer<-endzeit-now
Berechnungsdauer

Simulation[[1]]
shocklist[[1]]



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Wacek Kusnierczyk
r...@quantide.com wrote:

 Using file() is not a real reading of all the file. This function will
 simply open a connection to the file without reading it.
 countLines should do something like wc -l from a bash shell


just for a test:

cat(rep('', 10^7), file='test.txt', fill=1)
library(R.utils)
system.time(countLines('test.txt'))

... and the file is just about 30MB (and it makes no real difference if
it is stuffed with newlines or not).

vQ



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Wacek Kusnierczyk
r...@quantide.com wrote:
 I agree on the database solution.
 Databases are the right tool to solve this kind of problem.
 Only consider the start up cost of setting up the database. This could
 be a very time consuming task if someone is not familiar with database
 technology.

and won't pay if you want to do the lookup just once.


 Using file() is not a real reading of all the file. This function will
 simply open a connection to the file without reading it.
 countLines should do something like wc -l from a bash shell

... and wc knows the count of lines in a file without reading it

vQ



Re: [R] How to create a chromosome location map by locus ID

2009-01-16 Thread Neil Shephard



Sake wrote:
 
 Hi,
 
 I'm trying to make a chromosomal map in R by using the locus. I have a
 list of genes and their locus, and I want to visualise that so you can see
 if there are multiple genes at a specific place on a chromosome. An example
 of more or less what I want is below:
  http://www.nabble.com/file/p21474206/untitled.JPG untitled.JPG 
 The genes and locus are here:
 http://www.nabble.com/file/p21474206/genlocus.csv genlocus.csv 
 I've tried some things, but nothing worked the way I would like.
 Maybe there is some kind of package that does this for you, but I have not
 found it yet.
 Thanks
 
 Sake
 

What's wrong with things like the HapMap Genome Browser, which allows you to
zoom in and out and to produce customised annotations of chromosomal regions
at varying resolutions (see http://www.hapmap.org/)?  Of course I'm assuming
that you are looking at human chromosomes ;-) If not,
then perhaps the UCSC Genome Browser may be of use, as it has a large number
of genomes you can browse (see http://genome.ucsc.edu/cgi-bin/hgGateway ).

If you really want to do this in R, you might get some mileage out of the
lodplot package, which can draw ideograms (which is what a schematic of a
chromosome with bandings from different stainings is called), although the
dataset available for it is again for human chromosomes (see
http://cran.r-project.org/web/packages/lodplot/index.html ).

Perhaps worth checking out the Genetics Task View too, which is linked from
CRAN.

Neil




Re: [R] How to create a chromosome location map by locus ID

2009-01-16 Thread Sake



 
 
 

I'm well aware of all the tools on the internet which allow you to find
the position of genes on a chromosome. The only thing is, none of them has
a function to upload a list of e.g. 300 genes. I have a list of
overexpressed genes, and I want to know which chromosome they are on so I can
see if there is some kind of link between the genes and their position on a
chromosome. I have already made a list of the locus of each gene, but now I
want to make some sort of plot that allows me to visualise where the genes
are located. So the reason I don't use those web tools is that I have 300
genes and I'm not planning to search for each gene individually.
The lodplot package looks promising (I had already found it ;-), but thanks
anyway!), but I have not yet figured out how to use it properly. I've not
found any tutorial or example data to test it.


Re: [R] PDF slided (beamer or prosper) to an editable PPT

2009-01-16 Thread Neil Shephard



zubin-2 wrote:
 
 Hello, I am getting requests to place our PDF slides (output from 
 beamer) into Microsoft Powerpoint formats (.ppt).  What's the best 
 practice or any recommended software packages (any success with open or 
 commercial) that we can use to convert PDF slides into an EDITABLE 
 powerpoint deck? 
 
 

Plenty of suggestions out there...
http://www.google.co.uk/search?q=pdf+to+powerpoint+converter

Never used any of them though, so I can't comment on which is the best or
recommend one over the other.

Personally, though, I don't see the need for them to be editable beyond
someone wanting to plagiarise the work.

Neil


[R] Predictions with GAM

2009-01-16 Thread Robbert Langenberg
Dear,

I am trying to get predictions from my GAM on the response scale, so that I
eventually get plots with the correct values on the y-axis.
I have been able to get some of my GAMs working with the example shown
below:

model1 <- gam(nsdall ~ s(jdaylitr2), data=datansd)
newd1 <- data.frame(jdaylitr2=(244:304))
pred1 <- predict.gam(model1,newd1,type="response")

The problem I am encountering now is that I cannot seem to get it done for
the following type of model:

model3 <- gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)

My mapID consists of 8 levels, for which I get individual plots with
plot(model3). When I call predict with a newdata argument, just as with my first
model, I need all columns to have the same number of rows or else R will not
accept it, of course; the column names need to at least include day and mapID.
This way I cannot get a prediction working for this GAM. I am confused
by this part of the model: s(day,by=mapID).

I have been reading through the book "Generalized Additive Models: An
Introduction with R" by Wood, S.,
but could not find anything about predictions with "by" in the model.

I hope someone can help me out with this,

Sincerely yours,

Robbert Langenberg



Re: [R] Smooth periodic splines

2009-01-16 Thread Duncan Murdoch

cmr.p...@gmail.com wrote:

Hello group!

Is there a package that allows to fit smooth *periodic* splines to
data? I'm interested in a function which combines the functionality of
smooth.spline and splines::periodicSpline.
  


I don't know one, but you could use the same technique that 
periodicSpline uses:  repeat a copy of the data to the left and right of 
the main copy (similarly replicating the knots if you want regression 
splines rather than smoothing splines), then fit to the augmented 
dataset.  I don't think it is guaranteed to be exactly periodic, but it 
will be very close.
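
A rough sketch of that replication trick using smooth.spline from the stats package. The simulated sine data, the period of 2*pi, and all variable names are assumptions for illustration; as noted above, the result is only approximately periodic.

```r
set.seed(1)
period <- 2 * pi

# Simulated noisy periodic data on one period.
x <- sort(runif(200, 0, period))
y <- sin(x) + rnorm(200, sd = 0.2)

# Repeat a copy of the data to the left and to the right of the main copy.
x_aug <- c(x - period, x, x + period)
y_aug <- rep(y, 3)

# Fit an ordinary smoothing spline to the augmented data.
fit <- smooth.spline(x_aug, y_aug)

# Evaluate on the original period only.
grid <- seq(0, period, length.out = 50)
pred <- predict(fit, grid)$y

# Mismatch between the two ends of the period (small, but not exactly zero).
end_gap <- abs(pred[1] - pred[length(pred)])
print(end_gap)
```

The boundary effects of the fit now sit at the outer ends of the augmented range, away from the interval that is actually used.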


There is also the "periodic" option to splinefun, and you might be able
to use it to construct a true periodic basis, but you'll have to work
out some tricky details to get that right.


Duncan Murdoch



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Gabor Grothendieck
On Fri, Jan 16, 2009 at 5:52 AM, r...@quantide.com r...@quantide.com wrote:
 I agree on the database solution.
 Databases are the right tool to solve this kind of problem.
 Only consider the start up cost of setting up the database. This could be a
 very time consuming task if someone is not familiar with database
 technology.

Using sqldf as mentioned previously on this thread allows one to use
the SQLite database with no setup at all.  sqldf automatically creates
the database, generates the record layout, loads the file (not going through
R but outside of R so R does not slow it down) and extracts the
portion you want into R issuing the appropriate calls to RSQLite/DBI and
destroying the database afterwards all automatically.  When you
install sqldf it automatically installs RSQLite and the SQLite database
itself so the entire installation is just one line: install.packages("sqldf")



[R] re name vector

2009-01-16 Thread canadiangirl19

I'm a real R beginner,

so I have a simple question.

I would like to rename a vector in a loop.

I'd like to have as output:

vector1<-whatever
vector2<-whatever
vector3<-whatever
etc..

so I thought it would be as easy as
for (s in c(1:3)){
vectorn<- whatever
}

but I'm getting an error.

cheers


Re: [R] re name vector

2009-01-16 Thread Gabor Grothendieck
It's a FAQ:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f
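
A short sketch of the two usual approaches, with s * 10 as a placeholder for "whatever". The first uses assign() to create variables from strings; the second keeps everything in one named list, which is often easier to work with afterwards.

```r
# Approach 1: build the variable names as strings and assign to them.
for (s in 1:3) {
  assign(paste("vector", s, sep = ""), s * 10)
}
print(vector2)

# Approach 2: keep the results together in a named list.
vectors <- lapply(1:3, function(s) s * 10)
names(vectors) <- paste("vector", 1:3, sep = "")
print(vectors$vector3)
```

With the list, later code can loop over the results without having to rebuild the variable names.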



Re: [R] reading data from Excel Spread sheet

2009-01-16 Thread John Sorkin
Kiran,
One, not very elegant, way to solve your problem is to first save the
Excel spreadsheet as a CSV file (open the file in Excel and then
use File -> Save As, choosing CSV, i.e. xxx.CSV) and then use read.csv().
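
A tiny sketch of the R side of that route. The temporary file and its two columns are made up for illustration and stand in for a sheet exported from Excel; a native .xls file is a binary format, so read.csv() only works on the exported text file.

```r
# Stand-in for a sheet that was saved from Excel as CSV.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,3.5", "2,4.0"), csv_path)

# Read the exported file; the first line is taken as the column names.
dat <- read.csv(csv_path)
str(dat)
```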
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

 venkata kirankumar kiran4u2...@gmail.com 1/16/2009 4:32 AM 
Hi all,

I tried to read data from Excel spread sheet with using

read.csv(file.choose())
and
read.delim(file.choose())
but its showing *ÐÏ.à.*.

and also i tried with
read.table(file.choose())

then its showing*  V1
1 ÐÏ\021ࡱ*   


Can anyone suggest how to read data from an Excel spreadsheet?

thanks & regards;

kiran



Re: [R] Predictions with GAM

2009-01-16 Thread Gavin Simpson
On Fri, 2009-01-16 at 12:36 +0100, Robbert Langenberg wrote:
 Dear,
 
 I am trying to get a prediction of my GAM on a response type. So that I
 eventually get plots with the correct values on my ylab.
 I have been able to get some of my GAM's working with the example shown
 below:
 model1 <- gam(nsdall ~ s(jdaylitr2), data=datansd)
 newd1 <- data.frame(jdaylitr2=(244:304))
 pred1 <- predict.gam(model1,newd1,type="response")

Hi Robert,

You want predictions for the covariate over range 244:304 for each of
your 8 mapID's, yes?

This is not tested, but why not something like:

newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
                    mapID = rep(levels(datansd$mapID), each = 100))

Then use newd2 in your call to predict.

I am assuming that datansd$mapID is a factor in the above. If it is just
some other indicator variable, then perhaps something like:

newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
                    mapID = rep(sort(unique(datansd$mapID)),
                                each = 100))

Does that work for you?
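
A sketch of the same idea with expand.grid, which avoids counting repetitions by hand. The data frame mergeday_demo and its values are made up for illustration; the column names day and mapID are taken from the model3 formula in the question, on the assumption that the new data must carry exactly the names used when fitting.

```r
# Hypothetical stand-in for the poster's data: 8 mapID levels, days 244 to 304.
mergeday_demo <- data.frame(
  day   = rep(244:304, times = 8),
  mapID = factor(rep(paste("map", 1:8, sep = ""), each = 61))
)

# Every combination of 100 day values and the 8 factor levels.
newd <- expand.grid(
  day   = seq(244, 304, length.out = 100),
  mapID = levels(mergeday_demo$mapID)
)

# With a fitted model, the call would then look like this (not run here):
# pred <- predict(model3, newdata = newd, type = "response")
print(dim(newd))
```

Each row of newd then yields one prediction, and the mapID column says which of the 8 smooths it belongs to.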

HTH

G

 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%





[R] Efficiency challenge: MANY subsets

2009-01-16 Thread Johannes Graumann
Hello,

I have a list of character vectors like this:

sequences <- list(
  c("M","G","L","W","I","S","F","G","T","P","P","S","Y","T","Y","L","L","I","M",
  "N","H","K","L","L","L","I","N","N","N","N","L","T","E","V","H","T","Y","F",
  "N","I","N","I","N","I","D","K","M","Y","I","H","*")
)

and another list of subset ranges like this:

indexes <- list(
  list(
    c(1,22),c(22,46),c(46, 51),c(1,46),c(22,51),c(1,51)
  )
)

What I now want to do is to subset each entry in "sequences"
(sequences[[1]]) with all ranges in the corresponding low-level list in
"indexes" (indexes[[1]]). Here is what I came up with.

fragments <- list()
for(iN in seq(length(sequences))){
  cat(paste(iN,"\n"))
  tmpFragments <- sapply(
    indexes[[iN]],
    function(x){
      sequences[[iN]][seq.int(x[1],x[2])]
    }
  )
  fragments[[iN]] <- tmpFragments
}

This works fine, but sequences contains thousands of entries and the 
corresponding indexes are sometimes hundreds of ranges long, so this whole 
process is EXTREMELY inefficient.

Would somebody out there take up the challenge and show me how to speed
this up?

Thanks for any hints,

Joh



Re: [R] function return output

2009-01-16 Thread David Winsemius


On Jan 16, 2009, at 5:22 AM, threshold wrote:



Hi, I wrote a function which outputs a matrix 'c' and a single value 'd',
as follows (simplified example):
procedure <- function(a,b){
...
list(c,d)
}
now I want to use 'c' and 'd' in code as follows:
d <- matrix(0,1,1)
value <- procedure(a,b)
and d[1,1] <- value[2] breaks, telling me that:
Error in d[1, 1] : incorrect number of dimensions
What did I do wrong? best, robert


Who knows? No reproducible code.

The usual way of accessing just the second element of a list would be  
value[[2]]. value[2] will be a list that contains whatever was d,  
rather than what was d in the procedure() call itself. It will not  
have the name d. It is possible to use the same names inside and  
outside a function but it may be more clear for beginners to keep them  
separate.
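
One way to keep the names, sketched with a made-up function body: return a named list, so the elements can be pulled out by name. The names mat and total and the computations are assumptions for illustration.

```r
# Hypothetical function returning a matrix and a single value in a named list.
procedure <- function(a, b) {
  m <- diag(a)   # a-by-a identity matrix
  s <- a * b
  list(mat = m, total = s)
}

out <- procedure(2, 5)

# Elements are extracted by name; $ behaves like [[ ]], not like [ ].
print(out$total)
print(dim(out$mat))
```

Using different names inside and outside the function, as here, also avoids confusing the returned value with any d that exists in the calling code.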


--
David Winsemius



[R] data frames with å, ä, and ö (= non-ASCII characters) from windows to mac os x

2009-01-16 Thread Gustaf Rydevik
Hi,
I ran into this issue previously and managed to solve it, but I've
forgotten how and am getting frustrated...

I have a data frame (see below) with scandinavian characters in R
(2.7.1) running on a Win Xp-computer. I save the data frame in an
RData-file on a usb stick, and load() it in R (2.8.0) running on OS X
10.5. Now the name of the data frame and all factor labels with
scandinavian characters are scrambled. How do I make R in OS X read my
data frame?
From what I've managed to find in the list archives and the FAQ, I should either
1) run
 Sys.setlocale("LC_ALL","en_US.UTF-8") ### Doesn't change anything
or
2) run
  defaults write org.R-project.R force.LANG en_US.UTF-8
in the terminal, which doesn't help either.
I must admit that I couldn't quite follow what documentation I found
on locales, so I might have messed up somewhere along the line.

Many thanks in advance for your help!

Regards,

Gustaf




Länkarta <-
structure(list(LANKOD = structure(c(11L, 19L, 10L, 13L, 21L,
7L, 9L, 18L, 8L, 3L, 16L, 6L, 5L, 4L, 15L, 2L, 20L, 17L, 1L,
14L, 12L), .Label = c("AB", "AC", "BD", "C", "D", "E", "F", "G",
"H", "I", "K", "M", "N", "O", "S", "T", "U", "W", "X", "Y", "Z"
), class = "factor"), Län = structure(c(1L, 4L, 3L, 5L, 6L, 7L,
8L, 2L, 9L, 10L, 20L, 21L, 13L, 14L, 15L, 16L, 17L, 18L, 12L,
19L, 11L), .Label = c("Blekinge län", "Dalarnas län", "Gotlands län",
"Gävleborgs län", "Hallands län", "Jämtlands län", "Jönköpings län",
"Kalmar län", "Kronobergs län", "Norrbottens län", "Skåne län",
"Stockholms län", "Södermanlands län", "Uppsala län", "Värmlands län",
"Västerbottens län", "Västernorrlands län", "Västmanlands län",
"Västra Götalands län", "Örebro län", "Östergötlands län"), class =
"factor")), .Names = c("LANKOD",
"Län"), class = "data.frame", row.names = c("0", "1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18", "19", "20"))



-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Efficiency challenge: MANY subsets

2009-01-16 Thread Jorge Ivan Velez
Dear Johannes,
Try this:


sequences <- c("M","G","L","W","I","S","F","G","T","P","P","S","Y","T",
"Y","L","L","I","M","N","H","K","L","L","L","I","N","N","N","N","L","T","E","V",
"H","T","Y","F","N","I","N","I","N","I","D","K","M","Y","I","H","*")

indexes <- matrix(c(1,22,22,46,46,51,1,46,22,51,1,51),ncol=2,byrow=TRUE)

apply(indexes,1,function(x){
  ind <- x[1]:x[2]
  sequences[ind]
  }
  )


HTH,

Jorge



On Fri, Jan 16, 2009 at 8:06 AM, Johannes Graumann johannes_graum...@web.de
 wrote:

 Hello,

 I have a list of character vectors like this:

 sequences <- list(
  c("M","G","L","W","I","S","F","G","T","P","P","S","Y","T","Y","L","L","I","M",
    "N","H","K","L","L","L","I","N","N","N","N","L","T","E","V","H","T","Y","F",
    "N","I","N","I","N","I","D","K","M","Y","I","H","*")
 )

 and another list of subset ranges like this:

 indexes <- list(
  list(
    c(1,22),c(22,46),c(46, 51),c(1,46),c(22,51),c(1,51)
  )
 )

 What I now want to do is to subset each entry in sequences
 (sequences[[1]]) with all ranges in the corresponding low level list in
 indexes (indexes[[1]]). Here is what I came up with.

 fragments <- list()
 for(iN in seq(length(sequences))){
  cat(paste(iN,"\n"))
  tmpFragments <- sapply(
    indexes[[iN]],
    function(x){
      sequences[[iN]][seq.int(x[1],x[2])]
    }
  )
  fragments[[iN]] <- tmpFragments
 }

 This works fine, but sequences contains thousands of entries and the
 corresponding indexes are sometimes hundreds of ranges long, so this
 whole
 process is EXTREMELY inefficient.

 Does somebody out there take the challenge and show me a way on how to
 speed
 this up?

 Thanks for any hints,

 Joh



Re: [R] Efficiency challenge: MANY subsets

2009-01-16 Thread Henrique Dallazuanna
Try this:

lapply(indexes[[1]], function(g)sequences[[1]][do.call(seq, as.list(g))])
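
To run over every entry rather than only the first, the same idea can be wrapped in Map(). This is a minimal sketch, assuming sequences and indexes are parallel lists as in the original post; the toy data below is made up for illustration.

```r
# Toy data in the same shape as the original post (values are made up).
sequences <- list(c("M", "G", "L", "W", "I"),
                  c("S", "F", "G", "T"))
indexes <- list(list(c(1, 3), c(2, 5)),
                list(c(1, 2), c(2, 4), c(1, 4)))

# Walk both lists in parallel; for each sequence, cut out every range
# listed in the matching element of 'indexes'.
fragments <- Map(function(s, idx) lapply(idx, function(r) s[r[1]:r[2]]),
                 sequences, indexes)

fragments[[1]][[2]]
```

Whether this is noticeably faster than the explicit loop would need timing with system.time() on the real data.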



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] [R-SIG-Mac] data frames with å, ä, and ö (=non-ASCII-characters) from windows to mac os x

2009-01-16 Thread Prof Brian Ripley
You need to use CP1252 not UTF-8 to read the data.  It tells you how 
to do so on the help page ... under 'encoding'. So something like


  A <- read.table(con <- file("myfile", encoding="CP1252")); close(con)

Please don't cross-post ... I am being brief because you did.
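
For character data that arrives via load() rather than read.table(), one possibility is to convert the strings after loading with iconv(). This is only a sketch, assuming the strings were written by a CP1252 session and are wanted in UTF-8; it has not been tried on the data frame in question, and the factor name f in the comment is hypothetical.

```r
# "län" as stored by a CP1252 (Windows) session: ä is the single byte 0xE4.
x <- rawToChar(as.raw(c(0x6c, 0xe4, 0x6e)))

# Convert to UTF-8, where ä becomes the two bytes 0xC3 0xA4.
y <- iconv(x, from = "CP1252", to = "UTF-8")
charToRaw(y)

# The same call applied to the labels of a factor f (hypothetical name):
# levels(f) <- iconv(levels(f), from = "CP1252", to = "UTF-8")
```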

On Fri, 16 Jan 2009, Gustaf Rydevik wrote:


Hi,
I ran into this issue previously and managed to solve it, but I've
forgotten how and am getting frustrated...

I have a data frame (see below) with scandinavian characters in R
(2.7.1) running on a Win Xp-computer. I save the data frame in an
RData-file on a usb stick, and load() it in R (2.8.0) running on OS X
10.5. Now the name of the data frame and all factor labels with
scandinavian characters are scrambled. How do I make R in OS X read my
data frame?

From what I've managed to find in the list archives and the FAQ I either

1) run
Sys.setlocale("LC_ALL","en_US.UTF-8") ### Doesn't change anything
or
2) run
 defaults write org.R-project.R force.LANG en_US.UTF-8
in the terminal, which doesn't help either.
I must admit that I couldn't quite follow what documentation I found
on locales, so I might have messed up somewhere along the line.

Many thanks in advance for your help!

Regards,

Gustaf






--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] basic boxplot questions

2009-01-16 Thread ivo welch
dear R experts:

I am playing with boxplots for the first time.  most of it is
intuitive, although there was less info on the web than I had hoped.

alas, for some odd reason, my R boxplots have some fat black dots, not
just the hollow outlier plots.  Is there a description of when R draws
hollow vs. fat dots somewhere?

[and what is the parameter to change just the size of these dots?]

Also, let me show my fundamental ignorance:  I am a little surprised
that the average box boxplot would not show the mean and sdv, too, at
least optionally.  Is there a common way to accomplish this (e.g., in
a different color), or do I just construct it myself with standard R
graphics line() commands?

advice appreciated.

regards,

/iaw



Re: [R] Predictions with GAM

2009-01-16 Thread Robbert Langenberg
Thanks for the swift reply,

I might have been a bit sloppy with describing my datasets and problem. I
showed the first model as an example of the type of GAM that I had been able
to use the predict function on. What I am looking for is how to predict my
m3:
model3 <- gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)

When I plot this I get 8 different graphs. Each showing me a different
habitat type with on the x-axis days and on the y-axis s(day,2,81):mapID.
With predict I was hoping to get the scale of the y-axis right for a
selection of days (for example 244,304).

I have tried to adapt the script you gave me to match my dataset in m3, but
none of it seemed to work.

newd2 <- data.frame(day = rep(seq(244, 304, length = 100), 8),
                    mapID = rep(levels(mergeday$mapID), each = 100))

newd2 <- data.frame(day = rep(seq(244, 304, length = 100), 8),
                    mapID = rep(sort(unique(mergeday$mapID)),
                                each = 100))

I am guessing it must have something to do with the "by" in s(day,by=mapID).
I haven't come across any examples that used a GAM with "by" and then used the
predict function.
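
For reference, a minimal sketch of predicting from a model with a factor "by" variable in mgcv, on simulated data. The data, the two levels "A" and "B", and the added mapID main effect are assumptions for illustration and are not taken from the thread.

```r
library(mgcv)

# Simulated stand-in data: a binary response, a day covariate and a
# two-level factor (made up for illustration).
set.seed(1)
d <- data.frame(day = runif(400, 244, 304),
                mapID = factor(sample(c("A", "B"), 400, replace = TRUE)))
d$y_no <- rbinom(400, 1, plogis(sin(d$day / 10)))

# One smooth of day per level of mapID, plus the factor as a main effect.
m <- gam(y_no ~ mapID + s(day, by = mapID), family = binomial, data = d)

# newdata needs both variables, with mapID a factor with the same levels.
newd <- expand.grid(day = seq(244, 304, length.out = 50),
                    mapID = levels(d$mapID))
newd$mapID <- factor(newd$mapID, levels = levels(d$mapID))

# Predicted probabilities on the response scale, one per row of newd.
newd$p <- as.numeric(predict(m, newd, type = "response"))
head(newd)
```

The point of the sketch is only that newdata must carry a column for the "by" factor as well as for the smoothed covariate.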

(A sample of the dataset:
  mapID                      day  y_no
  Urban Areas and Water       25     1
  Urban Areas and Water       26     1
  Early Succesional Forest    27     0
  Agriculture                 28     0
  Early Succesional Forest    29     0
  Mature Coniferous Forest    30     0)


I am sorry that I have to bother you even more with this, and I hope that my
additional explanation about my problem might help solve it.

Sincerely yours,

Robbert Langenberg

2009/1/16 Gavin Simpson gavin.simp...@ucl.ac.uk

 On Fri, 2009-01-16 at 12:36 +0100, Robbert Langenberg wrote:
  Dear,
 
  I am trying to get a prediction of my GAM on a response type. So that I
  eventually get plots with the correct values on my ylab.
  I have been able to get some of my GAM's working with the example shown
  below:
  model1 <- gam(nsdall ~ s(jdaylitr2), data=datansd)
  newd1 <- data.frame(jdaylitr2=(244:304))
  pred1 <- predict.gam(model1,newd1,type="response")

 Hi Robert,

 You want predictions for the covariate over range 244:304 for each of
 your 8 mapID's, yes?

 This is not tested, but why not something like:

 newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
                     mapID = rep(levels(datansd$mapID), each = 100))

 Then use newd2 in your call to predict.

 I am assuming that datansd$mapID is a factor in the above. If it is just
 some other indicator variable, then perhaps something like:

 newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
                     mapID = rep(sort(unique(datansd$mapID)),
                                 each = 100))

 Does that work for you?

 HTH

 G

 
  The problem I am encountering now is that I cannot seem to get it done
 for
  the following type of model:
 
  model3 <- gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)
 
  My mapID consists of 8 levels, for which I get individual plots with
  plot(model3). When I call predict with a newdata argument, as with my first
  model, all columns need to have the same number of rows or else R will not
  accept it, of course; the column names need to include at least day and mapID.
  This way I cannot get a prediction working for this GAM; I am confused
  by this part of the model: s(day,by=mapID).
 
  I have been reading through the book "Generalized Additive Models: An
  Introduction with R" by S. Wood, but could not find anything about
  predictions with "by" in the model.
 
  I hope someone can help me out with this,
 
  Sincerely yours,
 
  Robbert Langenberg
 
 --
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
  Dr. Gavin Simpson [t] +44 (0)20 7679 0522
  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%




-- 
www.lowlandpaddies.nl



Re: [R] reading data from Excel Spread sheet

2009-01-16 Thread Pedro Mardones
or maybe by using the xlsReadWrite package:

mydata <- read.xls("mydata.xls", sheet = "Sheet1")
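
As background, the ÐÏ\021ࡱ characters in the original question are the first bytes of the binary .xls container (D0 CF 11 E0 A1 B1 1A E1) shown as text, which is why the text readers cannot parse the file. A small sketch of checking for that signature before choosing a reader; the helper name is_ole2 and the temporary files are made up for illustration.

```r
# TRUE if the file starts with the OLE2 signature used by binary .xls files.
is_ole2 <- function(path) {
  magic <- as.raw(c(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1))
  identical(readBin(path, what = "raw", n = 8L), magic)
}

# Demonstration on two temporary files with made-up content.
xls_like <- tempfile(fileext = ".xls")
writeBin(as.raw(c(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1, 0x00)),
         xls_like)
csv_like <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2"), csv_like)

is_ole2(xls_like)
is_ole2(csv_like)
```

If the check is TRUE, the file needs a spreadsheet reader such as the one above, or it can be saved as CSV from Excel first and then read with read.csv().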


On Fri, Jan 16, 2009 at 4:32 AM, venkata kirankumar
kiran4u2...@gmail.com wrote:
 Hi all,

 I tried to read data from Excel spread sheet with using

 read.csv(file.choose())
 and
 read.delim(file.choose())
 but its showing *ÐÏ.à.*.

 and also i tried with
 read.table(file.choose())

 then its showing*  V1
1 ÐÏ\021ࡱ*   


 can any one suggest how to read data from Excel Spread sheet

 thanks  regards;

 kiran



Re: [R] basic boxplot questions

2009-01-16 Thread K. Elo
Hi Ivo,

ivo welch wrote:
 alas, for some odd reason, my R boxplots have some fat black dots, not
 just the hollow outlier plots.  Is there a description of when R draws
 hollow vs. fat dots somewhere?
 [and what is the parameter to change just the size of these dots?]

Have you tried the command '?boxplot' already? It should help you to
understand the syntax.

 Also, let me show my fundamental ignorance:  I am a little surprised
 that the average box boxplot would not show the mean and sdv, too, at
 least optionally.  Is there a common way to accomplish this (e.g., in
 a different color), or do I just construct it myself with standard R
 graphics line() commands?

Could you post the command(s) you have entered? Without a reproducible
example we are groping in the dark.

Kind regards,
Kimmo



Re: [R] How to create a chromosome location map by locus ID

2009-01-16 Thread Martin Morgan
Sake tlep.nav.e...@hccnet.nl writes:

 Neil Shephard wrote:
 
 
 
 What's wrong with things like the HapMap Genome Browser, which allows you to
 zoom in and out and to produce customised annotations of chromosomal
 regions at varying resolutions (see http://www.hapmap.org/)?  Of course
 I'm assuming that you are looking at human chromosomes ;-) If not,
 then perhaps the UCSC Genome Browser may be of use, as it has a large
 number of genomes you can browse (see
 http://genome.ucsc.edu/cgi-bin/hgGateway ).
 
 If you really want to do this in R, you might get some mileage out of the
 lodplot package, which can draw ideograms (which is what a schematic of a
 chromosome with bandings from different stainings is called), although the
 dataset available for it is again for human chromosomes (see
 http://cran.r-project.org/web/packages/lodplot/index.html ).
 
 Perhaps worth checking out the Genetics Task View too, which is linked from
 CRAN.
 
 Neil
 
 
 

 I'm well acquainted with all the tools on the internet that let you find
 the position of genes on a chromosome. The only thing is, none of them has
 a function to upload a list of e.g. 300 genes. I have a list of
 overexpressed genes, and I want to know which chromosome each is on so I can
 see if there is some kind of link between the genes and their position on a
 chromosome. I have already made a list of the locus of each gene, but now I
 want to make some sort of plot that lets me visualise where the genes
 are located. So the reason I don't use those web tools is that I have 300
 genes and I'm not planning to search for each gene individually.

There are many tools in the R / Bioconductor project that address
these types of issues; a typical use case might use one of the 'org'
packages, e.g., org.Hs.eg.db though there are many others, to extract
information or to map between information types.

> library(org.Hs.eg.db)
> ls(2)
[snip]
> toTable(org.Hs.egCHRLOC[c('1000', '1')])
  gene_id start_location Chromosome
1    1000      -23784933         18
2       1     -241718157          1
3       1     -241733106          1
> toTable(org.Hs.egSYMBOL[c('1000', '1')])
  gene_id symbol
1    1000   CDH2
2       1   AKT3

There are a number of packages for displaying this information, but
usually in conjunction with additional covariates.  GenomeGraphs
provides really pretty pictures (though is more for detailed
presentation of individual genes). rtracklayer is an interface that
lets you lay and navigate tracks on web-based genome browsers.
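
For a quick overview without any of those packages, a table in the layout returned by toTable(org.Hs.egCHRLOC[...]) can be drawn with base graphics. This is only a sketch under stated assumptions: the gene identifiers and two of the positions are made up, and the column names are taken from the output above.

```r
# Made-up example in the layout of the toTable() output.
loc <- data.frame(gene_id = c("g1", "g2", "g3", "g4"),
                  start_location = c(-23784933, -241718157, 50000000, 120000000),
                  Chromosome = c("18", "1", "1", "X"))

# A negative start_location marks the antisense strand, so use abs().
loc$mb <- abs(loc$start_location) / 1e6
loc$Chromosome <- factor(loc$Chromosome, levels = c("1", "18", "X"))

# Genes per chromosome, then one row of tick marks per chromosome.
per_chr <- table(loc$Chromosome)
per_chr
stripchart(mb ~ Chromosome, data = loc, pch = "|",
           xlab = "position (Mb)", ylab = "chromosome")
```

With a few hundred genes, clusters along a chromosome show up as dense runs of tick marks.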

The place to start with Bioconductor is http://bioconductor.org, e.g.,

  basic install: http://bioconductor.org/docs/install/
  package list: http://bioconductor.org/packages/release/Software.html

 source('http://bioconductor.org/biocLite.R')
 biocLite() # default packages
 biocLite('org.Hs.eg.db') #  specific package
 library(org.Hs.eg.db)

Look to the AnnotationDbi 'vignettes', either on-line (link to the
AnnotationDbi package page from the list above) or in the package
itself (via openVignette()).

Any follow-up questions about Bioconductor should go to the
Bioconductor mailing list

  http://bioconductor.org/docs/mailList.html

Martin


 The lodplot package looks promising (I had already found it ;-), but thanks
 anyway!), but I have not yet figured out how to use it properly. I've not
 found any tutorial or example data to test it.

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793



Re: [R] Predictions with GAM

2009-01-16 Thread Ken Knoblauch
Hi,

Robbert Langenberg mcrelay at gmail.com writes:
 I am trying to get a prediction of my GAM on a response type. So that I
 eventually get plots with the correct values on my ylab.
 The problem I am encountering now is that I cannot seem to get it done for
 the following type of model:
 
 model3 <- gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)
 
 My mapID consists of 8 levels, for which I get individual plots with
 plot(model3). When I call predict with a newdata argument, as with my first
 model, all columns need to have the same number of rows or else R will not
 accept it, of course; the column names need to include at least day and mapID.
 This way I cannot get a prediction working for this GAM; I am confused
 by this part of the model: s(day,by=mapID).
 
 I hope someone can help me out with this,
 
 Sincerely yours,
 
 Robbert Langenberg
I'm not sure that this will work for you, but I had a similar
situation and was able to get predict to work (after helpful
advice from Simon Wood) with a by variable by generating 
a model matrix for a model with the interaction of the covariate 
and the by term, something like

model.matrix(~ day:mapID - 1, data = mergeday)

in your case.
I added the appropriate columns into my data frame and 
also to the newdata for predict.  You can see an example 
in the appendix of

http://www.journalofvision.org/8/16/10/
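
To illustrate what that model matrix contains, here is a minimal sketch on a three-row data frame; the values and the two levels "A" and "B" are made up.

```r
# Made-up data: one covariate and a two-level 'by' factor.
d <- data.frame(day = c(244, 250, 260),
                mapID = factor(c("A", "B", "A")))

# One column per factor level, holding 'day' where the row belongs to that
# level and 0 elsewhere.
mm <- model.matrix(~ day:mapID - 1, data = d)
mm
```

These columns are what gets added to both the fitting data and the newdata in the approach described above.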

HTH,

Ken

-- 
Ken Knoblauch
Inserm U846
Institut Cellule Souche et Cerveau
Département Neurosciences Intégratives
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr



[R] Lattice: how to have multiple wireframe nice intersection?

2009-01-16 Thread Guillaume Chapron

Hello,

This code builds a simple example of 2 wireframes :

require(lattice)
x <- 1:10
y <- 1:10
g <- expand.grid(x = 1:10, y = 1:10, gr = 1:2)
# surface 1: z = x * y; surface 2: the flat plane z = 50
g$z <- c(as.vector(outer(x, y, "*")), rep(50, 100))
wireframe(z ~ x * y, data = g, groups = gr, scales = list(arrows = FALSE))


However, the intersection between the wireframes is not properly  
drawn. Is there a way to fix this with lattice, or should I use   
another package more suitable for this?


Thanks!

Guillaume



[R] Odp: basic boxplot questions

2009-01-16 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org wrote on 16.01.2009 15:24:26:

 dear R experts:
 
 I am playing with boxplots for the first time.  most of it is
 intuitive, although there was less info on the web than I had hoped.
 
 alas, for some odd reason, my R boxplots have some fat black dots, not
 just the hollow outlier plots.  Is there a description of when R draws
 hollow vs. fat dots somewhere?

Standard use of boxplot does not draw filled dots, just hollow points. 
Didn't you by chance use pch=19?


 
 [and what is the parameter to change just the size of these dots?]

Details are on the bxp help page:

outlty, outlwd, outpch, outcex, outcol, outbg: 
outlier line type, line width, point character, point size expansion, 
color, and background color. The default outlty = "blank" suppresses the 
lines and outpch = NA suppresses points. 

 
 Also, let me show my fundamental ignorance:  I am a little surprised
 that the average box boxplot would not show the mean and sdv, too, at
 least optionally.  Is there a common way to accomplish this (e.g., in
 a different color), or do I just construct it myself with standard R
 graphics line() commands?

I always expect a boxplot to show the median and IQR; however, I have seen such 
twisted boxplots elsewhere. E.g. if you search with the CRAN search facility you 
will easily find

R-help archive January 2004: Re: [R] adding mean to boxplot
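
One way to overlay the mean and standard deviation yourself is with points() and arrows() on top of the boxplot. This is a minimal sketch on simulated data; the grouping, colours and sizes are arbitrary choices, not taken from the archived thread mentioned above.

```r
set.seed(42)
d <- data.frame(g = gl(3, 20, labels = c("a", "b", "c")), y = rnorm(60))

# outpch / outcex control the symbol and size of the outlier points only.
bp <- boxplot(y ~ g, data = d, outpch = 1, outcex = 0.8)

# Group means and standard deviations, overlaid in red.
m <- tapply(d$y, d$g, mean)
s <- tapply(d$y, d$g, sd)
at <- seq_along(m)
points(at, m, pch = 19, col = "red")
arrows(at, m - s, at, m + s, angle = 90, code = 3, length = 0.05, col = "red")
```

The boxes sit at x = 1, 2, 3, so the overlays only need the group index as x coordinate.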

Regards
Petr


 
 advice appreciated.
 
 regards,
 
 /iaw
 


Re: [R] data frames with å, ä, and ö (=non-ASCII-characters) from windows to mac os x

2009-01-16 Thread Ivan Alves

Hi,

On my system (see below), it works fine (entering the code below at
the R prompt).  Make sure that the input file is
encoded as UTF-8.


Rgds,

Ivan

> sessionInfo()
R version 2.8.1 Patched (2009-01-14 r47602)
i386-apple-darwin9.6.0

locale:
en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

> structure(list(LANKOD = structure(c(11L, 19L, 10L, 13L, 21L,
7L, 9L, 18L, 8L, 3L, 16L, 6L, 5L, 4L, 15L, 2L, 20L, 17L, 1L,
14L, 12L), .Label = c("AB", "AC", "BD", "C", "D", "E", "F", "G",
"H", "I", "K", "M", "N", "O", "S", "T", "U", "W", "X", "Y", "Z"
), class = "factor"), Län = structure(c(1L, 4L, 3L, 5L, 6L, 7L,
8L, 2L, 9L, 10L, 20L, 21L, 13L, 14L, 15L, 16L, 17L, 18L, 12L,
19L, 11L), .Label = c("Blekinge län", "Dalarnas län", "Gotlands län",
"Gävleborgs län", "Hallands län", "Jämtlands län", "Jönköpings län",
"Kalmar län", "Kronobergs län", "Norrbottens län", "Skåne län",
"Stockholms län", "Södermanlands län", "Uppsala län", "Värmlands län",
"Västerbottens län", "Västernorrlands län", "Västmanlands län",
"Västra Götalands län", "Örebro län", "Östergötlands län"), class =
"factor")), .Names = c("LANKOD", "Län"), class = "data.frame",
row.names = c("0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
"11", "12", "13", "14", "15", "16", "17", "18", "19", "20"))

   LANKOD                  Län
0       K         Blekinge län
1       X       Gävleborgs län
2       I         Gotlands län
3       N         Hallands län
4       Z        Jämtlands län
5       F       Jönköpings län
6       H           Kalmar län
7       W         Dalarnas län
8       G       Kronobergs län
9      BD      Norrbottens län
10      T           Örebro län
11      E    Östergötlands län
12      D    Södermanlands län
13      C          Uppsala län
14      S        Värmlands län
15     AC    Västerbottens län
16      Y  Västernorrlands län
17      U     Västmanlands län
18     AB       Stockholms län
19      O Västra Götalands län
20      M            Skåne län
> Länkarta <- structure(list(LANKOD = structure(c(11L, 19L, 10L, 13L, 21L,
7L, 9L, 18L, 8L, 3L, 16L, 6L, 5L, 4L, 15L, 2L, 20L, 17L, 1L,
14L, 12L), .Label = c("AB", "AC", "BD", "C", "D", "E", "F", "G",
"H", "I", "K", "M", "N", "O", "S", "T", "U", "W", "X", "Y", "Z"
), class = "factor"), Län = structure(c(1L, 4L, 3L, 5L, 6L, 7L,
8L, 2L, 9L, 10L, 20L, 21L, 13L, 14L, 15L, 16L, 17L, 18L, 12L,
19L, 11L), .Label = c("Blekinge län", "Dalarnas län", "Gotlands län",
"Gävleborgs län", "Hallands län", "Jämtlands län", "Jönköpings län",
"Kalmar län", "Kronobergs län", "Norrbottens län", "Skåne län",
"Stockholms län", "Södermanlands län", "Uppsala län", "Värmlands län",
"Västerbottens län", "Västernorrlands län", "Västmanlands län",
"Västra Götalands län", "Örebro län", "Östergötlands län"), class =
"factor")), .Names = c("LANKOD", "Län"), class = "data.frame",
row.names = c("0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
"11", "12", "13", "14", "15", "16", "17", "18", "19", "20"))

> ls()
[1] "Länkarta"
On 16 Jan 2009, at 14:13, Gustaf Rydevik wrote:


Hi,
I ran into this issue previously and managed to solve it, but I've
forgotten how and am getting frustrated...

I have a data frame (see below) with scandinavian characters in R
(2.7.1) running on a Win Xp-computer. I save the data frame in an
RData-file on a usb stick, and load() it in R (2.8.0) running on OS X
10.5. Now the name of the data frame and all factor labels with
scandinavian characters are scrambled. How do I make R in OS X read my
data frame?
From what I've managed to find in the list archives and the FAQ I  
either

1) run
Sys.setlocale("LC_ALL","en_US.UTF-8") ### Doesn't change anything
or
2) run
 defaults write org.R-project.R force.LANG en_US.UTF-8
in the terminal, which doesn't help either.
I must admit that I couldn't quite follow what documentation I found
on locales, so I might have messed up somewhere along the line.

Many thanks in advance for your help!

Regards,

Gustaf





Re: [R] Fitting of lognormal distribution to lower tail experimental data

2009-01-16 Thread Mattias Brännström
Thank you, David!

I agree with and appreciate your analysis, which will definitely influence my
analysis of this data, but I would still like you to set it aside(!)

The standard routine in the field, which is beyond my control, is to assume a
lognormal distribution so that results are comparable with other
materials (the comparison is made on COV). For that reason I have to use it,
even if it is not statistically defensible for this particular data.

So, if I rephrase the question to be (more general):
How would you fit a lognormal distribution to the lower 10% tail of the
data (assuming it was lognormal)? What functions to use?
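
One possible reading of "fitting to the lower tail" is a censored maximum-likelihood fit: values above the 10% quantile contribute only the fact that they exceed the cutoff. The sketch below follows that assumption with dlnorm(), plnorm() and optim(); the simulated data and starting values are made up, and this is not necessarily the routine used in the field.

```r
# Simulated stand-in for the measurements (made up for illustration).
set.seed(1)
x <- rlnorm(300, meanlog = 3.6, sdlog = 0.15)

# Keep exact values only in the lower 10% tail; the rest are treated as
# right-censored at the cutoff.
cutoff <- quantile(x, 0.10)
obs    <- x[x <= cutoff]
n_cens <- sum(x > cutoff)

# Negative log-likelihood; sdlog is optimised on the log scale to stay positive.
negll <- function(p) {
  mu <- p[1]
  s  <- exp(p[2])
  -(sum(dlnorm(obs, mu, s, log = TRUE)) +
      n_cens * plnorm(cutoff, mu, s, lower.tail = FALSE, log.p = TRUE))
}

fit <- optim(c(mean(log(x)), log(sd(log(x)))), negll)
est <- c(meanlog = fit$par[1], sdlog = exp(fit$par[2]))

# Coefficient of variation implied by the fitted lognormal.
cov_est <- sqrt(exp(est[["sdlog"]]^2) - 1)
est
cov_est
```

With real data, x would be the measured values and the 0.10 in quantile() the tail fraction of interest.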

Best regards,
Mattias
14.8104060497551
16.0803849713769
17.4968110450967
20.1117010953086
21.6845345194074
22.1964808750696
22.6873957765245
22.7739430649247
23.9264878434242
24.0222175448625
24.2589089305143
25.0916485959638
25.6616713982665
26.42372554061
27.1110230291713
27.5506997537485
27.7723436488626
28.5526342456366
28.5937291162256
28.9858602338711
29.0613508454069
29.0909808586811
29.2550164645902
29.2592989298793
29.6618509042981
29.9425395348144
30.2713842375835
30.3578677416218
30.3691588410138
30.3828682984843
30.4226905469563
30.5067979753883
30.5374335721223
30.5377759972106
30.5388417498134
30.6197685055474
30.7562765495333
30.859344040433
31.0055304357074
31.1385745478821
31.2613609610913
31.3581765472267
31.3786215131359
31.3862882622892
31.5134036026533
31.5518777553211
31.6665459342876
31.6966977981316
31.7570612273693
31.7921492877195
31.8916639064893
32.0060685276589
32.0175653961169
32.0957893864849
32.2300707106859
32.2332967187291
32.2933752614657
32.3576534230736
32.4077953094961
32.440737957141
32.4555840754853
32.5778411557745
32.8123917532014
32.8938018752775
32.9005162471121
33.07109861048
33.2169544508162
33.2528421050785
33.2538040913417
33.2629253254612
33.5137399745455
33.646430499
33.6635730620378
33.8081149852229
33.9678615607469
34.0229870935982
34.1334960727572
34.2686698621255
34.281848082171
34.2835695816904
34.285140119695
34.3103279775051
34.3231318200957
34.3298117057378
34.3580807993554
34.375123159033
34.4358460611865
34.6628865380987
34.6882398181505
34.7138670512661
34.8319635103069
34.8543564286965
34.8740064707112
34.8831091812053
35.0783682108864
35.1589202345914
35.1940609286179
35.1963400807163
35.2225030547279
35.2948450801441
35.469980224983
35.4834158010586
35.5066986045073
35.50715377932
35.5581064872091
35.5929578196659
35.6202992874957
35.8503591228778
35.8864712453884
35.9236676029501
36.0272009858509
36.0582893169656
36.0697828249579
36.0733810205568
36.0796120864424
36.1046602505238
36.2146409925046
36.2439056981677
36.2519350403298
36.322995147923
36.3325951679294
36.3998442944923
36.4825302220394
36.5356445443208
36.542777089046
36.5472978834097
36.6042330833268
36.6859461526364
36.7276360969522
36.7559303592816
36.7634229191255
36.763717634632
36.7941371199203
36.7963035045816
37.0158687639306
37.0348436474432
37.0682601112198
37.0753839210413
37.1072512130614
37.218572715278
37.2365135026929
37.308997089496
37.4740533868667
37.5061839013662
37.5920632597573
37.5985612515559
37.6442491468094
37.7885711256696
37.8031769603556
37.8044638170924
37.8758257907878
37.9492755772546
37.9684449085749
38.1925855742105
38.3276830837628
38.36856145824
38.425438354474
38.4871152468451
38.5850141197061
38.7021862984193
38.7281733035293
38.7906053310133
38.8167873845123
38.879602998118
38.9257736530842
39.1219500124099
39.126766061353
39.1572739123381
39.1854685896754
39.2465242986911
39.2681394600736
39.3020749787304
39.3491652552564
39.3519847700339
39.4481343103317
39.4641601623938
39.495558462879
39.5276929723755
39.5743412754252
39.641904265762
39.6805873230154
39.7328651773131
39.7425791919387
39.7862127456679
39.7862858916594
39.8029935150385
39.8184701961103
39.83254032456
39.860678129295
39.9503278266256
39.9590265687971
39.9757707281345
40.0615365250434
40.1029961842667
40.1057284653796
40.2691606972348
40.2757445303632
40.3168035819963
40.3293541033457
40.5323828990285
40.5968897591303
40.6751634982263
40.6988406405023
40.7068328640941
40.7675314086004
40.7776623368738
40.7908603255974
40.7955349906672
40.8035933414247
40.9383315440048
40.9558208833268
40.9606594573753
40.9784894747295
41.0046771495095
41.0177042749474
41.1282624024752
41.1359777151937
41.1980177115138
41.2528202892208
41.4075724424903
41.4116661876018
41.4467994208966
41.4552126570036
41.4568329047393
41.6507621039357
41.7203545510097
41.7436507425138
41.7725271219265
41.8136233981782
41.8338507732199
41.9178452756984
41.9378524875993
41.9424929587518
42.0160174497851
42.0327031030869
42.0695865194719
42.1050583647999
42.154742564893
42.2913255282883
42.2981875529339
42.3378903656232
42.40335129
42.437157423152
42.4731363693922
42.4809382643832
42.5203781472499
42.5249638381712
42.5444709444123
42.5741748422381
42.6845948827814
42.7094479620975
42.7756035736011
42.7789179566436
42.7914314897286
42.8199466233091
42.8302929470511
42.8867019728931

Re: [R] autocorrelation

2009-01-16 Thread Michael Denslow

 
 Hi
 Is any multiple regression-like test with correction for
 autocorrelation ?

If I understand your question, yes. Take a look at the spdep package for 
starters. Also you may find the following references helpful. 

Dormann et al. 2007. Methods to account for spatial autocorrelation in the 
analysis of species distributional data: a review. Ecography 30:609-628

Also the book by Bivand et al. 2008. (Applied Spatial Data Analysis with R. 
from Springer) is very good.

Hope this helps,

Michael Denslow

I.W. Carpenter Jr. Herbarium [BOON]
Appalachian State University
Boone, North Carolina U.S.A.

-- AND --

Communications Manager
Southeastern Regional Network of Expertise and Collections
sernec.org



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Gundala Viswanath
Hi Gabor,

Do you mean that storing the data with 'sqldf' doesn't take memory?
For example, I have a 3GB data file. Read into a standard R object with read.table(),
the object size roughly doubles to ~6GB. My current 4GB of RAM
cannot handle that.

Do you mean that with sqldf this is not an issue?
Why is that?

Sorry for my naive question.

- Gundala Viswanath
Jakarta - Indonesia



On Fri, Jan 16, 2009 at 9:09 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 On Fri, Jan 16, 2009 at 5:52 AM, r...@quantide.com r...@quantide.com wrote:
 I agree on the database solution.
 Databases are the right tool to solve this kind of problem.
 Only consider the start-up cost of setting up the database. This could be a
 very time-consuming task if someone is not familiar with database
 technology.

 Using sqldf as mentioned previously on this thread allows one to use
 the SQLite database with no setup at all.  sqldf automatically creates
 the database, generates the record layout, loads the file (not going through
 R but outside of R so R does not slow it down) and extracts the
 portion you want into R, issuing the appropriate calls to RSQLite/DBI and
 destroying the database afterwards, all automatically.  When you
 install sqldf it automatically installs RSQLite and the SQLite database
 itself so the entire installation is just one line: install.packages("sqldf")



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Gabor Grothendieck
Only the portion you extract is ever in R -- the file itself is read
into a database without ever going through R, so your memory requirements
correspond to what you extract, not the size of the file.



[R] Missing file to run Rcmd batch on Windows

2009-01-16 Thread Brigid Mooney
Hi,

I'm trying to run an R script using Rcmd Batch from the command line on a
Windows Vista machine.  I am using R version 2.8.1.

I installed the batch files 4-3 found at
http://cran.r-project.org/contrib/extra/batchfiles/ and added them to my
path.
I also had to install the latest version of perl (it's Strawberry perl if
that makes a difference) and have added this to my path.

Now when I run the command: Rcmd batch TestBatch.R TestOutput.txt from the
command line, I get the error:

Can't open perl script C:\Progra~1\R\R-28~1.0\bin\batch: No such file or
directory

Just for reference, TestBatch.R contains only one line: print("hello world")

Does anyone have any idea what this file is that I might be missing?  Or
is there some other mistake I'm making in trying to run a script from
the command line?

Thanks,
-Brigid



Re: [R] Missing file to run Rcmd batch on Windows

2009-01-16 Thread Gabor Grothendieck
Try writing BATCH in upper case.




Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Gundala Viswanath
Hi,

 Unless you specify an in-memory database the database is stored on disk.

Thanks for your explanation.
I just downloaded 'sqldf'.

Where can I find the option for that? I can't see it in sqldf.

I looked at:
envir = parent.frame()

but that doesn't appear to be the one.

- Gundala Viswanath
Jakarta - Indonesia


 On Fri, Jan 16, 2009 at 10:59 AM, Gundala Viswanath gunda...@gmail.com 
 wrote:
 Hi Gabor,

 the file itself is read  into a database

 The above doesn't use RAM memory?

 Rgds,
 GV.

 without ever going through R so your memory requirements correspond to what
 you extract, not the size of the file.



[R] Use of [:alnum:] or . in gsub() etc..

2009-01-16 Thread ppaarrkk


test = c ( "AAABBB", "CCC" )


This works :

gsub ( "[A-Z]", "2", test )



None of these do :

gsub ( "[A-Z]", "[:alnum:]", test )
gsub ( "[A-Z]", "[[:alnum:]]", test )
gsub ( "[A-Z]", "[:alnum:]", test )
gsub ( "[A-Z]", "[[:alnum:]]", test )
gsub ( "[A-Z]", "^[:alnum:]$", test )



What am I doing wrong, please ?



Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Gabor Grothendieck
If that refers to using a database on disk to temporarily hold
the file then example 6 on the home page shows it, as mentioned,
and you may wish to look at the other examples there too and
there is further documentation in the ?sqldf help file.



Re: [R] Use of [:alnum:] or . in gsub() etc..

2009-01-16 Thread Marc Schwartz
on 01/16/2009 10:13 AM ppaarrkk wrote:
 
 test = c ( "AAABBB", "CCC" )
 
 
 This works :
 
 gsub ( "[A-Z]", "2", test )
 
 
 
 None of these do :
 
 gsub ( "[A-Z]", "[:alnum:]", test )
 gsub ( "[A-Z]", "[[:alnum:]]", test )
 gsub ( "[A-Z]", "[:alnum:]", test )
 gsub ( "[A-Z]", "[[:alnum:]]", test )
 gsub ( "[A-Z]", "^[:alnum:]$", test )
 
 
 
 What am I doing wrong, please ?

The argument order is off. You are in effect trying to replace the
characters A-Z with the literal string "[[:alnum:]]". Thus:

> gsub("[A-Z]", "[[:alnum:]]", test)
[1] "[[:alnum:]][[:alnum:]][[:alnum:]][[:alnum:]][[:alnum:]][[:alnum:]]"
[2] "[[:alnum:]][[:alnum:]][[:alnum:]]"


Recall the basic syntax for gsub() is:

  gsub(SearchPattern, ReplacementVector, Source)


What you want is:

> gsub("[[:alnum:]]", "2", test)
[1] "222222" "222"
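
The argument order described above can be checked with a short sketch. The result names (all_to_2, literal, first_only) are only illustrative, and sub() is included as an aside that was not part of the original question.

```r
test <- c("AAABBB", "CCC")

# Pattern first, replacement second: every alphanumeric character becomes "2"
all_to_2 <- gsub("[[:alnum:]]", "2", test)

# A character class in the replacement slot is not interpreted;
# it is pasted into the result literally
literal <- gsub("[A-Z]", "[[:alnum:]]", test)

# sub() takes the same arguments but replaces only the first match per element
first_only <- sub("[[:alnum:]]", "2", test)

print(all_to_2)
print(first_only)
```

Character classes such as [[:alnum:]] only have meaning in the pattern argument, which is why moving the class to the first position gives the intended result.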


HTH,

Marc Schwartz



Re: [R] Missing file to run Rcmd batch on Windows

2009-01-16 Thread Duncan Murdoch

On 1/16/2009 10:51 AM, Brigid Mooney wrote:

Hi,

I'm trying to run an R script using Rcmd Batch from the command line on a
Windows Vista machine.  I am using R version 2.8.1.

I installed the batch files 4-3 found at
http://cran.r-project.org/contrib/extra/batchfiles/ and added them to my
path.
I also had to install the latest version of perl (it's Strawberry perl if
that makes a difference) and have added this to my path.

Now when I run the command: Rcmd batch TestBatch.R TestOutput.txt from the
command line, I get the error:

Can't open perl script C:\Progra~1\R\R-28~1.0\bin\batch: No such file or
directory

Just for reference, TestBatch.R contains only one line: print("hello world")

Does anyone have any idea what this file is that I might be missing?  Or
is there some other mistake I'm making in trying to run a script from
the command line?


Rcmd runs some commands based on internal matches to the name, and if 
not found there, it looks in the file system.  So there is no internal 
command batch, and no external command of that name either.


The actual name (as Rcmd --help will tell you) is BATCH, and it is 
implemented as an internal command, so case matters.
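
As a rough illustration of the point about case, the same batch run can be driven from inside an R session. This is only a sketch: it assumes the R front end found under R.home("bin") can be started from the current session, it uses the cross-platform "R CMD BATCH" form rather than Rcmd, and the temporary file names stand in for TestBatch.R and TestOutput.txt.

```r
# Write a one-line script like TestBatch.R from the post (temporary paths are assumptions)
script <- tempfile(fileext = ".R")
output <- tempfile(fileext = ".txt")
writeLines('print("hello world")', script)

# Call the R front end that belongs to the running session.
# Note the upper-case BATCH: the lower-case spelling is not a known command.
r_bin <- file.path(R.home("bin"), "R")
status <- system2(r_bin, c("CMD", "BATCH", "--no-save",
                           shQuote(script), shQuote(output)))

# The output file echoes the command followed by what it printed
batch_log <- readLines(output)
print(batch_log)
```

On the Windows command line the equivalent would be the command from the original post with BATCH in upper case.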


Duncan Murdoch



[R] Sweave documents have corrupted double quotes

2009-01-16 Thread Paul Johnson
I'm attaching a file foo.Rnw and I'm hoping some of you might run it
through your R & LaTeX systems to find out if the double-quotes in
typewriter font turn out as black boxes (as they do for me).  If you
don't use Sweave, but you have a system with a working version of R
and LaTeX, the file gives the instructions you need to use to process
the file.

The file itself explains the problem. You can see the flawed output on
my web site

http://pj.freefaculty.org/latex/foo.pdf

I'm running Ubuntu Linux 8.10 with R 2.8.1 and TexLive 2007 (which is
provided with the distribution).

This is not a new problem. I noticed it two years ago while using
TeTeX on Fedora Linux, and so I doubt that this is specific to
TeXLive.  Back then, I took the path of least resistance and stopped using
the typewriter font.  That is becoming inconvenient, however.

I would sincerely appreciate any pointers you have.

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


Re: [R] Sweave documents have corrupted double quotes

2009-01-16 Thread Ben Bolker
Paul Johnson pauljohn32 at gmail.com writes:

 
 I'm attaching a file foo.Rnw and I'm hoping some of you might run it
 through your R & LaTeX systems to find out if the double-quotes in
 typewriter font turn out as black boxes (as they do for me).  If you
 don't use Sweave, but you have a system with a working version of R
 and LaTeX, the file gives the instructions you need to use to process
 the file.
 

  I can't help except to say that I replicated the problem (but I'm
running exactly the same OS/R combination so maybe that's not too 
surprising).  Oddly enough upquote.sty says explicitly

 It does not affect \tt, \texttt, etc.

But that seems to be false.

  Ben Bolker



Re: [R] Sweave documents have corrupted double quotes

2009-01-16 Thread Vincent Goulet

Paul,

The file did not make it to the list.

Did you try loading Sweave with the 'noae' option, that is:

\usepackage[noae]{Sweave}

This *may* solve your issue.

HTH Vincent


---
  Vincent Goulet
  Acting Chair, Associate Professor
  École d'actuariat
  Université Laval, Québec
  vincent.gou...@act.ulaval.ca   http://vgoulet.act.ulaval.ca



[R] (no subject)

2009-01-16 Thread Henning Wildhagen
Dear users,

I just installed the latest version of R, 2.8.1, on my computer (OS Windows 
XP). Then I tried to update the packages copied from my old R version with

update.packages(ask=F)

However, I get the following warnings:

Warning: unable to access index for repository 
http://cran.ch.r-project.org/bin/windows/contrib/2.8
Warning: unable to access index for repository 
http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.8

and the updating fails.

Is it a server problem of CRAN? 
Thanks for your help,

Henning



Re: [R] Sweave documents have corrupted double quotes

2009-01-16 Thread Paul Johnson
On Fri, Jan 16, 2009 at 10:43 AM, David Winsemius
dwinsem...@comcast.net wrote:
 Dear Dr Johnson;


 I'm not sure if you get copies of your posts. If you do can you check to see
 if the list-server kept the attachment? My copy did not have one.

 --
 Best
 David winsemius


Hm. Well, I do get the attachment, and don't understand why you don't.

But if you are willing, you can get the file here, same directory as
the dvi and pdf output:

http://pj.freefaculty.org/latex/foo.Rnw






-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas



[R] installing mclust and flexmix on linux

2009-01-16 Thread Tim F Liao
I've been trying to install some R packages such as mclust and flexmix on linux 
but have had the following error messages.

 I've been trying to install mclust on my notebook which has linpus linux lite 
 os and I have installed R as well as some packages all right.  However, when 
 I tried to install mclust, it gave me the following messages.  Any 
 suggestions?

Tim



> install.packages('mclust','/usr/lib/R/library',destdir='/usr/tmp')
trying URL 'http://cran.cnr.Berkeley.edu/src/contrib/mclust_3.1-10.tar.gz'
Content type 'application/x-gzip' length 231992 bytes (226 Kb)
opened URL
 ==
downloaded 226 Kb

* Installing *source* package 'mclust' ...
** libs
WARNING: R include directory is empty -- perhaps need to install R-devel.rpm or 
similar
/usr/lib/R/bin/SHLIB: line 162: make: command not found
ERROR: compilation failed for package 'mclust'
** Removing '/usr/lib/R/library/mclust'
Updating HTML index of packages in '.Library'
Warning message:
In install.packages("mclust", "/usr/lib/R/library", destdir = "/usr/tmp") :
 installation of package 'mclust' had non-zero exit status



[R] (no subject)

2009-01-16 Thread ursachi
Dear all,

Can anybody help me with an RExcel tutorial? Maybe some example on which
functions can be used/how to use it... I have installed it on my computer,
using the R(D)COM server.

Thank you all in advance,
Irina Ursachi.



[R] Using optim with exponential power distribution

2009-01-16 Thread Ronald Bialozyt
Hello,

I am trying to fit an exponential power distribution 

y = b/(2*pi*a^2*gamma(2/b))*exp(-(x/a)^b)

to a bunch of data for x and y I have in a table.
> data
        x   y
1      25  27
2      75  59
3     125 219
...
259 12925   1
260 12975   0

I know optim should do a minimisation, therefore I used the following as the 
optimisation function:

opt.power <- function(val, x, y) { 
   a <- val[1]; 
   b <- val[2]; 
   sum(y - b/(2*pi*a^2*gamma(2/b))*exp(-(x/a)^b));
}

I call: (with xm and ym the data from the table)

a1 <- c(0.2, 100)
opt <- optim(a1, opt.power, method="BFGS", x=xm, y=ym)

but no optimisation of the parameters in a1 takes place.
Any ideas?
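
Two things in the setup above may be worth checking, shown here as a hedged sketch rather than a definitive fix. First, at the starting values a = 0.2 and b = 100 the kernel evaluates to zero on the whole x grid, so the objective is flat there. Second, a sum of signed residuals lets positive and negative errors cancel, so a sum of squares is used below instead. The synthetic y values, the constant "scale" (a hypothetical known factor converting the density to counts), the alternative starting values and the use of the default Nelder-Mead method are all assumptions made for illustration.

```r
# Kernel from the post
epd <- function(x, a, b) {
  b / (2 * pi * a^2 * gamma(2 / b)) * exp(-(x / a)^b)
}

# Least-squares objective: squared residuals cannot cancel each other out.
# 'scale' is a hypothetical known constant that converts the density to counts.
sse <- function(val, x, y, scale) {
  a <- val[1]
  b <- val[2]
  if (a <= 0 || b <= 0) return(Inf)  # keep the search inside the valid region
  sum((y - scale * epd(x, a, b))^2)
}

# Synthetic stand-in for the table in the post (same x grid, invented y values)
xm <- seq(25, 12975, by = 50)
scale <- 2e9
ym <- scale * epd(xm, a = 1500, b = 1.2)

# At the starting values from the post the kernel is numerically zero everywhere,
# so there is no slope for a gradient-based method to follow.
flat_at_start <- all(epd(xm, a = 0.2, b = 100) == 0)

# Start from values on the scale of the data instead
start <- c(1000, 1)
fit <- optim(start, sse, x = xm, y = ym, scale = scale)  # Nelder-Mead by default
print(fit$par)
```

Whether a scale factor is needed depends on how the y column was obtained; if it is unknown it could be added as a third parameter.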

-- 
Ciao
Ronald



Re: [R] faster version of split()?

2009-01-16 Thread David Winsemius

Henrique's solution seems sensible. Another might be:

> df = data.frame(x = sample(7:9, 10, rep = T), y = sample(1:5, 10, rep = T))

> table(df)
   y
x   1 2 3 4 5
  7 1 0 1 0 2
  8 0 1 0 0 1
  9 0 1 1 2 0

> rowSums(table(df) > 0)
7 8 9
3 2 3


#-same as Henrique's
> count <- function(x) length(unique(na.omit(x)))
> with(df, tapply(y, x, count))
7 8 9
3 2 3
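
For anyone wanting to compare the approaches side by side, here is a small sketch on fixed data chosen to match the table shown above (the values themselves are illustrative). It only checks that the three versions agree; whether any of them is faster on 159503 rows would need to be timed on the real data.

```r
# Fixed data consistent with the cross-tabulation above (values are illustrative)
df <- data.frame(x = c(7, 7, 7, 7, 8, 8, 9, 9, 9, 9),
                 y = c(1, 3, 5, 5, 2, 5, 2, 3, 4, 4))

count <- function(x) length(unique(na.omit(x)))

by_split  <- sapply(split(df$y, df$x), count)  # approach from the original question
by_tapply <- with(df, tapply(y, x, count))     # grouping without an explicit split
by_table  <- rowSums(table(df) > 0)            # cross-tabulate, count non-empty cells

print(by_table)
```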
--
David Winsemius

On Jan 16, 2009, at 5:10 AM, Simon Pickett wrote:


Hi all,

I want to calculate the number of unique observations of y in each  
level of x from my data frame df.


this does the job but it is very slow for this big data frame  
(159503 rows, 11 columns).


group.list <- split(df$y,df$x)
count <- function(x) length(unique(na.omit(x)))
sapply(group.list, count, USE.NAMES=TRUE)

I couldn't find the answer searching for "slow split" and "split  
time" on the help forum.


I am running R version 2.2.1, on a machine with 4gb of memory and  
I'm using windows 2000.


thanks in advance,

Simon.







- Original Message - From: Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no 


To: Gundala Viswanath gunda...@gmail.com
Cc: R help r-h...@stat.math.ethz.ch
Sent: Friday, January 16, 2009 9:30 AM
Subject: Re: [R] Value Lookup from File without Slurping


you might try to iteratively read a limited number of lines in a
batch using readLines:

# filename, the name of your file
# n, the maximal count of lines to read in a batch
connection = file(filename, open="rt")
while (length(lines <- readLines(con=connection, n=n))) {
 # do your stuff here
}
close(connection)

?file
?readLines
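
One way the "do your stuff here" step could be filled in for the tag lookup is sketched below. The function name lookup_tags, the batch size, and the assumption that the file holds whitespace-separated "tag value" pairs with "#" comment lines are all illustrative, and the temporary file is only a small stand-in for repo.txt.

```r
# Look up the values for 'tags' in a whitespace-separated "tag value" file,
# reading at most n lines at a time so the whole file is never held in memory.
lookup_tags <- function(filename, tags, n = 10000L) {
  con <- file(filename, open = "rt")
  on.exit(close(con))
  found <- setNames(rep(NA_real_, length(tags)), tags)
  while (length(lines <- readLines(con, n = n)) > 0) {
    lines <- trimws(lines)
    lines <- lines[nzchar(lines) & !startsWith(lines, "#")]  # drop blanks, comments
    if (length(lines) == 0) next
    parts <- strsplit(lines, "[[:space:]]+")
    tag   <- vapply(parts, `[`, character(1), 1)
    value <- as.numeric(vapply(parts, `[`, character(1), 2))
    hit <- tag %in% tags
    found[tag[hit]] <- value[hit]
  }
  found
}

# Small stand-in for repo.txt, written to a temporary file
repo <- tempfile(fileext = ".txt")
writeLines(c("# tag  value", "AAA 0.2", "AAT 0.3", "AAC 0.02",
             "AAG 0.02", "ATA 0.3", "ATT 0.7"), repo)

qr <- c("AAC", "ATT")
result <- lookup_tags(repo, qr, n = 2)
print(result)
```

Only one batch of lines plus the small result vector is held at any time, at the cost of scanning the whole file once per call.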

vQ


Gundala Viswanath wrote:

Dear all,

I have a repository file (let's call it repo.txt)
that contain two columns like this:

# tag  value
AAA   0.2
AAT   0.3
AAC   0.02
AAG   0.02
ATA   0.3
ATT   0.7

Given another query vector



qr <- c("AAC", "ATT")



I would like to find the corresponding value for each query above,
yielding:

0.02
0.7

However, I want to avoid slurping the whole of repo.txt into an object  
(e.g. a hash).

Is there any way to do that?

The reason is that repo.txt is very, very large
(millions of lines,
with tag length  30 bp), and my PC memory is too small to hold it.






[R] Updating packages under R 2.8.1

2009-01-16 Thread Henning Wildhagen
Dear users,

I just installed the latest version of R, 2.8.1, on my computer (OS Windows 
XP). Then I tried to update the packages copied from my old R version with

update.packages(ask=F)

However, I get the following warnings:

Warning: unable to access index for repository 
http://cran.ch.r-project.org/bin/windows/contrib/2.8
Warning: unable to access index for repository 
http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.8

and the updating fails.

Is it a server problem of CRAN? 
Thanks for your help,

Henning
 




Re: [R] Sweave documents have corrupted double quotes

2009-01-16 Thread Paul Johnson
On Fri, Jan 16, 2009 at 11:06 AM, Vincent Goulet
vincent.gou...@act.ulaval.ca wrote:
 Paul,

 The file did not make it to the list.

 Did you try loading Sweave with the 'noae' option, that is:

\usepackage[noae]{Sweave}

 This *may* solve your issue.

 HTH Vincent


Wow. That does fix it. I bow to you.  I need to learn how this option
can be put into the Rnw file itself.

Are you the same Vincent Goulet who offers the customized Emacs for
Windows? (http://vgoulet.act.ulaval.ca/ressources/emacs).  If so,
thanks again for that. Keep up the good work.  It has saved me and my
students tons of time setting up Windows systems.


-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2009-01-16 Thread milton ruser
Dear Henning,
Try other repositories.
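
A minimal sketch of what "trying another repository" could look like from the R prompt. The mirror URL below is an assumption chosen for illustration (any reachable CRAN mirror would do), and use_mirror is a hypothetical helper name, not part of R.

```r
# Hypothetical helper: point the "CRAN" entry of the repos option at another mirror.
# The default URL is an assumption; substitute any mirror that is reachable for you.
use_mirror <- function(url = "https://cloud.r-project.org") {
  repos <- getOption("repos")
  repos["CRAN"] <- url
  options(repos = repos)
  invisible(getOption("repos"))
}

use_mirror()
getOption("repos")

# Then retry the update (needs network access, so left commented out here):
# update.packages(ask = FALSE)
```

In an interactive session, chooseCRANmirror() offers the same choice through a menu.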

Best wishes,

miltinho
brazil

On Fri, Jan 16, 2009 at 2:08 PM, Henning Wildhagen hwildha...@gmx.de wrote:

 Dear users,

 I just installed the latest version of R, 2.8.1, on my computer (OS Windows
 XP). Then I tried to update the packages copied from my old R version with

 update.packages(ask=FALSE)

 However, I get the following warnings:

 Warning: unable to access index for repository
 http://cran.ch.r-project.org/bin/windows/contrib/2.8
 Warning: unable to access index for repository
 http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.8

 and the updating fails.

 Is it a server problem at CRAN?
 Thanks for your help,

 Henning


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave documents have corrupted double quotes

2009-01-16 Thread David Winsemius

Looking at the display I see this line:

\texttt{Typewriter Font has ``double quotes''}

... displayed with leading backquotes but trailing singlequotes.  
Was that intended?


--
David Winsemius
On Jan 16, 2009, at 12:21 PM, Paul Johnson wrote:


On Fri, Jan 16, 2009 at 10:43 AM, David Winsemius
dwinsem...@comcast.net wrote:

Dear Dr Johnson;


I'm not sure if you get copies of your posts. If you do can you  
check to see

if the list-server kept the attachment? My copy did not have one.

--
Best
David winsemius



Hm. Well, I do get the attachment, and don't understand why you don't.

But if you are willing, you can get the file here, same directory as
the dvi and pdf output:

http://pj.freefaculty.org/latex/foo.Rnw








--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using optim with exponential power distribution

2009-01-16 Thread Stefan Evert

I know optim should do a minimisation, therefore I used as the
optimisation function

opt.power <- function(val, x, y) {
  a <- val[1];
  b <- val[2];
  sum(y - b/(2*pi*a^2*gamma(2/b))*exp(-(x/a)^b));
}

I call: (with xm and ym the data from the table)

a1 <- c(0.2, 100)
opt <- optim(a1, opt.power, method="BFGS", x=xm, y=ym)

but no optimisation of the parameter in a1 takes place.
Any ideas?


It looks to me like you're optimising the _average_ of the differences  
between y and the function, so as long as positive and negative  
differences balance out you get a cost value of 0 (and you can make it  
even smaller if the fitted function is much larger than the actual y  
values, so all differences are negative).


You probably wanted to minimise the squared errors:

sum((y - b/(2*pi*a^2*gamma(2/b))*exp(-(x/a)^b))^2)
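
The point about squared errors can be sketched end to end as follows. The data here are made up (generated from the same kernel with a = 1.5 and b = 2), the names epk and sse are hypothetical, and the start values are arbitrary, so this only illustrates the shape of the call rather than the original poster's fit.

```r
# Kernel from the thread: b / (2*pi*a^2*gamma(2/b)) * exp(-(x/a)^b)
epk <- function(x, a, b) b / (2 * pi * a^2 * gamma(2 / b)) * exp(-(x / a)^b)

# Sum of squared errors: never negative, so residuals cannot cancel each other
sse <- function(val, x, y) {
  a <- val[1]
  b <- val[2]
  sum((y - epk(x, a, b))^2)
}

# Made-up data (assumption): exact kernel values for a = 1.5, b = 2
xm <- seq(0.1, 5, length.out = 50)
ym <- epk(xm, a = 1.5, b = 2)

start <- c(1, 1.5)
opt <- optim(start, sse, x = xm, y = ym)  # default method is Nelder-Mead
opt$par    # estimated a and b
opt$value  # remaining sum of squares
```

With real data it is worth trying several start values, since the surface can be flat when a is far from the scale of x.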




Best regards,
Stefan Evert

[ stefan.ev...@uos.de | http://purl.org/stefan.evert ]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Weighted Kaplan-Meier Statistics

2009-01-16 Thread Ritwik Sinha
Dear All,

I could not locate an implementation of the Weighted Kaplan-Meier
Statistics proposed by Pepe and Fleming, Biometrics. 1989
Jun;45(2):497-507 (http://www.ncbi.nlm.nih.gov/pubmed/2765634) in R.

I am wondering if anyone is aware of a R implementation of the test
statistics proposed in the paper.

Thanks,

Ritwik Sinha
ritwik.si...@gmail.com | +12033042111 | http://ritwik.sinha.googlepages.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problems with extractPrediction in package caret

2009-01-16 Thread Max Kuhn
The issue is the usage of extractPrediction.

   expred <- extractPrediction(rftrain)

should really be

   expred <- extractPrediction(list(rftrain))

Since this function is intended to get predictions across multiple
models, the man file describes the first argument to the
function as a list of objects of the class train.

-- 

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] faster version of split()?

2009-01-16 Thread Peter Dalgaard

Simon Pickett wrote:

Hi all,

I want to calculate the number of unique observations of y in each 
level of x from my data frame df.


this does the job but it is very slow for this big data frame (159503 
rows, 11 columns).


group.list <- split(df$y,df$x)
count <- function(x) length(unique(na.omit(x)))
sapply(group.list, count, USE.NAMES=TRUE)


wouldn't it do with something like

with(df,table(x, is.na(y)))[,1]

or

with(df, tapply(!is.na(y), x, sum))

?
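
A small sketch on made-up data comparing the two kinds of count. Note that the table() and tapply(!is.na(y), ...) suggestions count non-missing observations per level, whereas the original split() code counts distinct non-missing values; the last line below does the latter in a single tapply call. The data frame is invented for illustration.

```r
# Made-up data: level "a" has a repeated value, level "c" has only NA
df <- data.frame(x = c("a", "a", "a", "b", "b", "c"),
                 y = c(1, 1, NA, 2, 3, NA))

# Number of non-missing observations of y in each level of x
n_nonmissing <- with(df, tapply(!is.na(y), x, sum))

# Number of distinct non-missing values of y in each level of x
n_unique <- with(df, tapply(y, x, function(v) length(unique(na.omit(v)))))

n_nonmissing
n_unique
```

Whether this is faster than split() plus sapply() on a large data frame would need to be timed with system.time() on the real data.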


I couldn't find the answer searching for "slow split" and "split time" on 
the help forum.


I am running R version 2.2.1, on a machine with 4gb of memory and I'm 
using windows 2000.


thanks in advance,

Simon.







- Original Message - From: Wacek Kusnierczyk 
waclaw.marcin.kusnierc...@idi.ntnu.no

To: Gundala Viswanath gunda...@gmail.com
Cc: R help r-h...@stat.math.ethz.ch
Sent: Friday, January 16, 2009 9:30 AM
Subject: Re: [R] Value Lookup from File without Slurping



you might try to iteratively read a limited number of lines in a
batch using readLines:

# filename, the name of your file
# n, the maximal count of lines to read in a batch
connection = file(filename, open="rt")
while (length(lines <- readLines(con=connection, n=n))) {
  # do your stuff here
}
close(connection)

?file
?readLines

vQ
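
The "do your stuff here" step could be filled in along these lines. This is only a sketch: lookup_in_chunks is a hypothetical name, the file layout (whitespace-separated tag and value, comment lines starting with #) is assumed from the example in the question, and the temporary file stands in for repo.txt.

```r
# Hypothetical helper: scan the file batch by batch and keep only matching tags
lookup_in_chunks <- function(filename, queries, n = 10000L) {
  found <- setNames(rep(NA_real_, length(queries)), queries)
  con <- file(filename, open = "rt")
  on.exit(close(con))
  while (length(lines <- readLines(con, n = n)) > 0) {
    lines <- lines[!grepl("^#", lines)]              # drop comment/header lines
    parts <- strsplit(trimws(lines), "[[:space:]]+") # split into tag and value
    tags <- vapply(parts, `[`, character(1), 1L)
    vals <- as.numeric(vapply(parts, `[`, character(1), 2L))
    hit <- tags %in% queries
    found[tags[hit]] <- vals[hit]                    # only hits are kept in memory
  }
  found
}

# Small stand-in for repo.txt, written to a temporary file
repo <- tempfile(fileext = ".txt")
writeLines(c("# tag  value", "AAA 0.2", "AAT 0.3", "AAC 0.02",
             "AAG 0.02", "ATA 0.3", "ATT 0.7"), repo)

res <- lookup_in_chunks(repo, c("AAC", "ATT"), n = 2L)
res
```

Only one batch of n lines plus the result vector is held at any time, which is the property the question asks for; the price is a full pass over the file per call.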


Gundala Viswanath wrote:

Dear all,

I have a repository file (let's call it repo.txt)
 that contain two columns like this:

# tag  value
AAA    0.2
AAT    0.3
AAC    0.02
AAG    0.02
ATA    0.3
ATT    0.7

Given another query vector

qr <- c("AAC", "ATT")



I would like to find the corresponding value for each query above,
yielding:

0.02
0.7

However, I want to avoid slurping whole repo.txt into an object (e.g. 
hash).

Is there any ways to do that?

The reason I want to do that because repo.txt is very2 large size
(milions of lines,
with tag length  30 bp),  and my PC memory is too small to keep it.






--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Updating packages under R 2.8.1

2009-01-16 Thread Uwe Ligges



Henning Wildhagen wrote:

Dear users,

i just installed the lastest version of R, 2.8.1 on my computer (OS Windows 
XP). Then i tried to update the packages copied from my old R version by



update.packages(ask=F)


However i get the following warning:

Warning: unable to access index for repository 
http://cran.ch.r-project.org/bin/windows/contrib/2.8
Warning: unable to access index for repository 
http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.8;


and the updating fails.



Check your internet connection and your proxy settings as well as the 
corresponding entries in the R for Windows FAQ.


Uwe Ligges


Is it a server problem of CRAN? 
Thanks for your help,


Henning
 








__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] installing mclust and flexmix on linux

2009-01-16 Thread Uwe Ligges



Tim F Liao wrote:

I've been trying to install some R packages such as mclust and flexmix on linux 
but have had the following error messages.


I've been trying to install mclust on my notebook which has linpus linux lite 
os and I have installed R as well as some packages all right.  However, when I 
tried to install mclust, it gave me the following messages.  Any suggestions?


Tim



 install.packages('mclust','/usr/lib/R/library',destdir='/usr/tmp')
trying URL 'http://cran.cnr.Berkeley.edu/src/contrib/mclust_3.1-10.tar.gz'
Content type 'application/x-gzip' length 231992 bytes (226 Kb)
opened URL
 ==
downloaded 226 Kb

* Installing *source* package 'mclust' ...
** libs
WARNING: R include directory is empty -- perhaps need to install R-devel.rpm or 
similar



So, have you installed the header files / include directory?

Uwe Ligges




/usr/lib/R/bin/SHLIB: line 162: make: command not found
ERROR: compilation failed for package 'mclust'
** Removing '/usr/lib/R/library/mclust'
Updating HTML index of packages in '.Library'
Warning message:
In install.packages("mclust", "/usr/lib/R/library", destdir = "/usr/tmp") :
 installation of package 'mclust' had non-zero exit status
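
The log above reports two separate problems: an empty R include directory and a missing make. A rough way to check both from within R, assuming nothing beyond base R (the variable names are made up for this sketch):

```r
# Is a 'make' program on the PATH? Sys.which() returns "" when it is not found.
has_make <- unname(nzchar(Sys.which("make")))

# Are the R header files present? R.h lives in the include directory.
has_headers <- file.exists(file.path(R.home("include"), "R.h"))

c(make = has_make, headers = has_headers)
```

If either value is FALSE, source packages with compiled code such as mclust will not build until the distribution's development packages (R headers, make, compilers) are installed.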


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Memory allocation

2009-01-16 Thread Gabriel Margarido
Hello everyone,

I have the following issue: one function generates a very big array (can be
more than 1 Gb) and returns a few variables, including this big one. Memory
allocation is OK while the function is running, but the final steps make
some copies that can be problematic. I looked for a way to return the values
without copying (even tried Rmemprof), but without success. Any ideas?
The code looks like this:

myfunc <- function() {
...
bigarray <- ...
...
final <- list(..., bigarray=bigarray, ...)
class(final) <- "myfunc"
final
}

Thank you in advance,
Gabriel.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] glmer documentation

2009-01-16 Thread Raphaelle

Hello,

I am fitting a glmer model with a Poisson family, and I was looking for documentation to
interpret the output correctly. I'm quite a beginner with this kind of
model.
I couldn't find anything in the lme4 package manual, nor on the internet.

Thank you,

Raphaelle
-- 
View this message in context: 
http://www.nabble.com/glmer-documentation-tp21506036p21506036.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] PHP and R

2009-01-16 Thread Applejus

Hi, 

I know I've already asked this question, but I am really getting trouble
getting a PHP document execute an R function on windows.

I would appreciate if someone could give me a simple example code where a
php calls an R function and passes to it arguments, specifying also how to
set up the paths etc... (should the .r file be in the www directory, and
what settings should be done?)


Thanks!

-- 
View this message in context: 
http://www.nabble.com/PHP-and-R-tp21507207p21507207.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Missing file to run Rcmd batch on Windows

2009-01-16 Thread Gabor Grothendieck
Regarding Perl, the batchfiles distribution batch files do not use Perl but
R's own Rcmd.exe does.  Based on comments recently I understand that Perl will
be eliminated from the R batch scripts soon but in the meantime if you install
Rtools (which is a set of tools that includes perl and is simple to install via
its GUI installer similar to R's) from

http://www.murdoch-sutherland.com/Rtools/

then Rcmd.bat will automatically detect it so you don't have to set
a path to Perl.

On Fri, Jan 16, 2009 at 11:00 AM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 Try writing BATCH in upper case.


 On Fri, Jan 16, 2009 at 10:51 AM, Brigid Mooney bkmoo...@gmail.com wrote:
 Hi,

 I'm trying to run an R script using Rcmd Batch from the command line on a
 Windows Vista machine.  I am using R version 2.8.1.

 I installed the batch files 4-3 found at
 http://cran.r-project.org/contrib/extra/batchfiles/ and added them to my
 path.
 I also had to install the latest version of perl (it's Strawberry perl if
 that makes a difference) and have added this to my path.

 Now when I run the command: Rcmd batch TestBatch.R TestOutput.txt from the
 command line, I get the error:

 Can't open perl script C:\Progra~1\R\R-28~1.0\bin\batch: No such file or
 directory

 Just for reference, TestBatch.R contains only one line: print("hello world")

 Does anyone have any idea on what this file is that I might be missing?  Or
 is there some other mistake I'm making in trying to run the a script from
 the command line.

 Thanks,
 -Brigid

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Memory allocation

2009-01-16 Thread Duncan Murdoch

On 1/16/2009 12:46 PM, Gabriel Margarido wrote:

Hello everyone,

I have the following issue: one function generates a very big array (can be
more than 1 Gb) and returns a few variables, including this big one. Memory
allocation is OK while the function is running, but the final steps make
some copies that can be problematic. I looked for a way to return the values
without copying (even tried Rmemprof), but without success. Any ideas?
The code looks like this:

myfunc <- function() {
...
bigarray <- ...
...
final <- list(..., bigarray=bigarray, ...)
class(final) <- "myfunc"
final
}

Thank you in advance,


I believe this will do less copying, but I haven't profiled it to be 
sure.  Replace the last three lines with this one statement:


structure(list(..., bigarray=bigarray, ...),
   class = "myfunc")

If that doesn't help, then you really need to determine where the 
copying is happening: you can use Rprofmem() to do that.
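
A sketch of both suggestions together, with a stand-in for the big array. make_result is a hypothetical name, the array size and the threshold are arbitrary, and Rprofmem() only records anything when R was built with memory profiling, hence the capabilities() guard.

```r
# Hypothetical function: build the result in one structure() call
make_result <- function(n) {
  bigarray <- numeric(n)   # stand-in for the real large array
  structure(list(n = n, bigarray = bigarray), class = "myfunc")
}

if (capabilities("profmem")) {
  log_file <- tempfile()
  Rprofmem(log_file, threshold = 1e6)  # log allocations larger than ~1 MB
  res <- make_result(1e6)
  Rprofmem(NULL)                       # stop logging
  allocations <- readLines(log_file)   # one line per large allocation, with call stack
} else {
  res <- make_result(1e6)
}

class(res)
```

Each logged line ends with the call stack at the time of the allocation, which is what shows where a copy of the array is being made.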



Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] bootstrap validation of LR error message

2009-01-16 Thread A Van Dyke

when i try to validate my logistic regression model:

fit <- glm(y~x,binomial,data=dataname,x=TRUE,y=TRUE)
validate(fit,method="boot",B=150,...)

i get the following error message:

Error in UseMethod("validate") : no applicable method for "validate"

any insight would be appreciated.  many thanks!
-- 
View this message in context: 
http://www.nabble.com/bootstrap-validation-of-LR-error-message-tp21507695p21507695.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Efficiency challenge: MANY subsets

2009-01-16 Thread Johannes Graumann
Thanks. Very elegant, but doesn't solve the problem of the outer for loop, 
since I now would rewrite the code like so:

fragments <- list()
for(iN in seq(length(sequences))){
  cat(paste(iN,"\n"))
  fragments[[iN]] <- 
lapply(indexes[[iN]], function(g)sequences[[iN]][do.call(seq, as.list(g))])
}

still very slow for length(sequences) ~ 7000.

Joh
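
One way to drop the explicit outer loop is Map(), which walks the two lists in parallel. This is a sketch on shortened, made-up data and makes no claim about how much time it saves at 7000 sequences; printing the counter with cat() on every iteration may itself account for part of the slowness.

```r
# Shortened made-up data in the same shape as in the thread
sequences <- list(
  c("M", "G", "L", "W", "I", "S", "F", "G"),
  c("T", "P", "P", "S", "Y")
)
indexes <- list(
  list(c(1, 3), c(3, 8), c(1, 8)),
  list(c(2, 4))
)

# For each sequence, extract every range listed in the matching entry of indexes
fragments <- Map(
  function(s, idx) lapply(idx, function(r) s[r[1]:r[2]]),
  sequences, indexes
)

fragments[[1]][[1]]
```

Using r[1]:r[2] instead of do.call(seq, as.list(g)) avoids building a call for every range.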

On Friday 16 January 2009 14:23:47 Henrique Dallazuanna wrote:
 Try this:

 lapply(indexes[[1]], function(g)sequences[[1]][do.call(seq, as.list(g))])

 On Fri, Jan 16, 2009 at 11:06 AM, Johannes Graumann 

 johannes_graum...@web.de wrote:
  Hello,
 
  I have a list of character vectors like this:
 
  sequences <- list(
   c("M","G","L","W","I","S","F","G","T","P","P","S","Y","T","Y","L","L","I","M",
     "N","H","K","L","L","L","I","N","N","N","N","L","T","E","V","H","T","Y","F",
     "N","I","N","I","N","I","D","K","M","Y","I","H","*")
  )
 
  and another list of subset ranges like this:
 
  indexes <- list(
   list(
 c(1,22),c(22,46),c(46, 51),c(1,46),c(22,51),c(1,51)
   )
  )
 
  What I now want to do is to subset each entry in sequences
  (sequences[[1]]) with all ranges in the corresponding low level list in
  indexes (indexes[[1]]). Here is what I came up with.
 
  fragments <- list()
  for(iN in seq(length(sequences))){
   cat(paste(iN,"\n"))
   tmpFragments <- sapply(
 indexes[[iN]],
 function(x){
   sequences[[iN]][seq.int(x[1],x[2])]
 }
   )
   fragments[[iN]] <- tmpFragments
  }
 
  This works fine, but sequences contains thousands of entries and the
  corresponding indexes are sometimes hundreds of ranges long, so this
  whole
  process is EXTREMELY inefficient.
 
  Does somebody out there take the challenge and show me a way on how to
  speed
  this up?
 
  Thanks for any hints,
 
  Joh
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] specifying model terms when using predict

2009-01-16 Thread VanHezewijk, Brian
I've recently encountered an issue when trying to use the predict.glm
function.

 

I've gotten into the habit of using the dataframe$variablename method of
specifying terms in my model statements.  I thought this unambiguous
notation would be acceptable in all situations but it seems models
written this way are not accepted by the predict function.  Perhaps
others have encountered this problem as well.

 

The code below illustrates the issue.

 

 

##
## linear model example

# this works
x <- 1:100
y <- 2*x

lm1 <- glm(y~x)
pred1 <- predict(lm1,newdata=data.frame(x=101:150))

## so does this
x <- 1:100
y <- 2*x
orig.df <- data.frame(x1=x,y1=y)

lm1 <- glm(y1~x1,data=orig.df)
pred1 <- predict(lm1,newdata=data.frame(x1=101:150))

## this does not run
x <- 1:100
y <- 2*x
orig.df <- data.frame(x1=x,y1=y)

lm1 <- glm(orig.df$y1~orig.df$x1,data=orig.df)
pred1 <- predict(lm1,newdata=data.frame(x1=101:150))

The final statement generates the following warning:

Warning message:
'newdata' had 50 rows but variable(s) found have 100 rows


Hope this is of some help.
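
A short sketch of why the third variant behaves this way: predict() looks up the model's term labels in newdata, and with the dataframe$variablename notation the term is literally called orig.df$x1, which newdata does not contain, so the original 100 values are found instead. The object names good and bad are made up for this illustration.

```r
orig.df <- data.frame(x1 = 1:100, y1 = 2 * (1:100))

# Terms named by column: predict() can find 'x1' in newdata
good <- glm(y1 ~ x1, data = orig.df)
attr(terms(good), "term.labels")   # "x1"
p_good <- predict(good, newdata = data.frame(x1 = 101:150))

# Terms named with $: the term label is the whole expression
bad <- glm(orig.df$y1 ~ orig.df$x1)
attr(terms(bad), "term.labels")    # "orig.df$x1"
p_bad <- suppressWarnings(predict(bad, newdata = data.frame(x1 = 101:150)))

length(p_good)  # one prediction per row of newdata
length(p_bad)   # one prediction per row of the original data
```

So the second model silently returns fitted values for the original rows rather than predictions for the new ones, which is why the warning should not be ignored.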

 

 

 

Brian Van Hezewijk 

 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bootstrap validation of LR error message

2009-01-16 Thread Marc Schwartz
on 01/16/2009 02:19 PM A Van Dyke wrote:
 when i try to validate my logistic regression model:
 
 fit <- glm(y~x,binomial,data=dataname,x=TRUE,y=TRUE)
 validate(fit,method="boot",B=150,...)
 
 i get the following error message:
 
 Error in UseMethod("validate") : no applicable method for "validate"
 
 any insight would be appreciated.  many thanks!


You appear to be trying to use the validate() function from Frank
Harrell's Design package.

However, you are attempting to use it with a glm() created from R's
stats package. The two don't mix.

If you want to use Frank's validate() function, you will need to create
your logistic regression model using Frank's lrm() function.

See ?lrm and ?validate.lrm for more information.
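
The error message is ordinary S3 dispatch at work, which can be reproduced without the Design package. In this sketch validate_demo is a hypothetical generic that, like the real validate(), has a method for class "lrm" but none for "glm"; the data and the fake lrm object are made up.

```r
# Hypothetical generic with a method for "lrm" objects only
validate_demo <- function(fit, ...) UseMethod("validate_demo")
validate_demo.lrm <- function(fit, ...) "validated"

# A glm fit on made-up data has class c("glm", "lm"), so no method applies
d <- data.frame(x = c(1, 2, 3, 4, 5, 6), y = c(0, 0, 1, 0, 1, 1))
fit_glm <- glm(y ~ x, family = binomial, data = d)
msg <- tryCatch(validate_demo(fit_glm), error = function(e) conditionMessage(e))
msg

# An object carrying class "lrm" is dispatched to the method
fake_lrm <- structure(list(), class = "lrm")
validate_demo(fake_lrm)
```

Refitting the model with lrm(), as suggested above, is what gives the object the class that validate() knows how to handle.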

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error when running Kendall Package

2009-01-16 Thread gqkou

I am new to R and am trying to run data through the Kendall package. 
My first question is that I have NA values for certain criteria; will that
be a problem or will they be ignored?

ie:      Fall    Spring  Summer
  1988   NA      1.321   1.564
  1999   1.333   1.452   NA

When I try to run the test, I get this error:  error: input must be ts
object
Thanks,

Kou 
-- 
View this message in context: 
http://www.nabble.com/Error-when-running-Kendall-Package-tp21507235p21507235.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] User input in batch mode

2009-01-16 Thread Sebastien Bihorel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lattice: how to have multiple wireframe nice intersection?

2009-01-16 Thread David Winsemius


On Jan 16, 2009, at 9:43 AM, Guillaume Chapron wrote:


Hello,

This code builds a simple example of 2 wireframes :

require(lattice)
x - c(1:10)
y - c(1:10)
g - expand.grid(x = 1:10, y = 1:10, gr = 1:2)
g$z - c(as.vector(outer(x,y,*)), rep(50,100))
wireframe(z ~ x * y, data = g, groups = gr, scales = list(arrows =  
FALSE))


However, the intersection between the wireframes is not properly  
drawn. Is there a way to fix this with lattice, or should I use   
another package more suitable for this?


Exactly what "not properly drawn" means is not stated. If it is the  
jagged intersection, then refining the grid would seem to be one way  
forward. Here's 100 x 100:


 require(lattice)
 x <- seq(1,10, len=100); y <- seq(1,10, len=100)
 g <- expand.grid(x = seq(1,10, len=100), y = seq(1,10, len=100), gr = 1:2)

 g$z <- c(as.vector(outer(x,y,"*")), rep(50,10000))
 wireframe(z ~ x * y, data = g, groups = gr, scales = list(arrows =  
FALSE))


You do get some Moiré effects, but the jagged intersection is no  
longer visible and the curvature is visible.


With 50*50 points it has a less obvious curvature to the intersection  
(but four times as fast).


require(lattice)
x <- seq(1,10, len=50); y <- seq(1,10, len=50)
g <- expand.grid(x = x, y = y, gr = 1:2)
g$z <- c(as.vector(outer(x,y,"*")), rep(50,length(x)*length(y)))
wireframe(z ~ x * y, data = g, groups = gr, scales = list(arrows =  
FALSE))


Or you could emphasize the curvature by drawing it in. I'm not the guy  
to do that, but there is an example of adding contours to a wireframe  
plot in Sarkar's book. Figure 13.7:


http://lmdvr.r-forge.r-project.org/figures/figures.html
--
David Winsemius



Thanks!

Guillaume


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Winsorizing Multiple Variables

2009-01-16 Thread Karl Healey

Hi All,

I want to take a matrix (or data frame) and winsorize each variable.  
So I can, for example, correlate the winsorized variables.


The code below will winsorize a single vector, but when applied to  
several vectors, each ends up sorted independently in ascending order  
so that a given observation is no longer on the same row for each  
vector.


So I need to winsorize the variable but then return it to its original  
order. Or another solution that will take a data frame, winsorize each  
variable, and return a new data frame with all the variables in the  
original order.


Thanks for any help!

-Karl


#The function I'm working from

win<-function(x,tr=.2,na.rm=F){

   if(na.rm)x<-x[!is.na(x)]
   y<-sort(x)
   n<-length(x)
   ibot<-floor(tr*n)+1
   itop<-length(x)-ibot+1
   xbot<-y[ibot]
   xtop<-y[itop]
   y<-ifelse(y<=xbot,xbot,y)
   y<-ifelse(y>=xtop,xtop,y)
   win<-y
   win
}

#Produces an example data frame, ss is the observation id, vars 1-4 are the variables I want to winsorize.

ss=c(1:5);var1=rnorm(5);var2=rnorm(5);var3=rnorm(5);var4=rnorm(5)
as.data.frame(cbind(ss,var1,var2,var3,var4))->data

data

#Winsorizes each variable, but sorts them independently so the  
observations no longer line up.


sapply(data,win)
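
The reordering comes from clamping the sorted copy y and returning it. A minimal variant, assuming the same cut-off rule as win() above, uses the sorted copy only to find the two limits and then clamps the original vector, so every observation stays on its own row. The name winsorize_keep_order and the example numbers are made up.

```r
# Hypothetical variant of win(): same limits, original order preserved
winsorize_keep_order <- function(x, tr = 0.2) {
  y <- sort(x)                 # sort() drops NAs; used only to find the limits
  n <- length(y)
  ibot <- floor(tr * n) + 1
  itop <- n - ibot + 1
  pmin(pmax(x, y[ibot]), y[itop])   # clamp x itself, not the sorted copy
}

x <- c(5, 1, 4, 2, 3, 100, -50, 6, 7, 8)
w <- winsorize_keep_order(x)
w

# Applied column by column, keeping the id column untouched
dat <- data.frame(ss = 1:10, var1 = x, var2 = rev(x))
dat_w <- dat
dat_w[-1] <- lapply(dat[-1], winsorize_keep_order)
```

Because each column is clamped in place, functions such as cor() can be applied to dat_w directly.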


___
M. Karl Healey
Ph.D. Student

Department of Psychology
University of Toronto
Sidney Smith Hall
100 St. George Street
Toronto, ON
M5S 3G3

k...@psych.utoronto.ca

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Smooth periodic splines

2009-01-16 Thread Spencer Graves
 1.  RSiteSearch('{periodic spline}') produced 12 hits.  I looked 
at the first five and found that four of them seemed relevant to your 
question. 

 2.  The third hit in this list notes that the DierckxSpline 
package has periodic splines, while 'fda' recommends finite Fourier 
series for periodic functions;  this hit documents an as.fd function 
that approximates a 'dierckx' periodic object with a periodic 'fd' object. 

 Hope this helps. 
 Spencer


Duncan Murdoch wrote:

cmr.p...@gmail.com wrote:

Hello group!

Is there a package that allows to fit smooth *periodic* splines to
data? I'm interested in a function which combines the functionality of
smooth.spline and splines::periodicSpline.
  


I don't know one, but you could use the same technique that 
periodicSpline uses:  repeat a copy of the data to the left and right 
of the main copy (similarly replicating the knots if you want 
regression splines rather than smoothing splines), then fit to the 
augmented dataset.  I don't think it is guaranteed to be exactly 
periodic, but it will be very close.


There is also the periodic option to splinefun and you might be able 
to use it to construct a true periodic basis, but you'll have to work 
out some tricky details to get that right.
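
The replication trick described above might be sketched like this, on simulated data with an assumed period of 2*pi. The fit is only approximately periodic, as noted, and the variable names are invented for the example.

```r
set.seed(42)
period <- 2 * pi

# Simulated noisy periodic data on one period (assumption for illustration)
x <- sort(runif(80, 0, period))
y <- sin(x) + rnorm(80, sd = 0.1)

# Repeat the data one period to the left and one to the right
x3 <- c(x - period, x, x + period)
y3 <- rep(y, 3)

# Fit a smoothing spline to the augmented data
fit <- smooth.spline(x3, y3)

# Evaluate on the central period only, wrapping any input into [0, period)
pred <- function(t) predict(fit, t %% period)$y

# The two ends of the central period should nearly agree
gap <- abs(predict(fit, 0)$y - predict(fit, period)$y)
gap
```

The smoothing parameter is chosen here by smooth.spline's default criterion on the tripled data, which is something to keep in mind when comparing with a fit on the original data alone.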


Duncan Murdoch


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] matching more than two vectors (?)

2009-01-16 Thread Juliane Struve
Dear listmembers,
 
I am trying to obtain values for pointdistance from another dataframe by 
matching UTMX and UTMY coordinates, but I am not sure how to introduce the 
second coordinate. 
 
PointDF$pointdistance=DistanceDF$distance[match(PointDF$UTMX,DistanceDF$UTMX  
PointDF$UTMY,DistanceDF$UTMY )]
 
is wrong but (hopefully) illustrates what I am trying to do. 
 
Could somebody help ?
 
Thank you very much.
 
 
Juliane 
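
One possible approach, sketched on made-up data frames with the column names from the question: build a single key from both coordinates and match on that. The helper name key and the "_" separator are arbitrary choices for this sketch.

```r
# Made-up stand-ins for the two data frames
DistanceDF <- data.frame(UTMX = c(100, 100, 200),
                         UTMY = c(10, 20, 10),
                         distance = c(1.5, 2.5, 3.5))
PointDF <- data.frame(UTMX = c(200, 100, 300),
                      UTMY = c(10, 20, 10))

# Combine both coordinates into one key per row
key <- function(d) paste(d$UTMX, d$UTMY, sep = "_")

# Look up each point's key among the keys of DistanceDF
PointDF$pointdistance <-
  DistanceDF$distance[match(key(PointDF), key(DistanceDF))]

PointDF
```

Points without a matching coordinate pair get NA. Pasting compares the printed form of the numbers, so coordinates that differ only by floating-point noise would need rounding first; merge(PointDF, DistanceDF, by = c("UTMX", "UTMY"), all.x = TRUE) is an alternative, though it reorders the rows.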




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Winsorizing Multiple Variables

2009-01-16 Thread David Winsemius
Might work better to determine top and bottom for each column with  
quantile() using an appropriate quantile option,  and then process  
each variable in place with your ifelse logic.


I did find a somewhat different definition of winsorization with no  
sorting in this code copied from a Patrick Burns posting from earlier  
this year on R-SIG-Finance;


function(x, winsorize=5) {
   s <- mad(x) * winsorize
   top <- median(x) + s
   bot <- median(x) - s
   x[x > top] <- top
   x[x < bot] <- bot
   x
}
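
The quantile() suggestion at the top of this reply could look roughly as follows. The 10% and 90% cut-offs, the function name winsorize_q and the example data are assumptions made for this sketch; quantile() has several type options that give slightly different limits.

```r
# Hypothetical helper: clamp a vector to its lower and upper quantiles, in place
winsorize_q <- function(x, probs = c(0.1, 0.9)) {
  lim <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, lim[1]), lim[2])
}

# Made-up data frame: ss is the observation id, the rest are variables
dat <- data.frame(ss = 1:11, var1 = 11:1, var2 = c(1:10, 1000))

# Winsorize every variable except the id; rows keep their original order
wins <- dat
wins[-1] <- lapply(dat[-1], winsorize_q)
wins
```

Since nothing is sorted, the winsorized columns can be correlated directly, for example with cor(wins[-1]).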

--
David Winsemius
On Jan 16, 2009, at 3:50 PM, Karl Healey wrote:


Hi All,

I want to take a matrix (or data frame) and winsorize each variable.  
So I can, for example, correlate the winsorized variables.


The code below will winsorize a single vector, but when applied to  
several vectors, each ends up sorted independently in ascending  
order so that a given observation is no longer on the same row for  
each vector.


So I need to winsorize the variable but then return it to its  
original order. Or another solution that will take a data frame,  
winsorize each variable, and return a new data frame with all the  
variables in the original order.


Thanks for any help!

-Karl


#The function I'm working from

win<-function(x,tr=.2,na.rm=F){

  if(na.rm)x<-x[!is.na(x)]
  y<-sort(x)
  n<-length(x)
  ibot<-floor(tr*n)+1
  itop<-length(x)-ibot+1
  xbot<-y[ibot]
  xtop<-y[itop]
  y<-ifelse(y<=xbot,xbot,y)
  y<-ifelse(y>=xtop,xtop,y)
  win<-y
  win
}

#Produces an example data frame, ss is the observation id, vars 1-4  
are the variables I want to winsorize.


ss=c(1:5);var1=rnorm(5);var2=rnorm(5);var3=rnorm(5);var4=rnorm(5)
as.data.frame(cbind(ss,var1,var2,var3,var4))->data

data

#Winsorizes each variable, but sorts them independently so the  
observations no longer line up.


sapply(data,win)


___
M. Karl Healey
Ph.D. Student

Department of Psychology
University of Toronto
Sidney Smith Hall
100 St. George Street
Toronto, ON
M5S 3G3

k...@psych.utoronto.ca





[R] Barchart in lattice package: controlling order of bars in plot and color of a selected bar

2009-01-16 Thread Matthew Pettis
Hi,

I'm using the lattice function 'barchart' to make a series of 4
histograms.  Currently, the y-axis values are graphed in order of the
y-axis variable.  I'd like to have the y-axis values sorted in
ascending order of the x-axis values so that the longest bar
horizontally is on top of the graph (in its section) and the shortest
bar is on the bottom.  I can do this in 'barplot' with normal
graphics; how can I do this in the lattice barchart function?  In
addition, I use the 'col' parameter in 'barplot' to make one
particular value be a different color than the rest.  Is this sort of
control available in lattice's barchart?  I didn't see it when reading
the documentation.

Thanks,
Matt

### Regular barplot code that works
# w02-08 are pre-sorted in descending Sum order,
# barcolor column has names of colors to use.

opar <- par(mfrow=c(1,4))
barplot(w02[,"Sum"], horiz=T, col=w02[,"barcolor"],
names.arg=w02[,"sts.dist"], main="2002")
barplot(w04[,"Sum"], horiz=T, col=w04[,"barcolor"],
names.arg=w04[,"sts.dist"], main="2004")
barplot(w06[,"Sum"], horiz=T, col=w06[,"barcolor"],
names.arg=w06[,"sts.dist"], main="2006")
barplot(w08[,"Sum"], horiz=T, col=w08[,"barcolor"],
names.arg=w08[,"sts.dist"], main="2008")

### Attempt with lattice
# work contains w02-08 stacked on top of each other

barchart(sts.dist[order(Sum),] ~ Sum | year, data=work, layout=c(4,1),
main="Ranking of best turnout by SD")
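
A possible direction, offered only as a sketch: lattice draws factor levels from bottom to top, so reordering the levels of sts.dist by Sum changes the bar order, and a custom panel function can hand a per-row color vector to panel.barchart. The data below are invented stand-ins for 'work', and reorder() here ranks by the mean of Sum over all years, so the order is shared by all panels rather than computed per panel.

```r
library(lattice)

# Toy stand-in for 'work' (invented values): one row per district and year
work <- data.frame(
  sts.dist = rep(c("SD01", "SD02", "SD03"), times = 2),
  year     = factor(rep(c("2002", "2004"), each = 3)),
  Sum      = c(120, 300, 210, 150, 280, 260),
  stringsAsFactors = FALSE
)
# Highlight one district, as the barcolor column does in the barplot version
work$barcolor <- ifelse(work$sts.dist == "SD02", "red", "grey70")

# reorder() sorts the factor levels by Sum (averaged over years), so the
# district with the largest mean Sum is drawn at the top of each panel
work$sts.dist <- reorder(factor(work$sts.dist), work$Sum)

p <- barchart(sts.dist ~ Sum | year, data = work, layout = c(2, 1),
              origin = 0,
              main = "Ranking of best turnout by SD",
              panel = function(x, y, subscripts, ...) {
                # subscripts gives the rows of 'work' shown in this panel
                panel.barchart(x, y, col = work$barcolor[subscripts], ...)
              })
print(p)
```

If the ranking has to differ from panel to panel, one option might be to build a separate plot per year and arrange them, since a single factor can only carry one level order.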



Re: [R] Efficiency challenge: MANY subsets

2009-01-16 Thread jim holtman
Try this one;  it is doing a list of 7000 in under 2 seconds:

sequences <- list(
  c("M","G","L","W","I","S","F","G","T","P","P","S","Y","T","Y","L","L","I",
    "M","N","H","K","L","L","L","I","N","N","N","N","L","T","E","V","H","T",
    "Y","F","N","I","N","I","N","I","D","K","M","Y","I","H","*")
)

indexes <- list(
  list(
    c(1,22),c(22,46),c(46, 51),c(1,46),c(22,51),c(1,51)
  )
)

indexes <- rep(indexes,10)
sequences <- rep(sequences,7000)

system.time({
fragments <- lapply(indexes, function(.seq){
    lapply(.seq, function(.range){
        .range <- seq(.range[1], .range[2])  # save since we use several times
        lapply(sequences, '[', .range)
    })
})
})
   user  system elapsed
   1.24    0.00    1.26
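
Note that the timing run above applies every range to every sequence. If instead each sequence should be cut only with its own set of ranges, as described in the original question, the two lists can be walked in parallel. A small sketch with invented toy data, not benchmarked on the full 7000-entry list:

```r
# Toy input in the same shape as the original post (values invented)
sequences <- list(c("M", "G", "L", "W", "I", "S"),
                  c("N", "H", "K", "L"))
indexes <- list(list(c(1, 3), c(3, 6), c(1, 6)),
                list(c(1, 2), c(2, 4)))

# Map() walks both lists in parallel: the i-th sequence is cut with the
# i-th set of ranges, and each range c(from, to) is expanded with seq.int()
fragments <- Map(function(s, ranges) {
  lapply(ranges, function(r) s[seq.int(r[1], r[2])])
}, sequences, indexes)

fragments[[2]][[2]]
```

The result is a list with one element per sequence, each holding one character vector per range, which is the same shape the for loop in the question builds up.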




On Fri, Jan 16, 2009 at 3:16 PM, Johannes Graumann
johannes_graum...@web.de wrote:
 Thanks. Very elegant, but doesn't solve the problem of the outer for loop,
 since I now would rewrite the code like so:

 fragments <- list()
 for(iN in seq(length(sequences))){
  cat(paste(iN,"\n"))
  fragments[[iN]] <-
   lapply(indexes[[1]], function(g)sequences[[1]][do.call(seq, as.list(g))])
 }

 still very slow for length(sequences) ~ 7000.

 Joh

 On Friday 16 January 2009 14:23:47 Henrique Dallazuanna wrote:
 Try this:

 lapply(indexes[[1]], function(g)sequences[[1]][do.call(seq, as.list(g))])

 On Fri, Jan 16, 2009 at 11:06 AM, Johannes Graumann 

 johannes_graum...@web.de wrote:
  Hello,
 
  I have a list of character vectors like this:
 
  sequences <- list(
   c("M","G","L","W","I","S","F","G","T","P","P","S","Y","T","Y","L","L","I",
     "M","N","H","K","L","L","L","I","N","N","N","N","L","T","E","V","H","T",
     "Y","F","N","I","N","I","N","I","D","K","M","Y","I","H","*")
  )
 
  and another list of subset ranges like this:
 
  indexes <- list(
   list(
     c(1,22),c(22,46),c(46, 51),c(1,46),c(22,51),c(1,51)
   )
  )
 
  What I now want to do is to subset each entry in sequences
  (sequences[[1]]) with all ranges in the corresponding low level list in
  indexes (indexes[[1]]). Here is what I came up with.
 
  fragments <- list()
  for(iN in seq(length(sequences))){
   cat(paste(iN,"\n"))
   tmpFragments <- sapply(
     indexes[[iN]],
     function(x){
       sequences[[iN]][seq.int(x[1],x[2])]
     }
   )
   fragments[[iN]] <- tmpFragments
  }
 
  This works fine, but sequences contains thousands of entries and the
  corresponding indexes are sometimes hundreds of ranges long, so this
  whole
  process is EXTREMELY inefficient.
 
  Does somebody out there take the challenge and show me a way on how to
  speed
  this up?
 
  Thanks for any hints,
 
  Joh
 







-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Winsorizing Multiple Variables

2009-01-16 Thread Michael Conklin
Don't sort y. Calculate xbot and xtop using
xtemp<-quantile(y,c(tr,1-tr),na.rm=na.rm)
xbot<-xtemp[1]
xtop<-xtemp[2]
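
One thing to keep in mind with this change: quantile() interpolates between observations by default, so its cutoffs are not necessarily the order statistics y[ibot] and y[itop] used in the original function. If the original cutoffs matter, a variant along the following lines might do; win_os is a made-up name and the test vector is invented. The sorted copy is used only to look up the cutoffs, and the clipping is applied to x itself.

```r
# Variant of win() that keeps the original order-statistic cutoffs.
# The sorted copy is used only to find the cutoffs; the clipping is
# applied to x itself, so the observations stay in their original order.
win_os <- function(x, tr = 0.2, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  y <- sort(x)
  n <- length(x)
  ibot <- floor(tr * n) + 1
  itop <- n - ibot + 1
  pmin(pmax(x, y[ibot]), y[itop])
}

x <- c(10, -50, 3, 7, 100, 5, 4, 6, 8, 9)
win_os(x)                  # clipped at the 3rd and 8th order statistics
quantile(x, c(0.2, 0.8))   # interpolated cutoffs, which differ here
```

With either version, sapply(data, win_os) returns the columns with each observation still on its original row.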

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Karl Healey
Sent: Friday, January 16, 2009 2:51 PM
To: r-help@r-project.org
Subject: [R] Winsorizing Multiple Variables

Hi All,

I want to take a matrix (or data frame) and winsorize each variable.
So I can, for example, correlate the winsorized variables.

The code below will winsorize a single vector, but when applied to
several vectors, each ends up sorted independently in ascending order
so that a given observation is no longer on the same row for each
vector.

So I need to winsorize the variable but then return it to its original
order. Or another solution that will take a data frame, winsorize each
variable, and return a new data frame with all the variables in the
original order.

Thanks for any help!

-Karl


#The function I'm working from

win<-function(x,tr=.2,na.rm=F){

if(na.rm)x<-x[!is.na(x)]
y<-sort(x)
n<-length(x)
ibot<-floor(tr*n)+1
itop<-length(x)-ibot+1
xbot<-y[ibot]
xtop<-y[itop]
y<-ifelse(y<=xbot,xbot,y)
y<-ifelse(y>=xtop,xtop,y)
win<-y
win
}

#Produces an example data frame, ss is the observation id, vars 1-4
are the variables I want to winsorize.

ss=c(1:5);var1=rnorm(5);var2=rnorm(5);var3=rnorm(5);var4=rnorm(5)
as.data.frame(cbind(ss,var1,var2,var3,var4))->data
data

#Winsorizes each variable, but sorts them independently so the
observations no longer line up.

sapply(data,win)


___
M. Karl Healey
Ph.D. Student

Department of Psychology
University of Toronto
Sidney Smith Hall
100 St. George Street
Toronto, ON
M5S 3G3

k...@psych.utoronto.ca



