Re: [R] (no subject)

2012-03-27 Thread James Muller
AJ:  This is something to learn a lesson from. A question, starved of
preparation, can't help anybody to help you.

James
On Mar 25, 2012 9:00 PM, Rolf Turner rolf.tur...@xtra.co.nz wrote:

 On 26/03/12 00:18, Anjana Thampi wrote:

 How do you decompose inequality in R, say by gender?


 This has to be one of the most meaningless and ill-expressed
 questions I've ever seen on this list.  And that's a high hurdle
 to clear.

cheers,

Rolf Turner


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help in replacing for llop

2012-03-27 Thread arunkumar1111
Hi 

I have records like this:

X1  X2  State
34  72  state1
9   63  state1
49  31  state1
60  34  state1
80  73  state1
60  20  state2
59  87  state2
88  20  state2
71  66  state2
65  56  state2
59  16  state1
60  100 state2


I want to get summary values such as the mean and median, and a histogram, for
X1 and X2 by state. I'm using a for loop for this. Is there a way to remove the
for loop and use apply or some alternative?
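
One loop-free way to get the per-state summaries is sketched below. It assumes the records sit in a data frame (called df here, a name that is not in the original post) and uses base R's aggregate() and split(); it is an illustration, not the only option.

```r
# Records from the post, held in a data frame named df (the name is an assumption)
df <- data.frame(
  X1 = c(34, 9, 49, 60, 80, 60, 59, 88, 71, 65, 59, 60),
  X2 = c(72, 63, 31, 34, 73, 20, 87, 20, 66, 56, 16, 100),
  State = c(rep("state1", 5), rep("state2", 5), "state1", "state2")
)

# Mean and median of X1 and X2 within each state, no explicit loop
state_means   <- aggregate(cbind(X1, X2) ~ State, data = df, FUN = mean)
state_medians <- aggregate(cbind(X1, X2) ~ State, data = df, FUN = median)

# Histogram objects for X1 per state, computed without drawing anything
x1_hist <- lapply(split(df$X1, df$State), function(v) hist(v, plot = FALSE))
```

Calling plot() on an element of x1_hist draws that state's histogram; tapply() or by() would work along the same lines.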


-
Thanks in Advance
Arun
--
View this message in context: 
http://r.789695.n4.nabble.com/help-in-replacing-for-llop-tp4507939p4507939.html
Sent from the R help mailing list archive at Nabble.com.



[R] Plot of function seems to cut off near edge of domain

2012-03-27 Thread chad.mills
Hello helpful R folks,
I am simply trying to graph a quarter circle centered at the origin in the
first quadrant.  When I set the xlim of the plot to the radius of the
circle, the plot appears correct.  However, I'd like to see a slight
extension of the axes beyond the domain of the function itself.  When I do
this, a portion of the plot seems to be missing by the edge of the domain. 
Here is the code for both of the plots:

dev.off() 
plot.new()
#Set up two-figure plot
par(mfrow=c(1,2),pty='s')
g <- function(x){sqrt(2500-x^2)}
#Figure 1, with xlim at the radius of the circle
plot(g,axes=F,xlim=c(0,50),ylim=c(0,50))
axis(1,pos=0)
axis(2,pos=0)
#Figure 2, with xlim beyond the radius of the circle
plot(g,axes=F,xlim=c(0,60),ylim=c(0,60))
axis(1,pos=0)
axis(2,pos=0)

Notice that the second graph doesn't appear to intersect the x-axis, while
the first one does.  Any ideas why that might be the case?  Here's an image
of what I see in case that's useful:

http://r.789695.n4.nabble.com/file/n4507954/Cut_off_Quarter_Circle.png 

Thanks in advance for the help!

-Chad Mills

--
View this message in context: 
http://r.789695.n4.nabble.com/Plot-of-function-seems-to-cut-off-near-edge-of-domain-tp4507954p4507954.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] y needing more than 2 functions

2012-03-27 Thread Rolf Turner

Another approach:

foo <- function(t){
   bm <- ceiling(max(t)/15)                  # number of 15-unit intervals needed
   s <- cut(t,breaks=15*(0:bm),labels=1:bm)  # interval index of each t
   s <- as.numeric(levels(s)[s])
   t^(s+1)
}

This idea can be generalised.
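
As one possible generalisation (a sketch only; the function name and the breaks and powers arguments are made up for illustration), findInterval() can look up the interval for each t and pick the matching exponent. Note that the intervals here are closed on the left, which differs from the default of cut().

```r
piecewise_power <- function(t, breaks = c(0, 15, 30, 45), powers = c(2, 3, 4)) {
  # Interval index: 1 for [0, 15), 2 for [15, 30), 3 for [30, 45]
  idx <- findInterval(t, breaks, rightmost.closed = TRUE)
  # Values outside the outer breaks get NA rather than a guessed exponent
  idx[idx < 1 | idx > length(powers)] <- NA
  t^powers[idx]
}

piecewise_power(c(2, 20, 40))
```

Adding a further interval then only means extending breaks and powers by one element each.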

cheers,

Rolf Turner

On 27/03/12 15:31, R. Michael Weylandt wrote:

One way is to simply nest your ifelse()s:

y <- ifelse(t < 15, t^2, ifelse(t < 30, t^3, t^4))

Michael

On Mon, Mar 26, 2012 at 7:48 PM, Aimee Jones
al...@hoyamail.georgetown.edu  wrote:

Dear all,

I'm aware that if y has two separate functions (depending on the conditions
of x) you can use the ifelse function to separate y into two separate
functions depending on input. How do you do this if there are multiple
different conditions for x?

for example,

y fits the following between t>0 & t<15 <- function(t) t^2, y fits
the following between t>15 & t<30 <- function(t) t^3, y fits the
following between t>30 & t<45 <- function(t) t^4 etc


Thanks for any help you are able to give,
yours sincerely,
Aimee



Re: [R] y needing more than 2 functions

2012-03-27 Thread Petr Savicky
On Mon, Mar 26, 2012 at 07:48:07PM -0400, Aimee Jones wrote:
 Dear all,
 
 I'm aware that if y has two separate functions (depending on the conditions
 of x) you can use the ifelse function to separate y into two separate
 functions depending on input. How do you do this if there are multiple
 different conditions for x?
 
 for example,
 
 y fits the following between t>0 & t<15 <- function(t) t^2, y fits
 the following between t>15 & t<30 <- function(t) t^3, y fits the
 following between t>30 & t<45 <- function(t) t^4 etc

Hi.

Try the following.

  bounds <- c(0, 15, 30, 45)
  x <- seq(4, 44, length=51)
  valfunc <- cbind(x^2, x^3, x^4)
  indfunc <- findInterval(x, bounds)
  y <- valfunc[cbind(1:nrow(valfunc), indfunc)]

  # verify
  range(x[y == x^2]) # [1]  4.0 14.4
  range(x[y == x^3]) # [1] 15.2 29.6
  range(x[y == x^4]) # [1] 30.4 44.0

Hope this helps.

Petr Savicky.



Re: [R] circles()

2012-03-27 Thread Jim Lemon

On 2012-03-26 05:09, John D. Muccigrosso wrote:

I cannot for the life of me figure this out:

What's the parameter to fill in with color circles made with
circles()? col changes the line color, but all I see in the help is a
reference to additional graphic parameters, and no examples via google.


Hi John,
Perhaps the draw.circle function (plotrix) will do what you want.
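
If the circles in question come from base graphics' symbols() with its circles argument (an assumption; the original post does not say which function is meant), the fill colour is set with bg, while fg sets the outline. A minimal sketch with made-up coordinates, drawn to a temporary PDF so that it runs without a display:

```r
out <- tempfile(fileext = ".pdf")
pdf(out)
# Three circles with radii in user units; bg fills them, fg colours the outline
symbols(x = c(1, 2, 3), y = c(1, 2, 1), circles = c(0.3, 0.5, 0.4),
        inches = FALSE, bg = "lightblue", fg = "darkblue")
dev.off()
```

With draw.circle() from plotrix, the fill should be controlled by col and the outline by border.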

Jim



Re: [R] Export Created Variables to SPSS/.csv

2012-03-27 Thread Michael Bibo
Strassburger, Daniel Daniel_Strassburger at baylor.edu writes:

 
 I haven't been successful in converting my colleagues to the world of R yet
they wish to share collected data
 so that they may analyze it in SPSS.
 
 I know how to write to an SPSS file and it opens fine, but my problem is that
it only includes the existing data -
 none of the variables I created within R.

Daniel,

 Apologies if this is too obvious, but have you appended the new variables that
you have created in R to the original dataset before you export in SPSS or .csv
format?  This doesn't happen automatically in R.

Something like:

all.data <- data.frame(imported.data, created.variable1, created.variable2)
# add any further created variables as extra arguments
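
A self-contained sketch of that idea, with small stand-in objects (the data and the two created variables are invented for illustration) and a CSV export via write.csv():

```r
# Stand-ins for the imported data and the variables created in R (all hypothetical)
imported.data <- data.frame(id = 1:3, score = c(10, 20, 30))
created.variable1 <- imported.data$score / 10
created.variable2 <- imported.data$score > 15

# Append the new variables as extra columns, then export everything together
all.data <- data.frame(imported.data, created.variable1, created.variable2)
csv.file <- tempfile(fileext = ".csv")
write.csv(all.data, csv.file, row.names = FALSE)

# Reading the file back shows that the created variables were written out
names(read.csv(csv.file))
```

The same all.data object can then be passed to whatever SPSS export was already in use.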



Michael Bibo
Queensland Health



[R] How to test for the difference of means in population, please help

2012-03-27 Thread ali_protocol
Dear all,

Novice in statistics.

I have 2 experimental conditions. Each condition has ~400 points as its
response. Each condition is done in 4 repeats (so I have 2 x 400 x 4
points). 

I want to compare the means of the two conditions and test whether they are
the same or not. Which test should I use?

#populations
c = matrix (sample (1:20,1600, replace= TRUE), 400 ,4)
b = matrix (sample (1:20,1600, replace= TRUE), 400 ,4)

#means of repeats
c.mean= apply (c,2, mean)
b.mean= apply (b,2,mean)

#mean of experiment
c.mean.all= mean (c)
b.mean.all= mean (b)
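
One common starting point, offered as a sketch rather than a recommendation (whether it suits the design depends on how the repeats were obtained), is a Welch two-sample t-test on the per-repeat means, so that each repeat counts as one observation. The object names cond.c and cond.b are made up to avoid masking the base function c().

```r
set.seed(1)  # only to make the simulated data reproducible
cond.c <- matrix(sample(1:20, 1600, replace = TRUE), 400, 4)
cond.b <- matrix(sample(1:20, 1600, replace = TRUE), 400, 4)

# Treat each repeat (column) as one observation of its condition
c.mean <- colMeans(cond.c)
b.mean <- colMeans(cond.b)

# Welch two-sample t-test on the four repeat means per condition
res <- t.test(c.mean, b.mean)
res$p.value
```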

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-test-for-the-difference-of-means-in-population-please-help-tp4508089p4508089.html
Sent from the R help mailing list archive at Nabble.com.



[R] Testing for difference of distribution of two population(newbie)

2012-03-27 Thread ali_protocol
Dear all,

I am a novice in statistics.
I have two matrices (results from my experiments), each having ~400 points. I
want to test whether the points come from the same distribution or not.
Further, I have the results for each matrix (each experiment) in 4
replicates. What should I do?

# Repeats of experiments are in different columns
exp.1 = matrix (sample (1:20,1600, replace= TRUE), 400 ,4)
exp.2 = matrix (sample (1:20,1600, replace= TRUE), 400 ,4)


Thanks in advance

--
View this message in context: 
http://r.789695.n4.nabble.com/Testing-for-difference-of-distribution-of-two-population-newbie-tp4508092p4508092.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Plot of function seems to cut off near edge of domain

2012-03-27 Thread Matthieu Dubois
Dear Chad,

your problem is linked to (1) the function returning NaNs for x values 
greater than 50, and (2) the fact that the function is evaluated at a 
predefined number of points.

Calling plot on a function object is basically a wrapper for curve(). Your 
function g() is evaluated over the whole xlim domain, which returns NaN 
values for x > 50 (try g(60)). In addition, curve() splits the x interval (here 
from 0 to 60) into a predefined number of points (n=101 is the default, see 
help(curve)) at which the function is evaluated. In your code, the function is 
evaluated at values x <- seq(0, 60, length=101), and the g(x) that are not NaN 
are plotted. The largest x value (from the sequence) that doesn't return a NaN 
is max(x[!is.nan(g(x))]), which is 49.8.
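
The grid effect described above can be checked directly. This small sketch reuses g() from the original post and evaluates it on the same 101-point grid:

```r
g <- function(x) sqrt(2500 - x^2)

# The grid that curve() would use for xlim = c(0, 60) with the default n = 101
x <- seq(0, 60, length.out = 101)
y <- suppressWarnings(g(x))   # sqrt() of a negative number warns and gives NaN

# Last grid point at which g() is still defined
last_ok <- max(x[!is.nan(y)])
last_ok
```

Because the grid step is 0.6, the last usable point falls short of x = 50, which is why the curve stops before reaching the x-axis.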

One way to solve it is to explicitly specify the domain used to estimate the 
function, by using the from and to arguments that are passed to curve():

#Figure 2, with xlim beyond the radius of the circle 
plot(g,axes=F,from=0, to =50, xlim=c(0, 60), ylim=c(0,60)) 
axis(1,pos=0) 
axis(2,pos=0)

HTH

Matthieu

Matthieu Dubois 
Post-doctoral researcher
Psychology Department
Université Libre de Bruxelles



Re: [R] row, col function but for a list (probably very easy question, cannot seem to find it though)

2012-03-27 Thread peter dalgaard

On Mar 26, 2012, at 17:33 , David Winsemius wrote:

 The usual approach to that problem is to use sapply:
 
 x <- list()
 x <- sapply(1:10, function(z) x[[z]] <- 1:z )

Yikes!

If that works, it is only by coincidence. (The pre-assignment to x only 
serves the purpose of allowing the [[<- assignment inside the anonymous 
function, but the assignment is to a local copy which is deleted on exit, and 
the return value is the rhs of the assignment.) 

Please:

x <- lapply(1:10, function(z) 1:z)

or even

x <- lapply(1:10, seq_len)
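
A quick check, as a sketch, that the two lapply() forms agree, plus one reason to prefer lapply() when a list is wanted: sapply() simplifies its result whenever it can.

```r
a <- lapply(1:10, function(z) 1:z)
b <- lapply(1:10, seq_len)
same <- identical(a, b)

# sapply() may simplify: equal-length results collapse into a matrix
m <- sapply(c(3, 3), seq_len)
is.matrix(m)
```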

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] normalization of multi-value string variable

2012-03-27 Thread Alekseiy Beloshitskiy
Thank you so much, Jessica,

The specific of my case is that I have a very detailed variable 'Interests' 
which may have several thousands of possible values. Usually each customer has 
3-10 different interests. For example:
customer_id|...|interests
1001   |...| cycling, swimming, cooking
1002   |...| cooking, singing, dancing

The total number of possible distinct values is several thousand. I'm curious 
how to use these interests in an SVM (represent them as a vector of real 
numbers with several thousand elements?).

If you have any ideas please let me know.


Thank you,
-Alex


From: Jessica Streicher [j.streic...@micromata.de]
Sent: 27 March 2012 11:18
To: Alekseiy Beloshitskiy
Subject: Re: [R] normalization of multi-value string variable

Well, not sure what you mean with scaling and normalizing strings, but if you 
want to represent the interests as numbers, you can do something like this:

n <- seq(1,length(unique(my_strings)))[factor(my_strings)]


Am 26.03.2012 um 18:50 schrieb Alekseiy Beloshitskiy:

Hi All,

I need to normalize/scale string variable which represents interests of 
customers (e.g., 'cycling, rollerblading, swimming' etc).

Does anybody know how to do this? I then want to use it along with other 
numeric variables for SVM classification.

I would appreciate any advice.

-Alex



[R] SVM. How to use categorical attributes?

2012-03-27 Thread Alekseiy Beloshitskiy
Hi All,

Here is the case. I want to build a classification model (SVM). Some of the 
variables for this model are categorical attributes which represent words 
(usually 3-10 words, a query for a search in Google). For example:

search_id | query_words                | .. | result
----------+----------------------------+----+-------
1         | how,to,grow,tree           | .. | 4
2         | smartfone,htc,buy,price    | .. | 7
3         | buy,house,realty,london    | .. | 6
4         | where,to,go,weekend,cinema | .. | 4
...
As you can see, words in the query are unordered and may occur in different 
queries. The total number of unique words across all queries is several thousand.
The question is how to represent this variable (query_words) for use in an SVM.
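
One common representation, sketched here in base R and not taken from the thread, is a binary "bag of words" matrix with one row per query and one 0/1 column per distinct word. The object names are made up; the queries are the four from the table above.

```r
queries <- c("how,to,grow,tree",
             "smartfone,htc,buy,price",
             "buy,house,realty,london",
             "where,to,go,weekend,cinema")

# Split each query into its words and collect the vocabulary
tokens <- strsplit(queries, ",", fixed = TRUE)
vocab  <- sort(unique(unlist(tokens)))

# One row per query, one column per word: 1 if the word occurs, else 0
dtm <- t(vapply(tokens, function(w) as.numeric(vocab %in% w),
                numeric(length(vocab))))
colnames(dtm) <- vocab
dim(dtm)
```

The columns of dtm can then be bound to the other numeric predictors. With several thousand words a dense matrix gets large, so a sparse matrix format may be worth considering.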

Thank you for any advice!

Alex



Re: [R] Completely Off Topic:Link to IOM report on use of -omics tests in clinical trials

2012-03-27 Thread Mike Marchywka





Thanks, I had totally missed this controversy, but from a quick read of the 
summary the impact on open source analysis was unclear. Can you explain the 
punchline? I think many users of R have concluded the biggest problem in most 
analyses is first getting the data and then verifying any results you derive, 
both issues that sound related to your post.
(The quoted message below originally came through as a jumble of character 
codes, which is illustrative of what hotmail has been doing with plain text; 
getting plain data without all the formatting junk is a recurring problem, LOL.)

> Date: Mon, 26 Mar 2012 22:38:56 +0100
> From: iaingallagher@btopenworld.com
> To: gunter.berton@gene.com; r-help@r-project.org
> Subject: Re: [R] Completely Off Topic:Link to IOM report on use of "-omics" tests in clinical trials
>
> I followed this case while it was ongoing.
>
> It was a very interesting example of basic mistakes but also (for me) of
> journal politicking.
>
> Keith Baggerly and Kevin Coombes wrote a great paper - "DERIVING
> CHEMOSENSITIVITY FROM CELL LINES: FORENSIC BIOINFORMATICS AND REPRODUCIBLE
> RESEARCH IN HIGH-THROUGHPUT BIOLOGY" in The Annals of Applied Statistics
> (2009, Vol. 3, No. 4, 1309–1334) which explains some of the background and
> investigative work they had to do to bring those mistakes to light.
>
> Best
>
> iain
>
> - Original Message -
> From: Bert Gunter <gunter.berton@gene.com>
> To: r-help@r-project.org
> Cc:
> Sent: Monday, 26 March 2012, 19:12
> Subject: [R] Completely Off Topic:Link to IOM report on use of "-omics" tests in clinical trials
>
> Warning: This has little directly to do with R, although R and related
> tools (e.g. sweave and other reproducible research tools) have a
> natural role to play.
>
> The IOM report:
>
> http://www.iom.edu/Reports/2012/Evolution-of-Translational-Omics.aspx
>
> that arose out of the Duke Univ. genomics testing scandal has been
> released. My thanks to Keith Baggerly for forwarding this. I believe
> that many R users in the medical research community will find this
> interesting, and I hope I do not venture too far out of line by
> passing on the link to readers of this list. It **will** have an
> important impact on so-called Personalized Health Care (which I guess
> affects all of us), and open source analytical (statistical)
> methodology is a central issue.
>
> For those interested, try the summary first.
>
> Best to all,
> Bert
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] How to enable Arial font for postcript/pdf figure on Windows?

2012-03-27 Thread antagomir
Hi Agnes and Camille (and help-list),

In Ubuntu 11.10 I needed to use su permissions to copy and gzip the *.afm
files manually into /usr/lib/R/library/grDevices/afm/ to get the Arial
embedding to work in R for postscript. 

I.e., after following the instructions by Agnes and Camille, I did 
  sudo cp arial*.afm /usr/lib/R/library/grDevices/afm/
  sudo gzip /usr/lib/R/library/grDevices/afm/arial*.afm

Then the postscript toy example in this thread worked.

Leo

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-enable-Arial-font-for-postcript-pdf-figure-on-Windows-tp3017809p4508266.html
Sent from the R help mailing list archive at Nabble.com.



[R] Data indexing issue...

2012-03-27 Thread HJ YAN
Dear R-help,

My dataset (which is a data frame, called 'Calender' here) includes 365
rows representing the 365 days of a year. One column ('Season') contains
factor data representing seasons, e.g. spring, summer, autumn and winter.
Another column (called 'Day') contains data representing whether the day is
a working day (I use 'Wd' for short here) or weekend (I use 'Wkend' for
short here).


I want to separate the indices of the working days and weekends for each
season. I have used the R command which() before for one criterion, for
example, if I use...


WdIndex <- which(Calender$Day=='Wd')

that gives the set of indices of working days in the year.

I wonder whether in R I could use a combination of something such as 'AND',
'OR' (e.g. as in MySQL) to set multiple criteria when selecting data. So for
example...

WinterWdIndex <- which(Calender$Day=='Wd' AND Calender$Season==Winter)


I know the above syntax is wrong, and I checked '?which' which did not give
me an answer and also tried '?AND' but it seems it doesn't exist at all...


Many thanks!
HJ



Re: [R] Exporting a data.frame to excel using sqlSave - adds a character ' to values

2012-03-27 Thread Juliette Fabre
Hello Tal, 

I have the same problem with the ' added to all my cells when exported into
Excel.

I can drop them manually but only one by one (Find & Replace does not
work) ... So in the end the exported Excel file cannot actually be used by
scientists to draw graphs or whatever! 

Did you find a solution to this problem ?

Thanks, 

Juliette


--
View this message in context: 
http://r.789695.n4.nabble.com/Exporting-a-data-frame-to-excel-using-sqlSave-adds-a-character-to-values-tp1016523p4508239.html
Sent from the R help mailing list archive at Nabble.com.



[R] detecting time out on download.file command

2012-03-27 Thread Hugh Shanahan
Hi,
   I'm working with a legacy R script which makes use of the
download.file command. We're having a problem that occasionally we get a
time out from a particular FTP site but the function that does this
doesn't pass that information back to the main function that calls it.
I'm aware that it is possible to set a timeout using the options command
but I don't know how to check if a timeout has been executed. If I put
the command into a try block could I get the information there ?
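
A sketch of what such a wrapper could look like, using tryCatch() rather than try() so that the condition message is returned to the caller. The helper name safe_download is made up, and whether a timeout surfaces as a warning or as an error may depend on the download method, so both are caught here:

```r
# Hypothetical helper: wraps download.file() and reports failure instead of stopping
safe_download <- function(url, destfile, timeout = 60) {
  old <- options(timeout = timeout)   # seconds; the old value is restored on exit
  on.exit(options(old))
  tryCatch({
    status <- download.file(url, destfile, quiet = TRUE)
    list(ok = identical(as.integer(status), 0L), message = NA_character_)
  },
  warning = function(w) list(ok = FALSE, message = conditionMessage(w)),
  error   = function(e) list(ok = FALSE, message = conditionMessage(e)))
}

# Usage with a local file:// URL so that the sketch runs without network access
src <- tempfile(fileext = ".txt")
writeLines("hello", src)
res <- safe_download(paste0("file://", normalizePath(src)), tempfile())
res$ok
```

The calling function can then branch on res$ok and inspect res$message to decide whether the failure looks like a timeout.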

All the best,
Hugh

-- 

Hugh Shanahan   Department of Computer Science 
Lecturer in Bioinformatics  Room 246 McCrea Building
E-mail : hugh.shana...@rhul.ac.uk   Royal Holloway, 
Web : http://www.shanahanlab.orgUniversity of London
Tel : +44 (0)1784 443433Egham, Surrey TW20 0EX
Fax : +44 (0)1784 439786England, U.K.

PGP Key  http://www.cs.rhul.ac.uk/~hugh/PGP/public_key.asc



[R] Discretization Package MDLP

2012-03-27 Thread Khaled_taalab
Dear All,

I have a dataset of eight variables with 156 records which I wish to
discretize using the MDLP algorithm. My issue is that I want to dictate the
number of bins the algorithm splits the data into (around 5), rather than
just allowing the algorithm to dictate this using the mdlp(data) command. 

Any help would be greatly appreciated. 

Kind Regards,

Khaled Taalab

--
View this message in context: 
http://r.789695.n4.nabble.com/Discretization-Package-MDLP-tp4508501p4508501.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data indexing issue...

2012-03-27 Thread Ivan Calandra

Hi HJ,

Take a look at ?"&"; this is probably what you're looking for.

What you could also do is:
Calender[Calender$Day=='Wd' & Calender$Season=='Winter', ]  # notice the last comma


This will subset directly without using which(); it might be helpful to you.
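
A small self-contained sketch of both forms, on a made-up six-row stand-in for the 'Calender' data frame:

```r
# Small made-up stand-in for the 'Calender' data frame described in the thread
Calender <- data.frame(
  Day    = c("Wd", "Wkend", "Wd", "Wd", "Wkend", "Wd"),
  Season = c("Winter", "Winter", "Spring", "Winter", "Spring", "Summer")
)

# Row indices meeting both conditions (& is the element-wise AND, | the OR)
WinterWdIndex <- which(Calender$Day == "Wd" & Calender$Season == "Winter")

# The same rows selected directly with a logical index
WinterWd <- Calender[Calender$Day == "Wd" & Calender$Season == "Winter", ]
```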

HTH,
Ivan

--
Ivan CALANDRA
Université de Bourgogne
UMR CNRS/uB 6282 Biogéosciences
6 Boulevard Gabriel
21000 Dijon, FRANCE
+33(0)3.80.39.63.06
ivan.calan...@u-bourgogne.fr
http://biogeosciences.u-bourgogne.fr/calandra




Re: [R] Data indexing issue...

2012-03-27 Thread jim holtman
Why not use 'split' and get all the groups at once:

result <- split(Calender, list(Calender$Day, Calender$Season), drop = TRUE)
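
A runnable sketch of the split() approach on a made-up six-row stand-in for the data frame (the Date column is invented for illustration). Splitting the row numbers instead of the rows gives the indices that were asked for:

```r
Calender <- data.frame(
  Date   = 1:6,
  Day    = c("Wd", "Wkend", "Wd", "Wd", "Wkend", "Wd"),
  Season = c("Winter", "Winter", "Spring", "Winter", "Spring", "Summer")
)

# One data frame per Day/Season combination; drop = TRUE omits empty combinations
groups <- split(Calender, list(Calender$Day, Calender$Season), drop = TRUE)
names(groups)

# Row indices per group rather than the rows themselves
index_by_group <- split(seq_len(nrow(Calender)),
                        list(Calender$Day, Calender$Season), drop = TRUE)
index_by_group[["Wd.Winter"]]
```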




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] Exporting a data.frame to excel using sqlSave - adds a character ' to values

2012-03-27 Thread jim holtman
I don't see any problem here; there is no data and no indication as to
the actual problem you are having that calls for a Find & Replace.  I
export to Excel all the time and don't have any problems.  So provide
some data and an indication of the problem.




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



[R] Standard error terms from gfcure

2012-03-27 Thread Bonnett, Laura
Dear R-help,

I am using R 2.14.1 on Windows 7 with the 'gfcure' package (cure rate model).
I have included the treatment variable in the cure part of the model as shown 
below:


> ref_treat <- gfcure(Surv(rem.Remtime,rem.Rcens)~1,~1+strata(drpa)+factor(treat(delcure)),data=delcure,dist="loglogistic")

From that I can obtain the coefficients, standard errors etc as per 
alternative models (with covariates only fitted to the survival part of the 
model say).

> summary(ref_treat)

However, only one standard error is output:

Log-logistic mixture model

The maximum loglikelihood is -927.0449

Terms in the accelerated failure time model:
            Coefficients  Std.err   z-score  p-value
Log(scale)     -0.894528   0.0236  -37.8324    0.000
(Intercept)     6.929351   0.0151  460.4157    0.000

Terms in the logistic model:
                        Coefficients   Std.err  z-score    p-value
(Intercept)                 2.542726
strata(drpa)drpa=2         18.76
factor(treat(delcure))2     0.184192
factor(treat(delcure))3     0.472809
factor(treat(delcure))4     0.255565  953.6876   0.0003  0.9997862
factor(treat(delcure))5     0.401713
Warning message:
In sqrt(diag(solve(object$infomat))) : NaNs produced


Can anyone explain why this is the case?

Very many thanks,
Laura
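
One possible reading of the warning in the output above, sketched in base R:
sqrt(diag(solve(infomat))) yields NaN for every coefficient whose entry on
the diagonal of the inverted information matrix is negative, which can
happen when that matrix is not positive definite (for example when the cure
part of the model is poorly identified). The matrix below is hypothetical and
chosen only to reproduce the mechanism; it is not taken from the gfcure fit.

```r
# Hypothetical 2 x 2 "information matrix" that is NOT positive definite.
# It stands in for object$infomat purely for illustration.
infomat <- diag(c(4, -1))

# A negative eigenvalue signals that the matrix is not positive definite.
eigenvalues <- eigen(infomat, symmetric = TRUE)$values

# Inverting gives the estimated variances on the diagonal: 0.25 and -1.
variances <- diag(solve(infomat))

# The square root of the negative variance is NaN; R emits the same
# "NaNs produced" warning that appears in the summary output.
se <- sqrt(variances)
print(se)
```

If the blank standard errors in the summary correspond to such NaN values,
checking the eigenvalues of the fitted information matrix would be a
reasonable first diagnostic.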



Re: [R] read.csv and field containing single quotes

2012-03-27 Thread Benilton Carvalho
Thanks Henrique...

giving it a try now, but it'll take a good while, given the file size.

Cheers,
b

On 27 March 2012 02:35, Henrique Dallazuanna www...@gmail.com wrote:

 Benilton,

 Try this:

 read.table(textConnection(gsub(',', ',', gsub('^\|\$', ',
 readLines('../teste.csv', sep = ',', quote = ', header = TRUE)

 On Mon, Mar 26, 2012 at 8:09 PM, Benilton Carvalho
 beniltoncarva...@gmail.com wrote:
  I need to read in csv files, created by 3rd party, with fields
  containing single quotes (as shown below).
 
  header1,header2,header3,header4
  field1r1,field2r1,field3r1,field4r1
  field1r2,field2r2,field3r2PartA), field3r2PartB Very
 Long,field4r2
  field1r3,field2r3,field3r3,field4r3
 
 
  read.csv(filename, quote=\', header=TRUE) won't read the file
  represented above, unless the 3rd line has Very  (double quotes)
  instead of Very (single quotes)... and this is documented (scan() man
  page).
 
  Assuming that the creation of such csv files is something I'm not in a
  position to interfere with, are there (preferably, all in R)
  suggestions on how to handle such task?
 
  For the moment, I'm using my poor man's solution (below), but any
  tricks that would simplify this task would be great.
 
  Thank you very much,
 
  benilton
 
 
  parser - function(fname, header=TRUE, stringsAsFactors=FALSE){
 txt - readLines(fname)
 txt - gsub(^\|\$, , txt)
 txt - strsplit(txt, \,\)
 txt - do.call(rbind, lapply(txt, function(x) gsub(\, \\, x)))
 if (header){
 nms - txt[1,]
 txt - txt[-1,]
 }
 txt - as.data.frame(txt, stringsAsFactors=stringsAsFactors)
 if (header) names(txt) - nms
 txt
  }
 



 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O




Re: [R] read.csv and field containing single quotes

2012-03-27 Thread Rainer M Krug
On 27/03/12 01:09, Benilton Carvalho wrote:
 I need to read in csv files, created by 3rd party, with fields
 containing single quotes (as shown below).
 
 header1,header2,header3,header4
 field1r1,field2r1,field3r1,field4r1
 field1r2,field2r2,field3r2PartA), field3r2PartB Very Long,field4r2
 field1r3,field2r3,field3r3,field4r3

You could try under your OS, to

1) replace , with ', (assuming that the csv does not contain any'
2) read into R with sep=\'

If the file is huge, some in OS solution would be the best.

Cheers,

Rainer


 
 
 read.csv(filename, quote=\', header=TRUE) won't read the file
 represented above, unless the 3rd line has Very  (double quotes)
 instead of Very (single quotes)... and this is documented (scan() man
 page).
 
 Assuming that the creation of such csv files is something I'm not in a
 position to interfere with, are there (preferably, all in R)
 suggestions on how to handle such task?
 
 For the moment, I'm using my poor man's solution (below), but any
 tricks that would simplify this task would be great.
 
 Thank you very much,
 
 benilton
 
 
 parser - function(fname, header=TRUE, stringsAsFactors=FALSE){
 txt - readLines(fname)
 txt - gsub(^\|\$, , txt)
 txt - strsplit(txt, \,\)
 txt - do.call(rbind, lapply(txt, function(x) gsub(\, \\, x)))
 if (header){
 nms - txt[1,]
 txt - txt[-1,]
 }
 txt - as.data.frame(txt, stringsAsFactors=stringsAsFactors)
 if (header) names(txt) - nms
 txt
 }
 


-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, 
UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug



Re: [R] normalization of multi-value string variable

2012-03-27 Thread Alekseiy Beloshitskiy
Right,
I was also thinking about it, but since I have a few thousand unique words,
I'm not quite sure how it will work.

I just posted my question with more detailed description here:
http://stats.stackexchange.com/questions/25355/multi-value-categorical-attributes-how-r

Really interesting case :)

Thank you,
-Alex

From: Jessica Streicher [j.streic...@micromata.de]
Sent: 27 March 2012 15:24
To: Alekseiy Beloshitskiy
Cc: r-help@r-project.org
Subject: Re: [R] normalization of multi-value string variable

Hm.. so what you need is either

- one new feature for each activity that has a binary value
e.g.:
cust_id , cycling, swimming, cooking
1001 , 1  , 0, 1

- one new feature that has a value corresponding to a certain combination of 
activities
so if you had just the three activities you would have 2^3 possible values
I'm not sure how useful that would be though for the classification.

(Would need to think about how to compute this, i'm new to R as well. Would 
probably just iterate over the data)

If you make one feature per activity, and you end up having too many to 
properly compute the svm, you might try to reduce it by other methods, PCA 
comes to mind for example, though i never used that on binary data before.


Am 27.03.2012 um 11:34 schrieb Alekseiy Beloshitskiy:

Thank you so much, Jessica,

The specific of my case is that I have a very detailed variable 'Interests' 
which may have several thousands of possible values. Usually each customer has 
3-10 different interests. For example:
customer_id|...|interests
1001   |...| cycling, swimming, cooking
1002   |...| cooking, singing, dancing

The total number of possible distinct values is several thousand. I'm curious how 
to use these interests in SVM (represent as a vector of real numbers with 
several thousands of elements?).

If you have any ideas please let me know.


Thank you,
-Alex


From: Jessica Streicher [j.streic...@micromata.de]
Sent: 27 March 2012 11:18
To: Alekseiy Beloshitskiy
Subject: Re: [R] normalization of multi-value string variable

Well, not sure what you mean with scaling and normalizing strings, but if you 
want to represent the interests as numbers, you can do something like this:

n <- seq(1,length(unique(my_strings)))[factor(my_strings)]


Am 26.03.2012 um 18:50 schrieb Alekseiy Beloshitskiy:

Hi All,

I need to normalize/scale string variable which represents interests of 
customers (e.g., 'cycling, rollerblading, swimming' etc).

Does anybody know how to do this, I want then use it along with other numeric 
variables for SVM classification.

Appreciate for any advice.

-Alex



[R] RSqlite UPDATE command problem

2012-03-27 Thread Thomas Adams
All:

I am using RSqlite and want to be able to update individual values in a
record, such as with this simple example:

library(RSQLite)
drv <- dbDriver("SQLite")
con <- dbConnect(drv, "test.db")
my.data <- data.frame(countries=c("US","UK","Canada","Australia","NewZealand"),vals=c(52,36,74,10,98))
dbWriteTable(con, "testtable", my.data)
q <- dbReadTable(con, "testtable")
q

   countries vals
1         US   52
2         UK   36
3     Canada   74
4  Australia   10
5 NewZealand   98

So, say, I want to change the value for NewZealand to '21' from '98'

I've tried something like this:

sql <- "UPDATE testtable SET vals=21 WHERE countries='NewZealand'"
dbBeginTransaction(con)
dbGetPreparedQuery(con,sql)   # <== I get an error here
dbCommit(con)

using a different example for an INSERT command using a data frame 'data',
this construct is accepted:

dbGetPreparedQuery(con,sql,bind.data=data)

What do I need to do differently to use the UPDATE command?

Regards,
Tom


-- 

Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177

EMAIL:  thomas.ad...@noaa.gov
VOICE:  937-383-0528
FAX:937-383-0033



Re: [R] normalization of multi-value string variable

2012-03-27 Thread Jessica Streicher
Hm.. so what you need is either

- one new feature for each activity that has a binary value
e.g.:
cust_id , cycling, swimming, cooking
1001 , 1  , 0, 1

- one new feature that has a value corresponding to a certain combination of 
activities
so if you had just the three activities you would have 2^3 possible values
I'm not sure how useful that would be though for the classification.

(Would need to think about how to compute this, i'm new to R as well. Would 
probably just iterate over the data)

If you make one feature per activity, and you end up having too many to 
properly compute the svm, you might try to reduce it by other methods, PCA 
comes to mind for example, though i never used that on binary data before.
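
The first option above (one binary feature per activity) can be sketched in
base R as follows. The data frame, the column names and the comma separator
are assumptions taken from the example rows in this thread, not from real
data.

```r
# Hypothetical customers, mirroring the example rows quoted in this thread.
customers <- data.frame(
  customer_id = c(1001, 1002),
  interests   = c("cycling, swimming, cooking", "cooking, singing, dancing"),
  stringsAsFactors = FALSE
)

# Split each comma-separated string into a character vector of interests.
interest_list <- strsplit(customers$interests, ",\\s*")

# The vocabulary: every distinct interest, sorted for a stable column order.
all_interests <- sort(unique(unlist(interest_list)))

# One row per customer, one 0/1 column per interest.
indicators <- t(sapply(interest_list,
                       function(x) as.integer(all_interests %in% x)))
colnames(indicators) <- all_interests

features <- cbind(customers["customer_id"], indicators)
print(features)
```

With several thousand distinct interests this dense matrix is mostly zeros,
so a sparse representation may be worth considering before fitting the SVM.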


Am 27.03.2012 um 11:34 schrieb Alekseiy Beloshitskiy:

 Thank you so much, Jessica,
 
 The specific of my case is that I have a very detailed variable 'Interests' 
 which may have several thousands of possible values. Usually each customer 
 has 3-10 different interests. For example:
 customer_id|...|interests
 1001   |...| cycling, swimming, cooking
 1002   |...| cooking, singing, dancing
 
 The total number of possible distinct values is several thousand. I'm curious 
 how to use these interests in SVM (represent as a vector of real numbers with 
 several thousands of elements?).
 
 If you have any ideas please let me know.
 
 
 Thank you,
 -Alex
 
 From: Jessica Streicher [j.streic...@micromata.de]
 Sent: 27 March 2012 11:18
 To: Alekseiy Beloshitskiy
 Subject: Re: [R] normalization of multi-value string variable
 
 Well, not sure what you mean with scaling and normalizing strings, but if you 
 want to represent the interests as numbers, you can do something like this:
 
 n <- seq(1,length(unique(my_strings)))[factor(my_strings)]
 
 
 Am 26.03.2012 um 18:50 schrieb Alekseiy Beloshitskiy:
 
 Hi All,
 
 I need to normalize/scale string variable which represents interests of 
 customers (e.g., 'cycling, rollerblading, swimming' etc).
 
 Does anybody know how to do this, I want then use it along with other 
 numeric variables for SVM classification.
 
 Appreciate for any advice.
 
 -Alex
 




[R] Supperscript, subscript and double lines in the main/sub title and using greek letters

2012-03-27 Thread HJ YAN
Dear R-help,

 I am trying to express myself as best as I can here. If you also use LaTeX
to edit math reports or other languages with a similar editing method,
you'll see what I'm talking about. My sincere apologies if my question is
not clear enough to some extent; I'm also not able to provide my code
here because I don't know which one I can use...

When editing the title in R plots, such as using 'plot', or 'xyplot' in
'lattice', what method do you use to write Greek letters and make use of
superscript and subscript, e.g. to write mathematical expressions like
using LaTeX:

\sigma^2
\tau^{2s}
\mu_i
\pi_{2s}

Also I would like to learn how to make two lines in the main title or sub
title if the text I need is too long to put in a single line, e.g. is
there some R code/syntax allowing me to do something like in LaTeX to make
two lines in the title, for example using '//' or '\\' to separate the two
parts of the text I want to put in two lines?

I heard about using something like

plot(x,y, main=expression())

but from neither '?plot' nor '?expression' could I find comprehensive
information about what I need...

Many thanks!
HJ



Re: [R] Data indexing issue...

2012-03-27 Thread HJ YAN
Hi Jim!

Thank you so much for the very helpful hints!!
I am learning 'split' now and it seems very useful..

HJ

On Tue, Mar 27, 2012 at 12:58 PM, jim holtman jholt...@gmail.com wrote:

 Why not use 'split' and get all the groups at once:

 result <- split(Calender, list(Calender$Day, Calender$Season), drop = TRUE)
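
 A minimal sketch of both ideas discussed in this thread, on a small made-up
 data frame: combining two conditions with '&' inside which(), and using
 split() to get the row indices of every Day/Season group at once. The six
 rows below are invented for illustration; only the column names 'Day' and
 'Season' and the codes 'Wd'/'Wkend' come from the original question.

```r
# Made-up stand-in for the 365-row 'Calender' data frame.
Calender <- data.frame(
  Day    = c("Wd", "Wd", "Wkend", "Wd", "Wkend", "Wd"),
  Season = c("Winter", "Winter", "Winter", "Summer", "Summer", "Winter"),
  stringsAsFactors = FALSE
)

# Element-wise AND is '&' in R (see ?"&"); '|' is the element-wise OR.
WinterWdIndex <- which(Calender$Day == "Wd" & Calender$Season == "Winter")

# Splitting the row numbers by both columns gives one index vector per
# combination; drop = TRUE discards combinations that never occur.
groups <- split(seq_len(nrow(Calender)),
                list(Calender$Day, Calender$Season),
                drop = TRUE)

print(WinterWdIndex)
print(groups)
```

 The list elements are named by joining the two values with a dot, so the
 winter working days are also available as groups[["Wd.Winter"]].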

 On Tue, Mar 27, 2012 at 7:43 AM, Ivan Calandra
 ivan.calan...@u-bourgogne.fr wrote:
  Hi HJ,
 
  Take a look at ?"&"; this is probably what you're looking for.
 
  What you could also do is:
  Calender[Calender$Day=='Wd' & Calender$Season=="Winter", ]  # notice the
  last comma
 
  This will subset directly without using which(); it might be helpful to
 you.
 
  HTH,
  Ivan
 
  --
  Ivan CALANDRA
  Université de Bourgogne
  UMR CNRS/uB 6282 Biogéosciences
  6 Boulevard Gabriel
  21000 Dijon, FRANCE
  +33(0)3.80.39.63.06
  ivan.calan...@u-bourgogne.fr
  http://biogeosciences.u-bourgogne.fr/calandra
 
 
  Le 27/03/12 12:32, HJ YAN a écrit :
 
  Dear R-help,
 
  My dataset (which is a data frame, called 'Calender' here) includes 365
  rows representing 365 days for a year.  One column ('Season') contains
  factor data representing seasons, e.g. spring, summer, autumn and winter.
  Another column (called 'Day') contains data representing whether the day
  is a working day (I use 'Wd' for short here) or weekend (I use 'Wkend' for
  short here).
 
 
  I want to separate the index of the working days and weekends for each
  season. I used the R command 'which' before for one criterion, for example,
  if I use...


  WdIndex <- which(Calender$Day=='Wd')

  that will give a set of indices of working days in the year.
 
  I wonder whether in R I could use a combination of something such as 'AND',
  'OR' (e.g. in MySQL) to set 'multi-criteria' when selecting data. So for
  example...

  WinterWdIndex <- which(Calender$Day=='Wd' AND Calender$Season=="Winter")


  I know the above syntax is wrong, and I checked '?which', which did not
  give me an answer, and also tried '?AND' but it seems it doesn't exist at
  all...
 
 
  Many thanks!
  HJ
 



 --
 Jim Holtman
 Data Munger Guru

 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.





Re: [R] row, col function but for a list (probably very easy question, cannot seem to find it though)

2012-03-27 Thread MBoersma
Thanks guys for all the replies.

It is an urban myth that using 'apply' functions will deliver better   
performance than 'for' loops. It may even worsen performance or create   
obstacles when it is improperly used with dataframes. Most of the   
benefits come from improving readability and maintainability.

This is what I had to learn the hard way: apply functions made it go
slower :) I do understand them much better now, also in the light of some of
these ways of using them.

In the end my program became much faster by making the data frames matrices,
and even more by finally seeing the light (courtesy of a colleague for
getting me to think in the right direction) and making much more of it into
a matrix operation. I'm very happy with the results :).

So consider me helped!

Regards,
Mark



Re: [R] RSqlite UPDATE command problem

2012-03-27 Thread Benilton Carvalho
You probably want:

sql <- "UPDATE testtable SET vals=21 WHERE countries='NewZealand'"
dbGetQuery(con, sql)

instead...

b
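
For readers on a current DBI/RSQLite installation, a self-contained sketch of
the same UPDATE is given below. It assumes DBI's dbExecute(), which is meant
for statements that return no rows and reports how many rows were changed;
that function is an assumption about the installed version and is not what
was used in this thread. The in-memory database is used only so that the
example leaves no file behind.

```r
library(DBI)

# In-memory SQLite database, so nothing is written to disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

my.data <- data.frame(
  countries = c("US", "UK", "Canada", "Australia", "NewZealand"),
  vals      = c(52, 36, 74, 10, 98),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "testtable", my.data)

# Plain UPDATE: dbExecute() returns the number of affected rows.
n_changed <- dbExecute(
  con, "UPDATE testtable SET vals = 21 WHERE countries = 'NewZealand'"
)

# The same kind of statement with bound parameters instead of pasted values.
n_changed_uk <- dbExecute(
  con, "UPDATE testtable SET vals = ? WHERE countries = ?",
  params = list(40, "UK")
)

q <- dbReadTable(con, "testtable")
print(q)

dbDisconnect(con)
```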

On 27 March 2012 14:18, Thomas Adams thomas.ad...@noaa.gov wrote:

 All:

 I am using RSqlite and want to be able to update individual values in a
 record, such as with this simple example:

 library(RSQLite)
 drv <- dbDriver("SQLite")
 con <- dbConnect(drv, "test.db")

 my.data <- data.frame(countries=c("US","UK","Canada","Australia","NewZealand"),vals=c(52,36,74,10,98))
 dbWriteTable(con, "testtable", my.data)
 q <- dbReadTable(con, "testtable")
 q

    countries vals
 1         US   52
 2         UK   36
 3     Canada   74
 4  Australia   10
 5 NewZealand   98

 So, say, I want to change the value for NewZealand to '21' from '98'

 I've tried something like this:

 sql <- "UPDATE testtable SET vals=21 WHERE countries='NewZealand'"
 dbBeginTransaction(con)
 dbGetPreparedQuery(con,sql)   # <== I get an error here
 dbCommit(con)

 using a different example for an INSERT command using a data frame 'data',
 this construct is accepted:

 dbGetPreparedQuery(con,sql,bind.data=data)

 What do I need to do differently to use the UPDATE command?

 Regards,
 Tom


 --

 Thomas E Adams
 National Weather Service
 Ohio River Forecast Center
 1901 South State Route 134
 Wilmington, OH 45177

 EMAIL:  thomas.ad...@noaa.gov
 VOICE:  937-383-0528
 FAX:937-383-0033





Re: [R] Supperscript, subscript and double lines in the main/subtitle and using greekletters

2012-03-27 Thread Gerrit Eichner

Hi, HJ,

see

?plotmath

 Hth  --  Gerrit
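
As a rough illustration of what ?plotmath describes, the four LaTeX examples
from the question can be written as R expressions, and a title can be broken
over two lines either with "\n" in a plain character string or with atop()
inside an expression. The data and the title wording below are made up; the
null PDF device is used only so the sketch runs without opening a window.

```r
# plotmath counterparts of \sigma^2, \tau^{2s}, \mu_i and \pi_{2s}:
# '^' is superscript, '[ ]' is subscript, '*' juxtaposes two pieces.
title_sigma <- expression(sigma^2)
title_tau   <- expression(tau^{2 * s})
title_mu    <- expression(mu[i])
title_pi    <- expression(pi[2 * s])

pdf(NULL)  # null device: draw without creating a file or window

x <- 1:10
y <- x^2

# Text mixed with symbols in a single-line title.
plot(x, y, main = expression(paste("Variance ", sigma^2, " and mean ", mu[i])))

# Two-line title from a plain string: "\n" starts a new line.
plot(x, y, main = "A rather long title that is split\nover two lines")

# Two-line title that also contains symbols: atop() stacks two pieces.
plot(x, y, main = expression(atop("First line with " * tau^{2 * s},
                                  "second line with " * pi[2 * s])))

dev.off()
```

The same expressions can be passed to the 'main' or 'sub' arguments of
lattice functions such as xyplot.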

-
Dr. Gerrit Eichner   Mathematical Institute, Room 212
gerrit.eich...@math.uni-giessen.de   Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104  Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109http://www.uni-giessen.de/cms/eichner
-

On Tue, 27 Mar 2012, HJ YAN wrote:


Dear R-help,

I am trying to express myself as best as I can here. If you also use LaTeX
to edit math reports or other languages with a similar editing method,
you'll see what I'm talking about. My sincere apologies if my question is
not clear enough to some extent; I'm also not able to provide my code
here because I don't know which one I can use...

When editing the title in R plots, such as using 'plot', or 'xyplot' in
'lattice', what method do you use to write Greek letters and make use of
superscript and subscript, e.g. to write mathematical expressions like
using LaTeX:

\sigma^2
\tau^{2s}
\mu_i
\pi_{2s}

Also I would like to learn how to make two lines in the main title or sub
title if the text I need is too long to put in a single line, e.g. is
there some R code/syntax allowing me to do something like in LaTeX to make
two lines in the title, for example using '//' or '\\' to separate the two
parts of the text I want to put in two lines?

I heard about using something like

plot(x,y, main=expression())

but from neither '?plot' nor '?expression' could I find comprehensive
information about what I need...

Many thanks!
HJ





Re: [R] RSqlite UPDATE command problem

2012-03-27 Thread Thomas Adams
Benilton,

Thank you, you are quite right!!

Regards,
Tom

On Tue, Mar 27, 2012 at 9:35 AM, Benilton Carvalho 
beniltoncarva...@gmail.com wrote:

 You probably want:

 sql <- "UPDATE testtable SET vals=21 WHERE countries='NewZealand'"
 dbGetQuery(con, sql)

 instead...

 b

 On 27 March 2012 14:18, Thomas Adams thomas.ad...@noaa.gov wrote:

 All:

 I am using RSqlite and want to be able to update individual values in a
 record, such as with this simple example:

 library(RSQLite)
 drv <- dbDriver("SQLite")
 con <- dbConnect(drv, "test.db")

 my.data <- data.frame(countries=c("US","UK","Canada","Australia","NewZealand"),vals=c(52,36,74,10,98))
 dbWriteTable(con, "testtable", my.data)
 q <- dbReadTable(con, "testtable")
 q

    countries vals
 1         US   52
 2         UK   36
 3     Canada   74
 4  Australia   10
 5 NewZealand   98

 So, say, I want to change the value for NewZealand to '21' from '98'

 I've tried something like this:

 sql <- "UPDATE testtable SET vals=21 WHERE countries='NewZealand'"
 dbBeginTransaction(con)
 dbGetPreparedQuery(con,sql)   # <== I get an error here
 dbCommit(con)

 using a different example for an INSERT command using a data frame 'data',
 this construct is accepted:

 dbGetPreparedQuery(con,sql,bind.data=data)

 What do I need to do differently to use the UPDATE command?

 Regards,
 Tom


 --

 Thomas E Adams
 National Weather Service
 Ohio River Forecast Center
 1901 South State Route 134
 Wilmington, OH 45177

 EMAIL:  thomas.ad...@noaa.gov
 VOICE:  937-383-0528
 FAX:937-383-0033






-- 

Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177

EMAIL:  thomas.ad...@noaa.gov
VOICE:  937-383-0528
FAX:937-383-0033



Re: [R] copy the columns based on the code

2012-03-27 Thread Igor Sosa Mayor
:)

yes! I agree!

On Mon, Mar 26, 2012 at 10:51:17AM -0700, Bert Gunter wrote:
 Fortunes candidate?!
 -- Bert
 
 On Mon, Mar 26, 2012 at 10:24 AM, Sarah Goslee sarah.gos...@gmail.com wrote:
  The OP wrote
  The problem is that it gives the result that I want.
 
 Sarah's reply:  That's a new sort of problem.
 
 
 
 
 -- 
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 
 Internal Contact Info:
 Phone: 467-7374
 Website:
 http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
 

-- 
:: Igor Sosa Mayor :: joseleopoldo1...@gmail.com ::
:: GnuPG: 0x1C1E2890   :: http://www.gnupg.org/  ::
:: jabberid: rogorido  ::::




Re: [R] help in replacing for llop

2012-03-27 Thread R. Michael Weylandt
No idea what a mean median histogram is but you may wish to check
out ?tapply or library(plyr), both of which are designed for this
split-apply-combine paradigm.
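
A small base-R sketch of the split-apply-combine idea, using the twelve
records from the original post. It assumes the goal is the mean, the median
and a histogram of X1 and X2 within each state; the object names are
illustrative only.

```r
# The records quoted in the original question.
records <- data.frame(
  X1 = c(34, 9, 49, 60, 80, 60, 59, 88, 71, 65, 59, 60),
  X2 = c(72, 63, 31, 34, 73, 20, 87, 20, 66, 56, 16, 100),
  State = c(rep("state1", 5), rep("state2", 5), "state1", "state2"),
  stringsAsFactors = FALSE
)

# tapply(): apply a function to one column within each group.
mean_X1   <- tapply(records$X1, records$State, mean)
median_X1 <- tapply(records$X1, records$State, median)

# aggregate(): the same statistic for several columns at once.
means_by_state <- aggregate(cbind(X1, X2) ~ State, data = records, FUN = mean)

# split() + lapply(): one histogram object per state (plot = FALSE only
# computes the bins; drop it to draw the histograms).
hist_X1 <- lapply(split(records$X1, records$State), hist, plot = FALSE)

print(mean_X1)
print(median_X1)
print(means_by_state)
```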

Michael

On Tue, Mar 27, 2012 at 12:51 AM, arunkumar akpbond...@gmail.com wrote:
 Hi

 I have records like this

 X1      X2      State
 34      72      state1
 9       63      state1
 49      31      state1
 60      34      state1
 80      73      state1
 60      20      state2
 59      87      state2
 88      20      state2
 71      66      state2
 65      56      state2
 59      16      state1
 60      100     state2


 I want to get the summarize value like mean median histogram for X1 and X2
 based on state. I'm using FOR loop for this.  Is there any method to remove
 for loop and use apply or any alternatives


 -
 Thanks in Advance
        Arun



[R] I can't open a .nc file with the cdfcont function of the clim.pact package

2012-03-27 Thread anne-laure
Hello,

I am new at using R.

I would like to use the following functions of the clim.pact package:
cdfcont and retrieve.nc

I have installed the package clim.pact in Rstudio.
I have downloaded the ncdf pack from unicar (including ncdump and ncgen).
The ncdf file I'm working on is called essai2.nc

Here is what I get when I type the command ncinfo <- cdfcont("essai2.nc")

ncinfo cdfcont.txt' renvoie un statut 1
2: In min(nchar(str)) : aucun argument trouvé pour min ; Inf est renvoyé

I'm sorry it's in French!
If I try to translate:
Error in 1:nc: the argument has null length
Information message:
1: executing the command 'C:...' gives status 1
2: In min(nchar(str)) :no argument found for min ; Inf is sent back

Could someone please help me with this?

PS: I can open the document with the function open.ncdf of the ncdf package.

Regards



[R] two lmer questions - formula with related variables and output interpretation

2012-03-27 Thread Dragonwalker
Hello,
I have been attempting to set up a lme and have looked at numerous posts
including 'R's lmer cheat-sheet' as well as reading a number of papers and
other resources including R help, but I am still a little confused on how to
write my model (I thought I had it).

I have asked a number of questions on different forums; most of which have
been resolved.

My main concern right now is whether my model is correct. I studied broods
of precocial chicks and watched each chick every other day for five minutes
if possible. As chicks on the same day are completely non-independent the
mean was found for each brood for each day. Variables that were recorded
were the behaviours during that time and the habitats used.

There were seven broods. Three at one site and four at the other site. Only
one site had a brood that consistently used mudflats rather than oceanfront
habitats. As none of the data within a brood is truly independent, along
with the very small number of broods, it became impossible to use
conventional statistics to test the hypotheses and so it was suggested that
mixed-effects models would be the best option as it would not only allow for
all data to be used with a random effect of Brood ID to negate the
pseudo-replication but also let me look at partial use of mudflats in one of
the other broods that only used it periodically.

So, for this part of the analysis I would like to see which factors affect
the amount of time feeding. I set up a global model with ten fixed variables
plus (1|Brood). Site, tide.h.l, tide.inc.out, MF.vs.OF, Human Disturbance
Rate (HDr), Human Disturbance proportion of time(HDp), non-Human Disturbance
(two variables as for Human Disturbance) and Age and mean.foraging.rate. As
so:

gm1 <- lmer(Feeding ~ Site + tide.level + MF.vs.OF + HDr + HDp + NHDr + NHDp +
              Age + mean.for.rate + (1 | Brood),
            data = AllBrood, REML = TRUE)

I wished to put all the factors together to explore which ones really did
influence the time spent feeding, so I used the 'dredge' command to run all
possible combinations and then averaged the models with AICc delta < 2. I
was expecting that the proportion of time being disturbed (HDp and NHDp)
would be the most relevant since, by default, the greater the time in other
behaviours the less time there is for feeding. However, MF.vs.OF had a larger
effect than HDp and NHDp. This may be because MF observations did not
experience HDp at all, which may inflate the effect of this habitat.
Surprisingly, non-human disturbance rates rather than time had a greater
effect (but these are quite even among habitats).

The results of the model.avg are as follows:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)   102.7190     5.5300  18.575  < 2e-16 ***
HDr            -1.5495     0.3451   4.490 7.11e-06 ***
MF.vs.OF2      -7.6780     3.7507   2.047  0.04065 *
NHDp           -0.5145     0.2909   1.769  0.07695 .
NHDr           -1.4164     0.4663   3.037  0.00239 **
Site2           6.1477     2.7400   2.244  0.02485 *
tide.h.l2      -7.2546     2.6914   2.695  0.00703 **
tide.inc.out2  -5.8486     2.6187   2.233  0.02553 *
HDp            -0.3773     0.2732   1.381  0.16731
mean.for.rate  -0.3966     0.3220   1.232  0.21807
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Full model-averaged coefficients (with shrinkage):
  (Intercept)           HDr     MF.vs.OF2          NHDp          NHDr
   102.718962     -1.549499     -5.734171     -0.239550     -1.416373
        Site2     tide.h.l2 tide.inc.out2           HDp mean.for.rate
     5.336532     -7.254627     -5.848553     -0.044795     -0.081734

Relative variable importance:
  (Intercept)           Age           HDp           HDr mean.for.rate      MF.vs.OF
         1.00          0.00          0.12          1.00          0.21          0.75
         NHDp          NHDr          Site      tide.h.l  tide.inc.out
         0.47          1.00          0.87          1.00          1.00
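
One way to read the two sets of coefficients in this output, as a minimal sketch only: assuming the usual model-averaging definitions (the "with shrinkage" average counts a coefficient as zero in every model that omits the term, and relative importance is the summed Akaike weight of the models that contain it), the full average should equal the conditional estimate multiplied by the importance. The numbers are copied from the post; the vector names are purely illustrative.

```r
# Conditional estimates, copied from the model.avg output in the post
conditional <- c(MF.vs.OF2 = -7.6780, NHDp = -0.5145, Site2 = 6.1477,
                 HDp = -0.3773, mean.for.rate = -0.3966)
# Relative variable importance, as printed (rounded to two decimals)
importance <- c(MF.vs.OF2 = 0.75, NHDp = 0.47, Site2 = 0.87,
                HDp = 0.12, mean.for.rate = 0.21)
# Full model-averaged coefficients (with shrinkage), as reported
reported <- c(MF.vs.OF2 = -5.734171, NHDp = -0.239550, Site2 = 5.336532,
              HDp = -0.044795, mean.for.rate = -0.081734)

# Assumed relation: full average = conditional estimate * importance
approx_full <- conditional * importance

# The importance values are rounded, so allow for that rounding error
tolerance <- abs(conditional) * 0.005 + 1e-4
comparison <- data.frame(reported, approx_full,
                         agrees = abs(approx_full - reported) <= tolerance)
print(comparison)
```

Terms with importance 1.00 (HDr, NHDr and the tide variables) appear in every averaged model, which is consistent with their two estimates being identical.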

I was wondering whether there is a better way to formulate the model to allow
for this effect. Alternatively, could I keep the model as it is and infer that
the habitat effect may be partly driven by the amount of disturbance within
these habitats but, because the effect is larger, that other factors are also
at play? That would lead me on to the next model, which will use only
observations without disturbance, so that I can tease out the natural factors
affecting feeding behaviour. I was going to run this second model with Site
still as a fixed effect and then run it with (1|Site) to remove the site
effect (if one is found).

I would prefer to keep it simple as I really want to use a lme, but don't
have the understanding for more complex interactions.

I have also asked a question, which is yet to be answered, on Stats Stack
Exchange regarding the output of model.avg, as follows:

I have seen the Estimates described as the effect of the variable and this
is discussed in results sections as an important value to report (in regards
to the size of them and their direction (+ve/-ve). (the paper I 

[R] Zero inflated GAMM

2012-03-27 Thread Bert Harris
Hi all,

I am planning to get Zuur et al.'s new book when it comes out, but until
then I was wondering if anyone could suggest examples of zero inflated or
hurdle GAMMs. I have count data with many zeros, non-linear relationships,
and site as a random effect.

Thank you!
Bert Harris, University of Adelaide

[[alternative HTML version deleted]]



Re: [R] Memory Utilization on R

2012-03-27 Thread Kurinji Pandiyan
Thank you for the modified script! I have now tried on different datasets
and it works very well and is dramatically faster than my original script!

I really appreciate the help.
Kurinji

On Fri, Mar 23, 2012 at 1:33 PM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:

 Taking a look at your script: there are some potential optimizations
 you can make:

 # Fine
 poi <- as.character(top.GSM396290) # 5000 characters
 x.data <- h1[, c(1, 7:9)] # 485577 obs of 4 variables

 # Pre-allocate the space
 x <- vector("list", 485577) # instead of x <- list()

 # Do the strsplit() work once, outside the loop, so you aren't doing it
 # 485577 times
 a <- strsplit(as.character(x.data[, "UCSC_REFGENE_NAME"]), ";")

 # Let's use an apply-style function instead of a for loop;
 # vapply is the fastest since we prespecify the return type.
 x.data[vapply(a, function(x) any(poi %in% x), logical(1)), ]

 I think this will do what you wanted (and hopefully much faster)

 Note that you could probably tune this further but I think this
 strikes a good balance between clarity and performance (for now)
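
For anyone who wants to try the idea without the original data, here is a small self-contained sketch. The gene names and the four-row data frame are made up for illustration; only the column name UCSC_REFGENE_NAME and the object names poi and x.data come from the thread.

```r
# Toy stand-ins for the objects in the thread (all values are invented)
poi <- c("GENE1", "GENE4")
x.data <- data.frame(
  id = 1:4,
  UCSC_REFGENE_NAME = c("GENE1;GENE2", "GENE3", "GENE2;GENE4", ""),
  stringsAsFactors = FALSE
)

# Split every gene string once, outside any loop
a <- strsplit(x.data[, "UCSC_REFGENE_NAME"], ";", fixed = TRUE)

# One TRUE/FALSE per row: does the row mention any gene of interest?
hit <- vapply(a, function(genes) any(poi %in% genes), logical(1))

# Logical indexing replaces the list-filling loop and do.call(rbind, ...)
x <- x.data[hit, ]
print(x)
```

Because the rows are selected in one step, there is no growing list and no row-by-row rbind, which is where the original script spent its time.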

 Hope this helps,

 Michael

 On Fri, Mar 23, 2012 at 11:52 AM, Kurinji Pandiyan
 kurinji.pandi...@gmail.com wrote:
 
  Thank you for the input.
 
  As it were, I realized that my script is utilizing a lot more memory than
  I claimed - it was initially using 3 GB but has gone up to 20.24 active but
  29.63 assigned to the R session.

  The script has run overnight and now I don't think it is active anymore
  since I keep getting the error message that I am out of startup disk space
  for application memory.

  I am attaching screen shots of my RAM usage distribution (given that there
  is no fluctuation in the usage by the R session I believe it is not running
  anymore) and of my available HD.
 
 
 
 
 
  Here is my script -
 
  poi <- as.character(top.GSM396290) # 5000 characters
  x.data <- h1[, c(1, 7:9)] # 485577 obs of 4 variables
  head(x.data)

  x <- list()

  for (i in 1:485577) {
    a <- as.character(x.data[i, "UCSC_REFGENE_NAME"])
    a <- unlist(strsplit(a, ";"))
    if (any(poi %in% a)) x[[i]] <- x.data[i, ]
  }

  # this step completed in a few hours

  x <- do.call(rbind, x) # this step has been running overnight and is still stuck
 
  Thanks, I really appreciate the help.
  Kurinji
 
  On Thu, Mar 22, 2012 at 10:44 PM, R. Michael Weylandt
  michael.weyla...@gmail.com wrote:
 
  Well... what makes you think you are hitting memory constraints then?
  If you have significantly less than 3GB of data, it shouldn't surprise
  you if R never needs more than 3GB of memory.
 
  You could just be running your scripts inefficiently...it's an extreme
  example, but all the memory and gigaflopping in the world can't speed
  this up (by much):
 
  for(i in seq_len(1e6)) Sys.sleep(10)
 
  Perhaps you should look into profiling tools or parallel
  computation...if you can post a representative example of your
  scripts, we might be able to give performance pointers.
 
  Michael
 
  On Fri, Mar 23, 2012 at 1:33 AM, Kurinji Pandiyan
  kurinji.pandi...@gmail.com wrote:
   Yes, I am.
  
   Thank you,
   Kurinji
  
   On Mar 22, 2012, at 10:27 PM, R. Michael Weylandt
   michael.weyla...@gmail.com wrote:
  
   Use 64bit R?
  
   Michael
  
   On Thu, Mar 22, 2012 at 5:22 PM, Kurinji Pandiyan
   kurinji.pandi...@gmail.com wrote:
   Hello,
  
   I have a 32 GB RAM Mac Pro with a 2*2.4 GHz quad core processor and 2 TB
   storage. Despite this having so much memory, I am not able to get R to
   utilize much more than 3 GB. Some of my scripts take hours to run, but I
   would think they would be much faster if more memory were utilized. How do I
   optimize the memory usage of R on my Mac Pro?
  
   Thank you!
   Kurinji
  
 
 




Re: [R] Using MuMIn - error message

2012-03-27 Thread Dragonwalker
Hello Mike,

I don't think I did, but I fixed the issue by loading each package before
use. The second issue was solved by removing a variable that was used to
create two other categorical variables. I think it must have been
recognising this.

Thanks for the help.

--
View this message in context: 
http://r.789695.n4.nabble.com/Using-MuMIn-error-message-tp4500236p4508901.html
Sent from the R help mailing list archive at Nabble.com.



[R] matrix(unlist(strsplit())) 'missing value' issue

2012-03-27 Thread MaartenJacobs
I'm still an R noob; I have just had a couple of lectures about it in our
research master.

There is a Deal or No Deal experiment that I have to write some code for.
Someone wrote a website to gather the data and write it to a .xlsx file.
There are separate files for separate participants, so first I have to import
the separate data files. I do that like this:
# Merge the xlsx files into one data frame
alldata <- rbind(read.xlsx('experimentdata.xlsx', 1),
                 read.xlsx('experimentdata_1.xlsx', 1),
                 read.xlsx('experimentdata_2.xlsx', 1)
                 # etc.: read.xlsx('filepath', 1)
                 )

The website is poorly written and some of the variables are not convenient.
I have the variables 'bankoffer.1', 'bankoffer.3', 'bankoffer.5' etc.
These variables look like the following:
alldata$bankoffer.1
[1] 246000:accepted    267000:notaccepted 20:notaccepted
Levels: 246000:accepted 267000:notaccepted 20:notaccepted

alldata$bankoffer.3
[1] 999                429000:notaccepted 48000:notaccepted
Levels: 999 429000:notaccepted 48000:notaccepted

The problem is that the values in the cells are awkward: they consist of, for
example, '246000:accepted'. I would like to decompose that so that 246000 is
in one variable and accepted in another.

No problem, just do this:
as.data.frame(matrix(unlist(strsplit(as.character(alldata$bankoffer.1), ":")),
                     ncol = 2, byrow = TRUE))
      V1          V2
1 246000    accepted
2 267000 notaccepted
3     20 notaccepted

However, when there are missing values, as in bankoffer.3, there is a
problem:

as.data.frame(matrix(unlist(strsplit(as.character(alldata$bankoffer.3), ":")),
                     ncol = 2, byrow = TRUE))
           V1     V2
1         999 429000
2 notaccepted  48000
3 notaccepted    999
Warning message:
In matrix(unlist(strsplit(as.character(alldata$bankoffer.3), ":")),  :
  data length [5] is not a sub-multiple or multiple of the number of rows [3]

R does not encounter a ':' in the 999 and therefore places the 429000 in
the second column; it should, however, be in the first one. Like this:
      V1          V2
1    999         999
2 429000 notaccepted
3  48000 notaccepted

How can I tell R to place 999 in both columns when it encounters a 999?
Any other solution to my problem is also good. For example, I thought about
making R append ':999' whenever it encounters 999, as a sort of workaround
for the problem, but I have no idea how to do that.
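
A minimal base-R sketch of the workaround described here, assuming the missing-value code is always a bare value with no colon. The factor is rebuilt by hand from the values shown in the post, and the column names amount and decision are invented for illustration.

```r
# Rebuild the example column from the post (values copied from bankoffer.3)
bankoffer <- factor(c("999", "429000:notaccepted", "48000:notaccepted"))

x <- as.character(bankoffer)

# Entries without a colon get their own value appended, e.g. "999" -> "999:999"
no_colon <- !grepl(":", x, fixed = TRUE)
x[no_colon] <- paste(x[no_colon], x[no_colon], sep = ":")

# Every element now splits into exactly two pieces, so rbind is safe
parts <- do.call(rbind, strsplit(x, ":", fixed = TRUE))
result <- data.frame(amount = parts[, 1], decision = parts[, 2],
                     stringsAsFactors = FALSE)
print(result)
```

Padding the short entries before splitting keeps every row at two fields, so nothing is recycled into the wrong column and the warning goes away.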

I hope I made it a little clear what the problem is and what I eventually
want. If not please ask.

Greetings Maarten

--
View this message in context: 
http://r.789695.n4.nabble.com/matrix-unlist-strsplit-missing-value-issue-tp4509065p4509065.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Exporting a data.frame to excel using sqlSave - adds a character ' to values

2012-03-27 Thread Juliette Fabre
Hello, 

I encountered a situation similar to the one described by Tal above:

I use the RODBC library to export multiple dataframes into different sheets
of an Excel file.
My dataframes contain Character, Date and Numeric columns.

library(RODBC)
channel <- odbcConnectExcel(xls.file = myXlsFile, readOnly = FALSE)
sqlSave(channel, data, tablename = "Table1", rownames = FALSE, colnames = TRUE)
odbcClose(channel)

When exported into Excel, all of my cells start with the ' character
(which is different from Tal's situation, where only non-numeric cells
started with the ' character).
I need the columns that contain numeric data or dates to be written in
the appropriate format so that they can be manipulated (graphics etc).

I found a macro that formats all the sheets in the appropriate way, but I
would like to discover why even my numeric data (type numeric in R) are
written as text.

Regards, 

Juliette




--
View this message in context: 
http://r.789695.n4.nabble.com/Exporting-a-data-frame-to-excel-using-sqlSave-adds-a-character-to-values-tp1016523p4509108.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] row, col function but for a list (probably very easy question, cannot seem to find it though)

2012-03-27 Thread David Winsemius


On Mar 27, 2012, at 3:37 AM, peter dalgaard wrote:



On Mar 26, 2012, at 17:33 , David Winsemius wrote:


The usual approach to that problem is to use sapply:

x <- list()
x <- sapply(1:10, function(z) x[[z]] <- 1:z )


Yikes!

If that works, it is only by coincidence (The pre-assignment to  
x only serves the purpose of allowing the [[-assignment inside the  
anonymous function, but the assignment is to a local copy which is  
deleted on exit, and the return value is the rhs of the assignment.)


Well, maybe not by pure coincidence. There are really two rhs's and it  
was because of the outer assignment of the values to 'x' that it  
worked as intended. My error is in propagating the notion that  
assignments to named objects inside the function will survive outside  
the function.


 x <- list(); y <- list()
 y <- sapply(1:10, function(z) x[[z]] <- 1:z )
 x
list()




Please:

x <- lapply(1:10, function(z) 1:z)

or even

x <- lapply(1:10, seq_len)


Yes, I see the error of my ways. I wonder how many times I have been  
in this state of sin in the past?
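
The scoping point in this exchange can be checked directly with a short sketch. It uses three elements instead of ten to keep the output small; the names y and x2 are only illustrative.

```r
x <- list()

# The assignment inside the anonymous function changes a local copy of x;
# what sapply collects is the value of the assignment, i.e. 1:z
y <- sapply(1:3, function(z) x[[z]] <- 1:z)

# The global x is untouched
print(length(x))

# The direct form produces the same list without the misleading assignment
x2 <- lapply(1:3, seq_len)
print(identical(y, x2))
```

sapply cannot simplify elements of different lengths, so it falls back to returning a list here, which is why the first version appeared to work.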




--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



David Winsemius, MD
West Hartford, CT



[R] R extract parts

2012-03-27 Thread MSousa

Good Afternoon, 

I believe R has a more effective solution to my problem than using a loop.
I have the following set of data and need to extract some sections from it.


user  pos  communications  source  v_destine
7       1             109      22         22
7       2             100      22         22
7       3             214      22         22
7       4             322      22         22
7       5           69920      22        161
7       6              68     161         97
7       7             196      97         97
7       8             427      97         22
7       9             460      22         22
7      10             307      22         22
7      11            9582      22         22
7      12           55428      22         22
7      13            9192      22         22
7      14              19      22         22

My idea is that whenever a communications value greater than 1000 occurs, the
data are cut there and the rows in between are extracted as a section.
In the example data set, the value exceeds 1000 at positions 5, 11, 12 and 13.
I would like to get results like this:
user  sector  source  destine  count  average
7     1       22      22       4      186.25   # (109+100+214+322)/4
7     2       161     97       1      68
7     2       97      97       1      196
7     2       97      22       1      427
7     2       22      22       2      383
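
A loop-free sketch in base R, under two assumptions that the post does not settle: a new sector starts at each run of values above 1000, and rows are grouped by sector, source and destination. The sector, count and average columns and the helper names are invented for illustration.

```r
# The example data from the post
d <- data.frame(
  user = 7,
  pos = 1:14,
  communications = c(109, 100, 214, 322, 69920, 68, 196, 427, 460, 307,
                     9582, 55428, 9192, 19),
  source    = c(22, 22, 22, 22, 22, 161, 97, 97, 22, 22, 22, 22, 22, 22),
  v_destine = c(22, 22, 22, 22, 161, 97, 97, 22, 22, 22, 22, 22, 22, 22)
)

big <- d$communications > 1000
# A new sector begins at the first row of each run of large values
run_start <- big & !c(FALSE, head(big, -1))
d$sector <- cumsum(run_start) + 1

# Drop the large rows, then summarise each sector/source/destination group
small <- d[!big, ]
groups <- split(small, list(small$sector, small$source, small$v_destine),
                drop = TRUE)
res <- do.call(rbind, lapply(groups, function(g) {
  data.frame(user = g$user[1], sector = g$sector[1], source = g$source[1],
             destine = g$v_destine[1], count = nrow(g),
             average = mean(g$communications), first_pos = g$pos[1])
}))

# Restore the original row order and drop the helper column
res <- res[order(res$first_pos), names(res) != "first_pos"]
rownames(res) <- NULL
print(res)
```

Under these assumptions the trailing row at position 14 forms a third sector, which the expected output in the post does not list, and the last group of sector 2 averages 383.5 rather than 383.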



--
View this message in context: 
http://r.789695.n4.nabble.com/R-extract-parts-tp4509042p4509042.html
Sent from the R help mailing list archive at Nabble.com.



[R] Constructing Distance matrix for hclust

2012-03-27 Thread Vinod Hegde
Hi,

I have similarity values between pairs of strings in a MySQL database.
I need to construct a distance matrix that hclust can take in order to
cluster the strings. Most of the examples I came across show how to construct
the distance matrix using the dist function.

How can I construct the distance matrix from the data in the MySQL database?

Thanks a lot for any help.
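
A possible approach, sketched in base R. The data frame stands in for the result of a database query; the table layout, the column names s1, s2 and sim, and the example strings are all assumptions, and the similarities are assumed to lie between 0 and 1 so that 1 - similarity can serve as a distance.

```r
# Hypothetical query result, one row per unordered pair of strings, e.g. from
#   SELECT s1, s2, sim FROM similarities   (table and columns are made up)
pairs <- data.frame(
  s1  = c("apple", "apple", "apply"),
  s2  = c("apply", "ample", "ample"),
  sim = c(0.8, 0.6, 0.5),
  stringsAsFactors = FALSE
)

labels <- sort(unique(c(pairs$s1, pairs$s2)))
sim <- matrix(NA_real_, length(labels), length(labels),
              dimnames = list(labels, labels))
diag(sim) <- 1                               # a string is identical to itself
sim[cbind(pairs$s1, pairs$s2)] <- pairs$sim  # fill one triangle by name
sim[cbind(pairs$s2, pairs$s1)] <- pairs$sim  # mirror it into the other

# Convert similarities to distances and hand them to hclust
d <- as.dist(1 - sim)
hc <- hclust(d, method = "average")
print(hc$merge)
```

as.dist() accepts any symmetric matrix, so dist() is not needed when the pairwise values are already known; any pair missing from the query would stay NA and has to be filled in before clustering.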



[R] lasso constraint

2012-03-27 Thread yx78
In the package lasso2, there is a Prostate Data. To find coefficients in the
prostate cancer example we could impose L1 constraint on the parameters. 

The code is:
data(Prostate)
p.mean <- apply(Prostate, 2, mean)
pros <- sweep(Prostate, 2, p.mean, "-")
p.std <- apply(pros, 2, var)
pros <- sweep(pros, 2, sqrt(p.std), "/")
pros[, "lpsa"] <- Prostate[, "lpsa"]
l1ce(lpsa ~ ., pros, bound = 0.44)

I can't figure out where the 0.44 comes from. The paper says it was chosen
by generalized cross-validation and that it is the optimal choice.

paper name: Regression Shrinkage and Selection via the Lasso 

author: Robert Tibshirani 
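
The centring and scaling steps in the code above can be checked on a toy data frame without installing lasso2. This sketch covers only the standardisation, not the choice of the bound; the data and the names toy, centred and std are invented for illustration.

```r
# Made-up data: two predictors and a response y
toy <- data.frame(a = c(1, 2, 3, 4),
                  b = c(10, 30, 20, 60),
                  y = c(0.5, 1.5, 1.0, 2.5))

# Centre every column, then divide by its standard deviation
col.mean <- apply(toy, 2, mean)
centred <- sweep(toy, 2, col.mean, "-")
col.sd <- sqrt(apply(centred, 2, var))
std <- sweep(centred, 2, col.sd, "/")

# Put the response back on its original scale, as the Prostate code does
std[, "y"] <- toy[, "y"]
print(std)
```

The predictors end up with mean 0 and standard deviation 1, which puts the L1 bound on a comparable footing across coefficients.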



--
View this message in context: 
http://r.789695.n4.nabble.com/lasso-constraint-tp4508998p4508998.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Supperscript, subscript and double lines in the main/subtitle and using greekletters

2012-03-27 Thread David Winsemius


On Mar 27, 2012, at 9:39 AM, Gerrit Eichner wrote:


Hi, HJ,

see

?plotmath

Hth  --  Gerrit

-
Dr. Gerrit Eichner   Mathematical Institute, Room 212

On Tue, 27 Mar 2012, HJ YAN wrote:


Dear R-help,

I am trying to express myself as best I can here. If you also use LaTeX
to edit math reports, or other languages with a similar editing method,
you'll see what I'm talking about. My sincere apologies if my question is
not clear enough; I'm also not able to provide my code here because I don't
know which code I could use...

When editing the title in R plots, such as with 'plot', or 'xyplot' in
'lattice', what method do you use to write Greek letters and to make use of
superscripts and subscripts, e.g. to write mathematical expressions like
these in LaTeX:

\sigma^2
\tau^{2s}
\mu_i
\pi_{2s}

I would also like to learn how to put two lines in the main title or subtitle
if the text I need is too long to fit on a single line. Is there some R
code/syntax that lets me do something like LaTeX's line breaks, for example
using '//' or '\\' to separate the two parts of the text I want on two lines?

I heard about using something like

plot(x,y, main=expression())

but from neither '?plot' nor '?expression' could I find comprehensive
information about what I need...


The plotmath environment (not the correct term) will not accept the usual
EOL \n marker for new lines. You can cobble together a substitute (at least
for the two-line problem) using the plotmath `atop` function.


plot(1, 1, main = expression(atop("laaahhh" ~ tau, "bllleeehhh" ~ epsilon)))


Notice the need for a plotmath connector such as ~ or * between the text and
the unquoted greeks.
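
Applied to the four LaTeX expressions from the original question, a sketch along the same lines might look like this. The title wording is invented, and the null PDF device is there only so that the example runs without a display.

```r
# Two-line title with Greek letters, a superscript and subscripts:
# ^ gives superscripts, [ ] gives subscripts, * joins pieces without a space
main_title <- expression(
  atop("Variances " * sigma^2 * " and " * tau^{2 * s},
       "means " * mu[i] * " and " * pi[2 * s])
)

pdf(NULL)                      # null device; drop this line to draw on screen
plot(1, 1, main = main_title)
dev.off()
```

Quoted strings are drawn as plain text, while bare names such as sigma, tau, mu and pi are rendered as the corresponding Greek letters.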


--

David Winsemius, MD
West Hartford, CT



Re: [R] How to test for the difference of means in population, please help

2012-03-27 Thread Greg Snow
You should use mixed effects modeling to analyze data of this sort.
This is not a topic that has generally been covered by introductory
classes, so you should consult with a professional statistician on
your problem, or educate yourself well beyond the novice level (this
takes more than just reading 1 book, a few classes would be good to
get to this level, or intense study of several books).

Since everything is balanced nicely, you could average over the 4
repeats and use a 2 sample t test (assuming the assumptions hold, your
sample data would be fine) comparing the 2 sets of 400 means.  This
will test for a general difference in the overall means, but ignores
other information and hypotheses that may be important (which is why
the mixed effects model approach is much preferred).
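
The simpler of the two analyses can be sketched with the same kind of simulated data as in the question. The matrices are renamed cond_c and cond_b here to avoid masking the base function c(), and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Simulated data as in the question: 400 points x 4 repeats per condition
cond_c <- matrix(sample(1:20, 1600, replace = TRUE), 400, 4)
cond_b <- matrix(sample(1:20, 1600, replace = TRUE), 400, 4)

# Average over the 4 repeats, giving one mean per point
c_means <- rowMeans(cond_c)
b_means <- rowMeans(cond_b)

# Two-sample t test on the two sets of 400 means (Welch version by default)
tt <- t.test(c_means, b_means)
print(tt)
```

This tests only the overall difference in means; the between-repeat variation that a mixed effects model would estimate is averaged away.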

On Tue, Mar 27, 2012 at 1:13 AM, ali_protocol
mohammadianalimohammad...@gmail.com wrote:
 Dear all,

 Novice in statistics.

 I have 2 experimental conditions. Each condition has ~400 points as its
 response. Each condition is done in 4 repeats (so I have 2 x 400 x 4
 points).

 I want to compare the means of two conditions and test whether they are same
 or not. Which test should I use?

 #populations
 c = matrix (sample (1:20,1600, replace= TRUE), 400 ,4)
 b = matrix (sample (1:20,1600, replace= TRUE), 400 ,4)

 #means of repeats
 c.mean= apply (c,2, mean)
 b.mean= apply (b,2,mean)

 #mean of experiment
 c.mean.all= mean (c)
 b.mean.all= mean (b)

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/How-to-test-for-the-difference-of-means-in-population-please-help-tp4508089p4508089.html
 Sent from the R help mailing list archive at Nabble.com.




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Supperscript, subscript and double lines in the main/subtitle and using greekletters

2012-03-27 Thread mlell08
The title() function also has a 'line' parameter with which you can specify
the margin line on which the text should be displayed.
The number of margin lines around the figure region of the plot can be set
before plotting with par(mar=c(bottom,left,top,right)), in units of text
lines. Margin lines are also used by par(mgp=...) and mtext().
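
A small sketch of this approach: enlarge the top margin, then place two title() calls on different margin lines. The margin sizes, line positions and title texts are arbitrary choices for illustration, and the null PDF device is used only so the example runs without a display.

```r
pdf(NULL)                               # null device; drop to draw on screen

old <- par(mar = c(5, 4, 6, 2) + 0.1)   # six text lines of top margin
plot(1, 1)

# Each title() call puts its text on the requested margin line
title(main = "First line of the title", line = 3.5)
title(main = expression(sigma^2 ~ "on the second line"), line = 1.5)

new_mar <- par("mar")                   # margins that were in effect
par(old)                                # restore the previous settings
dev.off()
```

Because each line is a separate title() call, plain text and plotmath expressions can be mixed freely between the lines.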

Regards!



Re: [R] Memory Utilization on R

2012-03-27 Thread Alekseiy Beloshitskiy
Guys, let me add my five cents to your interesting discussion.

I have a ~10 GB txt file with training data for my model. It has about 150
million rows of 12 variables.
I load it into memory by running just this one line:

train <- read.table(file = "/training.txt")

While loading, it takes ~28 GB of RAM (and about 2 hours to finish); once the
data are loaded, rsession takes ~14 GB.
I can't even imagine how much it will take when I run svm training on this
data set. Is there any optimization to decrease the time required for loading
the data into memory?
I use an x64 box with 32 GB of RAM.
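
One commonly suggested direction, sketched on a small temporary file: tell read.table() the column types and the approximate row count up front, so that it does not have to guess types or grow its buffers. The assumption that all 12 columns are numeric is made only for this example and is not stated in the post.

```r
# Write a small stand-in file (1000 rows x 12 numeric columns, made-up data)
path <- tempfile(fileext = ".txt")
toy <- data.frame(matrix(runif(12 * 1000), ncol = 12))
write.table(toy, path, row.names = FALSE, col.names = FALSE)

train <- read.table(
  file = path,
  colClasses = rep("numeric", 12),  # skip type guessing (assumed types)
  nrows = 1000,                     # known or slightly over-estimated row count
  comment.char = ""                 # do not scan each line for comments
)
str(train)
```

Whether this is enough for a 150-million-row file is untested here; the arguments themselves are the ones documented in ?read.table for reducing memory use on large files.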

Thank you,
-Alex



Re: [R] two lmer questions - formula with related variables and output interpretation

2012-03-27 Thread Dragonwalker
I realised that I removed the link to the question but forgot to remove the
text regarding it. Sorry. I am not sure if I am supposed to link to other
forums, but I can add the links as needed (as the format is clearer).

I actually have one more question though in regards to which data to use.
If it is better to just report the estimates and CIs then should I use those
with shrinkage instead, and if so, does anyone know how I can get the CIs
for these rather than just the regular CIs. I apologise if I am asking too
many questions within one post.

Rachel

--
View this message in context: 
http://r.789695.n4.nabble.com/two-lmer-questions-formula-with-related-variables-and-output-interpretation-tp4508876p4509334.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Memory Utilization on R

2012-03-27 Thread R. Michael Weylandt
Note that you can actually drop the line defining the big list x. I
thought it would be needed, but it turns out to be unnecessary after
cleaning up the second half: cutting off that allocation might save
you even more time.

Best,
Michael

On Tue, Mar 27, 2012 at 11:14 AM, Kurinji Pandiyan
kurinji.pandi...@gmail.com wrote:
 Thank you for the modified script! I have now tried on different datasets
 and it works very well and is dramatically faster than my original script!

 I really appreciate the help.
 Kurinji

 On Fri, Mar 23, 2012 at 1:33 PM, R. Michael Weylandt
 michael.weyla...@gmail.com wrote:

 Taking a look at your script: there are a some potential optimizations
 you can do:

  # Fine
 poi - as.character(top.GSM396290) #5000 characters
 x.data - h1[,c(1,7:9)] # 485577 obs of 4 variables

 # Pre-allocate the space
 x - vector(list, 485577) # x - list()

 # Do the a stuff once outside the loop so you aren't doing it 485577
 times
 a - strsplit(as.character(x.data[, UCSC_REFGENE_NAME]), ;)

 # Lets use an apply statement instead of a for loop
 # vapply is the fastest since we prespecify the return type.
 x.data[vapply(a, function(x) any(poi %in% x), logical(1)), ]

 I think this will do what you wanted (and hopefully much faster)

 Note that you could probably tune this further but I think this
 strikes a good balance between clarity and performance (for now)

 Hope this helps,

 Michael

 On Fri, Mar 23, 2012 at 11:52 AM, Kurinji Pandiyan
 kurinji.pandi...@gmail.com wrote:
 
  Thank you for the input.
 
  As it were, I realized that my script is utilizing a lot more memory
  than
  I claimed - it was initially using 3 GB but has gone up to 20.24 active
  but
  29.63 assigned to the R session.
 
  The script has run overnight and now I don't think it is active anymore
  since I keep getting the error message that I am out of startup disk
  space
  for application memory.
 
  I am attaching screen shots of my RAM usage distribution (given that
  there
  is no fluctuation in the usage by the R session I believe it is not
  running
  anymore) and of my available HD.
 
 
 
 
 
  Here is my script -

  poi <- as.character(top.GSM396290) # 5000 characters
  x.data <- h1[, c(1, 7:9)] # 485577 obs of 4 variables
  head(x.data)

  x <- list()

  for (i in 1:485577) {
    a <- as.character(x.data[i, "UCSC_REFGENE_NAME"])
    a <- unlist(strsplit(a, ";"))
    if (any(poi %in% a)) x[[i]] <- x.data[i, ]
  }

  # this step completed in a few hours

  x <- do.call(rbind, x) # this step has been running overnight and is still stuck
 
  Thanks, I really appreciate the help.
  Kurinji
 
  On Thu, Mar 22, 2012 at 10:44 PM, R. Michael Weylandt
  michael.weyla...@gmail.com wrote:
 
  Well... what makes you think you are hitting memory constraints then?
  If you have significantly less than 3GB of data, it shouldn't surprise
  you if R never needs more than 3GB of memory.
 
  You could just be running your scripts inefficiently...it's an extreme
  example, but all the memory and gigaflopping in the world can't speed
  this up (by much):
 
  for(i in seq_len(1e6)) Sys.sleep(10)
 
  Perhaps you should look into profiling tools or parallel
  computation...if you can post a representative example of your
  scripts, we might be able to give performance pointers.
 
  Michael
 
  On Fri, Mar 23, 2012 at 1:33 AM, Kurinji Pandiyan
  kurinji.pandi...@gmail.com wrote:
   Yes, I am.
  
   Thank you,
   Kurinji
  
   On Mar 22, 2012, at 10:27 PM, R. Michael Weylandt
   michael.weyla...@gmail.com wrote:
  
   Use 64bit R?
  
   Michael
  
   On Thu, Mar 22, 2012 at 5:22 PM, Kurinji Pandiyan
   kurinji.pandi...@gmail.com wrote:
   Hello,
  
   I have a 32 GB RAM Mac Pro with a 2*2.4 GHz quad core processor and 2 TB
   of storage. Despite having so much memory, I am not able to get R to
   use much more than 3 GB. Some of my scripts take hours to run, but I
   would think they would be much faster if more memory were used. How do I
   optimize R's memory usage on my Mac Pro?
  
   Thank you!
   Kurinji
  
 
 





Re: [R] copy the columns based on the code

2012-03-27 Thread MSousa


Hello,

This code works perfectly:
   temp <- merge(travel, city, by.x="Source", by.y="cod")
   result <- merge(temp, city, by.x="Destine", by.y="cod")

The problem was in the construction of the data frame: there was an extra
closing parenthesis in
city<-rbind(city,data.frame(city="Lisbon",cod=3))),

I tried to delete the post, but I couldn't.
As I have little experience in R, I still make some mistakes.
I use read.table to load the data frame; the way shown in the post was just
the quickest way I found to describe the problem.
The forum has been a great help to me.

Thanks









Re: [R] copy the columns based on the code

2012-03-27 Thread jim holtman
yet another way:

 city <- data.frame(city="Barcelona", cod=1)
 city <- rbind(city, data.frame(city="Madrid", cod=2))
 city <- rbind(city, data.frame(city="Lisbon", cod=3))
 city <- rbind(city, data.frame(city="Milan", cod=4))
 city <- rbind(city, data.frame(city="London", cod=5))

 travel <- data.frame(pos=1, Source=1, Destine=2)
 travel <- rbind(travel, data.frame(pos=1, Source=1, Destine=3))
 travel <- rbind(travel, data.frame(pos=2, Source=3, Destine=4))
 travel <- rbind(travel, data.frame(pos=3, Source=2, Destine=4))
 travel <- rbind(travel, data.frame(pos=4, Source=1, Destine=3))

 travel$city <- city$city[match(travel$Source, city$cod)]
 travel$city_destine <- city$city[match(travel$Destine, city$cod)]

 travel
  pos Source Destine      city city_destine
1   1      1       2 Barcelona       Madrid
2   1      1       3 Barcelona       Lisbon
3   2      3       4    Lisbon        Milan
4   3      2       4    Madrid        Milan
5   4      1       3 Barcelona       Lisbon






-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] Plot of function seems to cut off near edge of domain

2012-03-27 Thread Chad Mills
Ah, thanks.  I am new to R and was unaware of the from/to parameters for
the plot function.  I thought xlim and ylim served that purpose.  Thanks
again!

-Chad

On Tue, Mar 27, 2012 at 3:31 AM, Matthieu Dubois matth...@gmail.com wrote:

 Dear Chad,

 your problem is linked to (1) the function returning NaNs for x values
 greater than 50, and (2) the fact that the function is evaluated at a
 predefined number of points.

 Calling plot() on a function object is basically a wrapper for curve(). Your
 function g() is evaluated over the whole xlim domain, which returns NaN
 values for x > 50 (try g(60)). In addition, curve() splits the x interval
 (here from 0 to 60) into a predefined number of points (n=101 is the default,
 see help(curve)) at which the function is evaluated. In your code, the
 function is evaluated at the values x <- seq(0, 60, length=101), and the
 g(x) that are not NaN are plotted. The largest x value (from the sequence)
 that doesn't return a NaN is max(x[!is.nan(g(x))]), which is 49.8.

 One way to solve it is to explicitly specify the domain used to evaluate the
 function, by using the "from" and "to" arguments that are passed to curve():

 # Figure 2, with xlim beyond the radius of the circle
 plot(g, axes=FALSE, from=0, to=50, xlim=c(0, 60), ylim=c(0, 60))
 axis(1, pos=0)
 axis(2, pos=0)
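
The grid effect described above can be checked numerically. This is only a sketch: the definition of g() is not shown in the thread, so a quarter circle of radius 50 is assumed here, based on the "radius of the circle" comment.

```r
# Assumed form of g(); the original definition is not part of this thread.
g <- function(x) sqrt(50^2 - x^2)

# The 101 evaluation points that curve() would use when xlim = c(0, 60).
x <- seq(0, 60, length.out = 101)

# g() returns NaN (with a warning) wherever x > 50.
y <- suppressWarnings(g(x))

# Largest grid point at which g() is still defined.
last_finite_x <- max(x[!is.nan(y)])
print(last_finite_x)
```

Since the grid step is 0.6, the value 50 itself is never evaluated, which is why the plotted curve appears to stop just short of the axis.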

 HTH

 Matthieu

 Matthieu Dubois
 Post-doctoral researcher
 Psychology Department
 Université Libre de Bruxelles



Re: [R] Memory Utilization on R

2012-03-27 Thread R. Michael Weylandt
It's really not good etiquette to thread-jack, but generally, the
more you can tell read.table (particularly via the colClasses, nrows,
as.is, and stringsAsFactors arguments), the faster it will be able to
read things, because it can skip various otherwise-necessary checks.
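
As a rough sketch of that advice, the call below states the column types and the row count up front. The file, its three columns, and their types are invented for illustration; the real training file has 12 variables whose types are not given in the thread.

```r
# Hypothetical small stand-in for the large training file.
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(a = 1:5,
                       b = c(0.5, 1.5, 2.5, 3.5, 4.5),
                       c = letters[1:5]),
            tmp, row.names = FALSE, col.names = FALSE, quote = FALSE)

train <- read.table(tmp,
                    colClasses = c("integer", "numeric", "character"),
                    nrows = 5,              # known (or slightly overestimated) row count
                    comment.char = "",      # do not scan for comment characters
                    stringsAsFactors = FALSE)
str(train)
```

With colClasses given, read.table does not have to guess each column's type from the text, and nrows lets it allocate storage once instead of growing it.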

Michael

On Tue, Mar 27, 2012 at 12:07 PM, Alekseiy Beloshitskiy
abeloshits...@velti.com wrote:
 Guys, let me add my two cents to your interesting discussion.

 I have a ~10 GB txt file with training data for my model. It has about 150
 million rows of 12 variables.
 When I load it into memory (running just this one line!):

 train <- read.table(file="/training.txt")

 it takes ~28 GB of RAM while loading (and about 2 hours to finish), and
 once the data are loaded, rsession takes ~14 GB.
 I can't even imagine how much it will take when I run svm training on this
 data set. Is there any optimization to decrease the time required for loading
 the data into memory?
 I use an x64 box with 32 GB of RAM.

 Thank you,
 -Alex

 

[R] What error distribution should I use?

2012-03-27 Thread Lívia Dorneles Audino
I'm trying to fit a GLMM to identify the relationship between insect
species richness and fragment size, isolation and time (different years).
I already tried to analyse it using a Poisson error distribution, but I
always get the following warning:
glm.fit: fitted probabilities numerically 0 or 1 occurred

This is probably happening because my dataset has a lot of zeros. So, what
error distribution should I use?

-- 
Lívia



[R] ignore error getting next result

2012-03-27 Thread C Lin

Dear All,
 
How do I ignore an error and still get the result of the next iteration?
I am trying to run wilcox.test in a loop; when the test fails, I would like to
continue with the next iteration and get its p-value.
I tried tryCatch and try, but then I cannot retrieve the p-value when the
test does not fail.
 
sample code:

test2=list(numeric(0),c(10,20));
test1=list(c(1),c(1,2,3,4));
for (i in 1:2){
 wtest=wilcox.test(test1[[i]],test2[[i]])
}
 
i=1 will fail, I want to ignore this and get the pvalue for i=2.
 
Thanks,
Lin   


Re: [R] lasso constraint

2012-03-27 Thread Weidong Gu
Hi,

your code has errors: apply function only has 1 or 2 as margin.

bound is used as the tuning parameter that limits the sum of the absolute
coefficients. lasso runs on a grid of values of the tuning parameter for
varying strengths of shrinkage, so each tuning value may yield
a different set of coefficients. Cross validation is used to
estimate the value of the tuning parameter that gives the smallest
error (MSE or deviance) on test data.

Weidong Gu



On Tue, Mar 27, 2012 at 10:35 AM, yx78 yangx...@gmail.com wrote:
 In the package lasso2, there is a Prostate data set. To find the coefficients
 in the prostate cancer example we can impose an L1 constraint on the parameters.

 The code is:
 data(Prostate)
 p.mean <- apply(Prostate, 5, mean)
 pros <- sweep(Prostate, 5, p.mean, "-")
 p.std <- apply(pros, 5, var)
 pros <- sweep(pros, 5, sqrt(p.std), "/")
 pros[, "lpsa"] <- Prostate[, "lpsa"]
 l1ce(lpsa ~ ., pros, bound = 0.44)

 I can't figure out where 0.44 comes from. The paper says it came
 from generalized cross-validation and that it is the optimal choice.

 Paper: Regression Shrinkage and Selection via the Lasso

 Author: Robert Tibshirani





[R] read.octave fails with data from Octave 3.2.X

2012-03-27 Thread Helios de Rosario
Hi,

I'm afraid that the function read.octave() from package foreign has
some problems with the ASCII data format exported by newer versions of
Octave (later than 3.2.X). It fails even for a case as simple as:

[Octave code:]
octave:1> x=1;
octave:2> save -ascii testdata.mat x

[Now in R:]
 octavedata <- read.octave('testdata.mat')
Lost warning messages
In read_octave_unknown(con, type) : cannot handle unknown type ''

In this simple case I guess that the problem is that newer versions of
Octave append two blank lines after each variable, and this confuses the
current implementation of read.octave().

The problem is worse if the saved variables include other types such as
structs or strings. The new syntax of the MAT files is not recognized
by read.octave().

Of course, it's always difficult to keep this kind of function working
when the external program changes its specification for saving
variables, but it would be nice if the maintainers of foreign could at
least solve the issue of blank lines. That way, it would still be
possible to import simple data types such as scalars and matrices.

Otherwise, I suppose that a workaround is to save the data in binary
(Matlab) format, then load it with Octave 3.2.X, and save it in text
format from that version.

 sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252
[3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] foreign_0.8-49




-- 
Helios de Rosario Martínez
 
 Researcher


INSTITUTO DE BIOMECÁNICA DE VALENCIA
Universidad Politécnica de Valencia • Edificio 9C
Camino de Vera s/n • 46022 VALENCIA (ESPAÑA)
Tel. +34 96 387 91 60 • Fax +34 96 387 91 69
www.ibv.org



Re: [R] lasso constraint

2012-03-27 Thread Bert Gunter
Inline:

On Tue, Mar 27, 2012 at 10:00 AM, Weidong Gu anopheles...@gmail.com wrote:
 Hi,

 your code has errors: apply function only has 1 or 2 as margin.

FALSE.  Please re-read the Help files. It works as expected with
arbitrary higher dim arrays.

-- Bert






-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Supperscript, subscript and double lines in the main/subtitle and using greekletters

2012-03-27 Thread HJ YAN
Sorry, my last message was sent before it was complete.
Please see below.

On Tue, Mar 27, 2012 at 5:36 PM, HJ YAN yhj...@googlemail.com wrote:

 Thank you very much Gerrit, for the nice hints!

 I have just done some more googling and research on this and am trying to
 answer it myself...

 Below is the code that works for a two-line title (adapted from Gerrit's
 hints) and for some of the formats listed below (e.g. 1 and 3, but not 2 and 4):

 (1) \sigma^2
 (2) \tau^{2s}
 (3) \mu_i
 (4) \pi_{2s}

 plot(1:3, ylab = expression("Superscript in greek letters (" * mu^2 ~ "m)")
    , xlab = expression("Subscript in greek letters" ~ mu[2] * " " ~ pi)
    , main = expression(atop("Happy Easter", "to all R-Helpers")))


 For Greek letters, I am still a bit confused about when a * is needed,
 though. It seems a * is needed in front of a Greek letter when it follows
 quoted text inside 'expression("...")', and that a * is not required when the
 Greek letter appears outside the double quotes, e.g.

when using just 'expression(...)'.  Again, a * is needed when making
a subscript as shown above...

It seems ~ is reserved for making spaces before/between Greek letters.
What if we need a literal ~ in the title? ~ is standard notation in
statistics for expressing "is from" when writing down a distribution, e.g.
'X~N(0,1)'...
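
A short sketch of how the four LaTeX-style formats, a two-line title, and a literal tilde might be written with plotmath (see ?plotmath). The label texts are made up for illustration, and this is one possible way to write them rather than the only one.

```r
# Quoted strings are plain text; bare names such as sigma become Greek letters.
# "*" joins two pieces with no space, "~" joins them with a space.
title_two_lines <- expression(atop("First line of a long title",
                                   "second line of the title"))
x_label <- expression(sigma^2 * "," ~ tau^{2 * s})  # \sigma^2 and \tau^{2s}
y_label <- expression(mu[i] * "," ~ pi[2 * s])      # \mu_i and \pi_{2s}
sub_label <- expression(X %~% N(0, 1))              # %~% draws "is distributed as"

plot(1:3, main = title_two_lines, xlab = x_label, ylab = y_label,
     sub = sub_label)
```

The braces in tau^{2 * s} and the product inside pi[2 * s] group several symbols into one superscript or subscript, which covers formats (2) and (4) above.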

HJ

















 On Tue, Mar 27, 2012 at 2:39 PM, Gerrit Eichner 
 gerrit.eich...@math.uni-giessen.de wrote:

 Hi, HJ,

 see

 ?plotmath

  Hth  --  Gerrit

 ---------------------------------------------------------------------
 Dr. Gerrit Eichner                   Mathematical Institute, Room 212
 gerrit.eich...@math.uni-giessen.de   Justus-Liebig-University Giessen
 Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
 Fax: +49-(0)641-99-32109        http://www.uni-giessen.de/cms/eichner
 ---------------------------------------------------------------------



 On Tue, 27 Mar 2012, HJ YAN wrote:

  Dear R-help,

 I am trying to express myself as best I can here. If you also use
 LaTeX to write math reports, or other languages with a similar editing
 method, you'll see what I'm talking about. My sincere apologies if my
 question is not clear enough; I'm also not able to provide my code
 here because I don't know which one I can use...

 When editing the title in R plots, such as with 'plot', or 'xyplot' in
 'lattice', what method do you use to write Greek letters and to make use of
 superscripts and subscripts, e.g. to write mathematical expressions like
 these in LaTeX:

 \sigma^2
 \tau^{2s}
 \mu_i
 \pi_{2s}

 I would also like to learn how to put two lines in the main title or
 subtitle if the text I need is too long for a single line. Is there
 some R code/syntax that lets me do something like in LaTeX, for example
 using '//' or '\\' to separate the two parts of the text that I want
 on two lines?

 I heard about using something like

 plot(x,y, main=expression())

 but from neither '?plot' nor '?expression' could I find comprehensive
 information about what I need...

 Many thanks!
 HJ







[R] Rgdal package - get information

2012-03-27 Thread julio cesar oliveira

 Hi,

 I used
 GDALinfo("MOD13Q1.A2001049.h13v11.005.2007002215512.250m_16_days_EVI.tif")
 and got the results:

 rows        10
 columns     11
 bands       1
 origin.x    150701.4
 origin.y    7744897
 res.x       250
 res.y       250
 ysign       -1
 oblique.x   0
 oblique.y   0
 driver      GTiff
 projection  +proj=utm +zone=23 +south +datum=WGS84 +units=m +no_defs
 file        /MOD13Q1.A2001049.h13v11.005.2007002215512.250m_16_days_EVI.tif
 apparent band summary:
   GDType   Bmin  Bmax Bmean Bsd hasNoDataValue NoDataValue
 1  Int16 -32768 32767     0   0          FALSE           0
 Metadata:
 AREA_OR_POINT=Point
 TIFFTAG_SOFTWARE=MODIS Reprojection Tool  v4.1 March 2009



 How can I read the GDType information?


Thanks,

julio



[R] installing R 2.14.2

2012-03-27 Thread Heba S

Hello, I am trying to install a newer version of R (R 2.14.2) from this
link: http://cran.r-project.org/bin/macosx/
However, I am getting an error saying that it cannot be installed on my
computer. My Mac is running version 10.6.8. Can you please advise me on what
the problem is? I need the newer version to install the ggm package.
Thanks,
Heba
  


Re: [R] ignore error getting next result

2012-03-27 Thread David Winsemius


On Mar 27, 2012, at 12:56 PM, C Lin wrote:



Dear All,

How do I ignore an error and still get the result of the next iteration?
I am trying to run wilcox.test in a loop; when the test fails, I would
like to continue with the next iteration and get its p-value.
I tried tryCatch and try, but then I cannot retrieve the p-value when
the test does not fail.


sample code:

test2=list(numeric(0),c(10,20));
test1=list(c(1),c(1,2,3,4));
for (i in 1:2){
wtest=wilcox.test(test1[[i]],test2[[i]])
}

i=1 will fail, I want to ignore this and get the pvalue for i=2.


Please read the FAQ entry below. You would also be well advised to read
through the rest of the FAQ.


http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-capture-or-ignore-errors-in-a-long-simulation_003f
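
For the sample data in the question, the idea of catching the error and moving on could be sketched as follows. The variable name pvals and the choice of NA for failed tests are assumptions made for illustration; try() could be used in place of tryCatch().

```r
test2 <- list(numeric(0), c(10, 20))
test1 <- list(c(1), c(1, 2, 3, 4))

# One slot per iteration; NA marks iterations where the test failed.
pvals <- rep(NA_real_, length(test1))

for (i in seq_along(test1)) {
  # On error, return NULL instead of stopping the loop.
  res <- tryCatch(wilcox.test(test1[[i]], test2[[i]]),
                  error = function(e) NULL)
  if (!is.null(res)) pvals[i] <- res$p.value
}

print(pvals)
```

Storing the p-values in a pre-sized vector, rather than overwriting a single wtest object, keeps the results of the successful iterations available after the loop ends.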




Thanks,
Lin 


David Winsemius, MD
West Hartford, CT



[R] Help on predict.lm

2012-03-27 Thread Nederjaard
Hello, 

I'm new here, but will try to be as specific and complete as possible. I'm
trying to use “lm“ to first estimate parameter values from a set of
calibration measurements, and then later to use those estimates to calculate
another set of values with “predict.lm”.

First I have a calibration dataset of absorbance values measured from
standard solutions with known concentration of Bromide:

 stds
      abs conc
1 -0.0021    0
2  0.1003  200
3  0.2395  500
4  0.3293  800

On this small calibration series, I perform a linear regression to find the
parameter estimates of the relationship between absorbance (abs) and
concentration (conc):

 linear1 <- lm(abs~conc, data=stds)
 summary(linear1)

Call:
lm(formula = abs ~ conc, data = stds)

Residuals:
        1         2         3         4 
-0.012600  0.006467  0.020667 -0.014533 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept) 1.050e-02  1.629e-02   0.645  0.58527   
conc        4.167e-04  3.378e-05  12.333  0.00651 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.02048 on 2 degrees of freedom
Multiple R-squared: 0.987,  Adjusted R-squared: 0.9805 
F-statistic: 152.1 on 1 and 2 DF,  p-value: 0.00651 





Now I come with another dataset, which contains measured absorbance values
of Bromide in solution:

 brom
    hours     abs
1    -1.0  0.0633
2     1.0  0.2686
3     5.0  0.2446
4    18.0  0.2274
5    29.0  0.2091
6    42.0  0.1961
7    53.0  0.1310
8    76.0  0.1504
9    91.0  0.1317
10   95.5  0.1169
11  101.0  0.0977
12  115.0  0.1023
13  123.5  0.0879
14  138.5  0.0724
15  147.5  0.0564
16  163.0  0.0495
17  171.0  0.0325
18  189.0  0.0182
19  211.0  0.0047
20  212.5      NA
21  815.5 -0.2112
22  816.5 -0.1896
23  817.5 -0.0783
24  818.5  0.2963
25  819.5  0.1448
26  839.5  0.0936
27  864.0  0.0560
28  888.0  0.0310
29  960.5  0.0056
30 1009.0 -0.0163

The values in column brom$abs, measured at 30 successive points in time, need
to be converted to Bromide concentrations, using the previously established
relationship “linear1”.
At first, I thought it could be done by:

 predict.lm(linear1, brom$abs)
Error in eval(predvars, data, env) : 
  numeric 'envir' arg not of length one

But R gives the above error message. Then, after some searching around on
different forums and R communities (including this one), I learned that the
“newdata” argument of “predict.lm” actually needs to be a separate
data frame. Thus:

 mabs <- data.frame(Abs = brom$abs)
 predict.lm(linear1, mabs)
Error in eval(expr, envir, enclos) : object 'conc' not found

Again, R gives an error... probably because I made a mistake, but I truly fail
to see where. I hope somebody can explain to me clearly what I'm doing wrong
and what I should do instead.
Any help is greatly appreciated, thanks!
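
The second error above arises because the model was fitted as abs ~ conc, so predict() looks for a column named conc in newdata and returns absorbances, not concentrations. The sketch below shows this and one possible way to go from absorbance to concentration, by inverting the fitted calibration line. That inversion is an assumption about what is wanted here and not an answer taken from the thread; the names b, abs_new and conc_est are made up.

```r
stds <- data.frame(abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
                   conc = c(0, 200, 500, 800))
linear1 <- lm(abs ~ conc, data = stds)

# predict() needs a data frame with a column named like the predictor,
# and it returns the response (absorbance) for given concentrations.
print(predict(linear1, newdata = data.frame(conc = c(100, 400))))

# Inverse use of the calibration line: conc = (abs - intercept) / slope.
b <- coef(linear1)
abs_new <- c(0.0633, 0.2686, 0.2446)   # first three values of brom$abs
conc_est <- (abs_new - b[["(Intercept)"]]) / b[["conc"]]
print(conc_est)
```

An alternative with the same data would be to fit lm(conc ~ abs, data = stds) and call predict() with newdata = data.frame(abs = brom$abs); the two approaches generally give slightly different estimates.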


Re: [R] lasso constraint

2012-03-27 Thread Steve Lianoglou
Hi,

On Tue, Mar 27, 2012 at 10:35 AM, yx78 yangx...@gmail.com wrote:
 In the package lasso2, there is a Prostate data set. To find the coefficients
 in the prostate cancer example we can impose an L1 constraint on the parameters.

 The code is:
 data(Prostate)
 p.mean <- apply(Prostate, 5, mean)
 pros <- sweep(Prostate, 5, p.mean, "-")
 p.std <- apply(pros, 5, var)
 pros <- sweep(pros, 5, sqrt(p.std), "/")
 pros[, "lpsa"] <- Prostate[, "lpsa"]
 l1ce(lpsa ~ ., pros, bound = 0.44)

 I can't figure out where 0.44 comes from. The paper says it came
 from generalized cross-validation and that it is the optimal choice.

Yes, this is exactly how the optimal value for bound would be found.

Using the lasso2 package, you'll likely have to do a grid search over
possible values for `bound` in a cross validation setting and you pick
the one that fits the model best on the held out data over all your CV
folds.

If I were you, I'd use the glmnet package since it can calculate the
entire regularization path w/o having to do a grid search over the
bound (or lambda), making cross validation easier.
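
As a rough illustration of the glmnet route, here is a minimal sketch. It
assumes the glmnet package is installed and uses simulated data in place of
the standardized Prostate predictors; the names x, y, cv_fit and best_lambda
are made up for this example. Note that glmnet's lambda is a penalty weight,
not the relative bound that l1ce() uses, so the chosen value will not be
numerically comparable to 0.44.

```r
# Sketch only: simulated data stand in for the standardized predictors.
set.seed(1)
n <- 100
p <- 8
x <- matrix(rnorm(n * p), n, p)
y <- drop(x[, 1:3] %*% c(0.7, 0.3, -0.2)) + rnorm(n)

if (requireNamespace("glmnet", quietly = TRUE)) {
  # alpha = 1 selects the lasso penalty; cv.glmnet fits the whole lambda
  # path and scores every lambda by 10-fold cross-validation
  cv_fit <- glmnet::cv.glmnet(x, y, alpha = 1, nfolds = 10)
  best_lambda <- cv_fit$lambda.min        # lambda with the smallest CV error
  print(best_lambda)
  print(coef(cv_fit, s = "lambda.min"))   # coefficients at that lambda
}
```

cv_fit$lambda.1se is the usual alternative if a sparser model within one
standard error of the minimum is preferred.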

If you're confused about how you might use cross validation to find
the optimal value of the parameter(s) of the model you are building,
then it's time to pull yourself away from the keyboaRd and start doing
some reading, or (as Bert will likely tell you) consult your local
statistician.

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] readHTLMTable help

2012-03-27 Thread Lucas
Hello everyone.
I'm using this function to download some information from a website.
This is the URL:
http://164.77.222.61/climatologia/php/vientoMaximo8.php?IdEstacion=330007FechaIni=01-1-1980
If you go to that website you'll find a table with meteorological
information. One column is called Intesidad Máxima Diaria, and that is
the one I need.
I've been trying to extract that column, but I'm unable to do it.
First I tried simply downloading the complete table and then filtering it
to extract the column but, for some reason, when I call
a <- readHTMLTable(url), the table is downloaded in an unfriendly format and I
cannot tell the columns apart.

If anyone could help me I'd appreciate it.
Thank you.

Lucas.



[R] Convert day of year back into a date format.

2012-03-27 Thread Sam Albers
Hello,

I am having trouble figuring out how to convert a Day of Year integer
back into a Date format. For example I have the following:

date <-
c('2008-01-01','2008-01-02','2008-01-03','2008-01-04','2008-01-05','2008-01-06','2008-01-07',
'2008-01-08','2008-01-09','2008-01-10','2008-01-11','2008-01-12','2008-01-13','2008-01-14','2008-01-15',
'2008-01-16','2008-01-17','2008-01-18','2008-01-19','2008-01-20','2008-01-21','2008-01-22','2008-01-23')

## This is then converted into a number corresponding to the day of
## the year like so:

dayofyear <- strptime(date, format="%Y-%m-%d")$yday + 1

## Now my question is how do I get back to a date format (obviously
## omitting the year).
## The end result is that I'd like to be able to have axis labels as
## something like Month-Day or just Month
## instead of just integers, which aren't always intuitive for people,
## but I can't seem to figure out how to tell R
## to recognize an integer as a date.

Any suggestions?

Many thanks in advance!

Sam



Re: [R] ignore error getting next result

2012-03-27 Thread C Lin

As a matter of fact, I did read the FAQ. However, in the FAQ, coef() is used to 
return the coefficients of lm() if it succeeded. 
I cannot find a similar function for the p-value.
 

 CC: r-help@r-project.org
 From: dwinsem...@comcast.net
 To: bac...@hotmail.com
 Subject: Re: [R] ignore error getting next result
 Date: Tue, 27 Mar 2012 13:40:39 -0400
 
 
 On Mar 27, 2012, at 12:56 PM, C Lin wrote:
 
 
  Dear All,
 
  How do I ignore an error and still get the result of the next iteration?
  I am trying to run wilcox.test in a loop; when the test fails, I would 
  like to continue with the next iteration and get its p-value.
  I tried tryCatch and try, but I cannot retrieve the p-value when 
  the test does not fail.
 
  sample code:
 
  test2=list(numeric(0),c(10,20));
  test1=list(c(1),c(1,2,3,4));
  for (i in 1:2){
  wtest=wilcox.test(test1[[i]],test2[[i]])
  }
 
  i=1 will fail; I want to ignore this and get the p-value for i=2.
 
 Please read the FAQ entry. And you would be advised to read through the 
 rest of the FAQ as well.
 
 http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-capture-or-ignore-errors-in-a-long-simulation_003f
 
 
 
  Thanks,
  Lin 
 
 David Winsemius, MD
 West Hartford, CT
 
  


Re: [R] Help on predict.lm

2012-03-27 Thread Berend Hasselman

On 27-03-2012, at 19:24, Nederjaard wrote:

 Hello, 
 
 I'm new here, but will try to be as specific and complete as possible. I'm
 trying to use “lm“ to first estimate parameter values from a set of
 calibration measurements, and then later to use those estimates to calculate
 another set of values with “predict.lm”.
 
 First I have a calibration dataset of absorbance values measured from
 standard solutions with known concentration of Bromide:
 
 stds
       abs conc
 1 -0.0021    0
 2  0.1003  200
 3  0.2395  500
 4  0.3293  800
 
 On this small calibration series, I perform a linear regression to find the
 parameter estimates of the relationship between absorbance (abs) and
 concentration (conc):
 
 linear1 <- lm(abs~conc, data=stds)
 summary(linear1)
 
 Call:
 lm(formula = abs ~ conc, data = stds)
 
 Residuals:
         1         2         3         4 
 -0.012600  0.006467  0.020667 -0.014533 
 
 Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
 (Intercept) 1.050e-02  1.629e-02   0.645  0.58527   
 conc        4.167e-04  3.378e-05  12.333  0.00651 **
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
 
 Residual standard error: 0.02048 on 2 degrees of freedom
 Multiple R-squared: 0.987,  Adjusted R-squared: 0.9805 
 F-statistic: 152.1 on 1 and 2 DF,  p-value: 0.00651 
 
 
 
 
 
 Now I come with another dataset, which contains measured absorbance values
 of Bromide in solution:
 
 brom
     hours     abs
 1    -1.0  0.0633
 2     1.0  0.2686
 3     5.0  0.2446
 4    18.0  0.2274
 5    29.0  0.2091
 6    42.0  0.1961
 7    53.0  0.1310
 8    76.0  0.1504
 9    91.0  0.1317
 10   95.5  0.1169
 11  101.0  0.0977
 12  115.0  0.1023
 13  123.5  0.0879
 14  138.5  0.0724
 15  147.5  0.0564
 16  163.0  0.0495
 17  171.0  0.0325
 18  189.0  0.0182
 19  211.0  0.0047
 20  212.5  NA
 21  815.5 -0.2112
 22  816.5 -0.1896
 23  817.5 -0.0783
 24  818.5  0.2963
 25  819.5  0.1448
 26  839.5  0.0936
 27  864.0  0.0560
 28  888.0  0.0310
 29  960.5  0.0056
 30 1009.0 -0.0163
 
 The values in column brom$abs, measured on 30 subsequent points in time need
 to be calculated to Bromide concentrations, using the previously established
 relationship “linear1”.  
 At first, I thought it could be done by:
 
 predict.lm(linear1, brom$abs)
 Error in eval(predvars, data, env) : 
  numeric 'envir' arg not of length one
 
 But, R gives the above error message. Then, after some searching around on
 different fora and R-communities (including this one), I learned that the
 “newdata” in “predict.lm” actually needs to be coerced into a separate
 dataframe. Thus:
 
 mabs <- data.frame(Abs = brom$abs)
 predict.lm(linear1, mabs)
 Error in eval(expr, envir, enclos) : object 'conc' not found
 

There is no column with name conc in your dataframe mabs.

You regressed abs on conc. For prediction you need data for conc and not abs.
So provide data for conc. Or change the regression around: lm(conc ~ abs, 
data=stds) if that makes any sense.

What you did with mabs wouldn't have worked anyway because Abs is not the same 
as abs.
And it wasn't necessary.
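
To make the "change the regression around" option concrete, here is a
minimal sketch, not a recommendation on which direction is statistically
appropriate. It retypes the calibration data from the post (assuming the
first standard has conc 0, which is what the fitted residuals imply) and uses
the made-up names inverse_fit, new_abs and predicted_conc. The key point is
that the column in newdata must carry the same name as the predictor in the
formula, here 'abs'.

```r
# Calibration standards retyped from the post; conc = 0 for the first
# standard is an assumption inferred from the reported residuals
stds <- data.frame(abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
                   conc = c(0, 200, 500, 800))

# Regress concentration on absorbance so that 'abs' is the predictor
inverse_fit <- lm(conc ~ abs, data = stds)

# A few measured absorbances (first rows of 'brom'); the column must be
# called 'abs', exactly as in the model formula
new_abs <- data.frame(abs = c(0.0633, 0.2686, 0.2446))
predicted_conc <- predict(inverse_fit, newdata = new_abs)
print(predicted_conc)
```

With the full data this would be predict(inverse_fit, newdata = brom), since
brom already has a column named abs; the NA in row 20 simply yields an NA
prediction.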

Berend




Re: [R] ignore error getting next result

2012-03-27 Thread David Winsemius

On Mar 27, 2012, at 2:18 PM, C Lin wrote:

 As a matter of fact, I did read the FAQ. However, in the FAQ coef()  
 is used to return the coefficients of lm() if it succeeded.
 I cannot find similar function for pvalue.

So your question has nothing to do with the subject line? If you are  
trying to get information about the object returned by the wilcox.test  
function,  then you should be looking at the help page in the Value  
section for that function.
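
A small sketch of what that looks like in practice, using the i = 2 data
from the question (the names wt and p are made up for this example):

```r
# Two small samples (the i = 2 case from the question)
wt <- wilcox.test(c(1, 2, 3, 4), c(10, 20))

# The result is a list of class "htest"; names() lists its components,
# which are the ones described under "Value" in ?wilcox.test
print(names(wt))

# The p-value is stored in the 'p.value' component
p <- wt$p.value
print(p)
```

The same pattern (inspect with names() or str(), then extract with $) works
for most test functions that return an "htest" object.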

-- 
David.

David Winsemius, MD
West Hartford, CT




Re: [R] Convert day of year back into a date format.

2012-03-27 Thread Justin Haynes
There may very well be a better solution, but this works.

format(strptime(dayofyear, format="%j"), format="%m-%d")
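
One caveat: with only "%j" supplied, strptime() fills in the current year,
so days after February can shift by one depending on whether the current year
and the data year are both leap years. A possible alternative, sketched here
with made-up names (ref_year, back, labels), is to add the day number to
January 1 of the year the data came from:

```r
dates <- c("2008-01-01", "2008-01-15", "2008-02-29", "2008-12-31")
dayofyear <- as.POSIXlt(as.Date(dates))$yday + 1

# Day 1 corresponds to January 1 of the reference year, hence the "- 1"
ref_year <- 2008   # assumed: the year the data were collected (a leap year)
back <- as.Date(dayofyear - 1, origin = paste0(ref_year, "-01-01"))

labels <- format(back, "%m-%d")   # Month-Day labels, e.g. for axis()
print(labels)
```

format(back, "%b") would give abbreviated month names instead, if "just
Month" is wanted on the axis.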



Re: [R] installing R 2.14.2

2012-03-27 Thread Steve Lianoglou
Hi,

On Tue, Mar 27, 2012 at 1:03 PM, Heba S abehsun...@hotmail.com wrote:

 Hello, I am trying to install a newer version of R (R 2.14.2) from this 
 link: http://cran.r-project.org/bin/macosx/
 However, I am getting an error saying that it cannot be installed on my computer. My 
 Mac is running version 10.6.8. Can you please advise me what the problem is? I need the 
 newer version to install the ggm package.

If you want any meaningful help, you'll have to provide the exact
error that you're getting, so please reproduce the error message
(verbatim) in your follow up email.

Also let us know when during the installation process the error occurs.

Thanks,

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] ignore error getting next result

2012-03-27 Thread C Lin

I'm sorry. I do appreciate you are trying to help. However, what I am trying to 
do is not exactly the same as in FAQ.
 
If I do the following:
 
test2=list(numeric(0),c(10,20));
test1=list(c(1),c(1,2,3,4));
for (i in 1:2){
 tryCatch(wilcox.test(test1[[i]],test2[[i]]),error = function(e) NULL);
}

I cannot get the p-value of the test for i=2.

any other input? anyone?
 
Thanks,
Lin 






Re: [R] ignore error getting next result

2012-03-27 Thread David Winsemius


On Mar 27, 2012, at 2:36 PM, C Lin wrote:

I'm sorry. I do appreciate you are trying to help. However, what I  
am trying to do is not exactly the same as in FAQ.


If I do the following:

test2=list(numeric(0),c(10,20));
test1=list(c(1),c(1,2,3,4));
for (i in 1:2){
 tryCatch(wilcox.test(test1[[i]],test2[[i]]),error = function(e) NULL);
}
I cannot get the p-value of the test for i=2.


I say again READ THE HELP PAGE FOR wilcox.test (and I even suggested  
the section where you would find the answer.)



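test2=list(numeric(0),c(10,20));
test1=list(c(1),c(1,2,3,4));
res <- list()
for (i in 1:2){
 # store each p-value under its own index; a failed test is recorded as NA
 # (NA rather than NULL, because assigning NULL would drop the list element)
 res[[i]] <- tryCatch(wilcox.test(test1[[i]],test2[[i]])$p.value,
                      error = function(e) NA);
}
res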
--

David


David Winsemius, MD
West Hartford, CT



Re: [R] Help on predict.lm

2012-03-27 Thread Peter Ehlers


R tries hard to keep you from committing scientific abuse.
As stated, your problem seems to me akin to

1. Given that a man's age can be modelled as a function
   of the grayness of his hair,
2. predict a man's age from the temperature in Barcelona.

Your calibration relates 'abs' and 'conc'. Now you want
to predict 'abs' from _'hours'_ (I think). I suspect that
concentration is actually related to time and this is
the missing link that you'll have to provide.

BTW, I'm surprised that you didn't find the requirement
for 'newdata' to be a data frame on the predict.lm help
page - it's pretty clearly stated there.
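
If the aim is in fact to turn measured absorbances into concentrations using
the fitted calibration line, one option is to invert the line algebraically
rather than call predict(). This is only a sketch under that assumption: the
data are retyped from the post (with conc 0 assumed for the first standard),
and b, measured_abs and est_conc are made-up names.

```r
stds <- data.frame(abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
                   conc = c(0, 200, 500, 800))
linear1 <- lm(abs ~ conc, data = stds)

b <- coef(linear1)                      # b[1] = intercept, b[2] = slope
measured_abs <- c(0.0633, 0.2686, NA)   # NA propagates through the arithmetic

# Invert abs = b0 + b1 * conc  =>  conc = (abs - b0) / b1
est_conc <- unname((measured_abs - b[1]) / b[2])
print(est_conc)
```

This gives point estimates only; it says nothing about their uncertainty,
and with four standards that uncertainty is likely to be substantial.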

Peter Ehlers




Re: [R] SVM. How to use categorical attributes?

2012-03-27 Thread Steve Lianoglou
Hi,

On Tue, Mar 27, 2012 at 6:05 AM, Alekseiy Beloshitskiy
abeloshits...@velti.com wrote:
 Hi All,

 Here is the case. I want to build a classification model (SVM). Some of the
 variables for this model are categorical attributes that represent words
 (usually 3-10 words: the query for a Google search). For example:
 search_id | query_words                | .. | result
 ----------+----------------------------+----+-------
 1         | how,to,grow,tree           | .. | 4
 2         | smartfone,htc,buy,price    | .. | 7
 3         | buy,house,realty,london    | .. | 6
 4         | where,to,go,weekend,cinema | .. | 4
 ...
 As you can see, the words in a query are unordered and may occur in different
 queries. The total number of unique words across all queries is several thousand.
 The question is how to represent this variable (query_words) for use in an SVM.

 Thank you for any advice!

One approach is to wire up a "bag of words" type of design matrix.

That is to say the matrix has as many columns as there are unique
words. Each row is an observation (query), and the words that appear
in the query have a value of 1 (or you can count the number of times
each word appears).
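
A minimal base-R sketch of such a matrix, using the four example queries
from the question (words, vocab and bow are made-up names):

```r
queries <- c("how,to,grow,tree",
             "smartfone,htc,buy,price",
             "buy,house,realty,london",
             "where,to,go,weekend,cinema")

words <- strsplit(queries, ",", fixed = TRUE)   # one character vector per query
vocab <- sort(unique(unlist(words)))            # all distinct words = columns

# One row per query, one column per word; entries count occurrences
bow <- t(vapply(words,
                function(w) as.numeric(table(factor(w, levels = vocab))),
                numeric(length(vocab))))
colnames(bow) <- vocab
print(bow[, c("buy", "to")])
```

With several thousand distinct words a dense matrix like this gets large
quickly, so a sparse representation would probably be worth considering
before handing it to an SVM routine.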

You can maybe get smarter and try to group like words together, but
... now you'll have two problems ...

Hope you have lots of data!

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Help on predict.lm

2012-03-27 Thread Bert Gunter
FORTUNE!!!
-- Bert

On Tue, Mar 27, 2012 at 11:44 AM, Peter Ehlers ehl...@ucalgary.ca wrote:

 R tries hard to keep you from committing scientific abuse.
 As stated, your problem seems to me akin to

 1. Given that a man's age can be modelled as a function
   of the grayness of his hair,
 2. predict a man's age from the temperature in Barcelona.


...




Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] ignore error getting next result

2012-03-27 Thread William Dunlap
 test2=list(numeric(0),c(10,20));
 test1=list(c(1),c(1,2,3,4));
 for (i in 1:2){
  tryCatch(wilcox.test(test1[[i]],test2[[i]]),error = function(e) NULL);
 }
 
 I cannot get the p-value of the test for i=2.

You didn't store the results of wilcox.test anywhere.

First make it work for data that does not cause errors in wilcox.test:
  f0 <- function(list1, list2) {
      stopifnot(length(list1) == length(list2))
      sapply(seq_along(list1),
             function(i) wilcox.test(list1[[i]], list2[[i]])$p.value)
  }
  > f0( list(1:4, 5:7), list(11:12, (4:6)+.9))
  [1] 0.133 0.700

Then add the call to tryCatch so it works when there is a problem.  I use NA
instead of NULL as the output of the error function so it goes into the vector
of p.values.  Use NULL if you are returning the whole output of wilcox.test
instead of just the p.value component.

  f1 <- function(list1, list2) {
      stopifnot(length(list1) == length(list2))
      sapply(seq_along(list1),
             function(i) tryCatch(wilcox.test(list1[[i]], list2[[i]])$p.value,
                                  error = function(e) NA_real_))
  }
  > f1( list(1:4, 5:7), list(11:12, (4:6)+.9))
  [1] 0.133 0.700
  > f1( list(1:4, numeric(0), 5:7), list(11:12, 17, (4:6)+.9))
  [1] 0.133    NA 0.700

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com




Re: [R] Help on predict.lm

2012-03-27 Thread Peter Ehlers


R tries hard to keep you from committing scientific abuse.
As stated, your problem seems to me akin to

1. Given that a man's age can be modelled as a function
of the grayness of his hair,
2. predict a man's age from the temperature in Barcelona.

Your calibration relates 'abs' and 'conc'. Now you want
to predict 'abs' from 'hours' (I think). I suspect that
concentration is actually related to time and this is
the missing link that

BTW, I'm surprised that you didn't find the requirement
for 'newdata' to be a data frame on the predict.lm help
page - it's pretty clearly stated there.

Peter Ehlers


On 2012-03-27 10:24, Nederjaard wrote:

Hello,

I'm new here, but will try to be as specific and complete as possible. I'm
trying to use “lm“ to first estimate parameter values from a set of
calibration measurements, and then later to use those estimates to calculate
another set of values with “predict.lm”.

First I have a calibration dataset of absorbance values measured from
standard solutions with known concentration of Bromide:


stds

      abs conc
1 -0.0021    0
2  0.1003  200
3  0.2395  500
4  0.3293  800

On this small calibration series, I perform a linear regression to find the
parameter estimates of the relationship between absorbance (abs) and
concentration (conc):


linear1 <- lm(abs~conc, data=stds)
summary(linear1)


Call:
lm(formula = abs ~ conc, data = stds)

Residuals:
        1         2         3         4
-0.012600  0.006467  0.020667 -0.014533

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.050e-02  1.629e-02   0.645  0.58527
conc        4.167e-04  3.378e-05  12.333  0.00651 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.02048 on 2 degrees of freedom
Multiple R-squared: 0.987,  Adjusted R-squared: 0.9805
F-statistic: 152.1 on 1 and 2 DF,  p-value: 0.00651





Now I come with another dataset, which contains measured absorbance values
of Bromide in solution:


brom

    hours     abs
1    -1.0  0.0633
2     1.0  0.2686
3     5.0  0.2446
4    18.0  0.2274
5    29.0  0.2091
6    42.0  0.1961
7    53.0  0.1310
8    76.0  0.1504
9    91.0  0.1317
10   95.5  0.1169
11  101.0  0.0977
12  115.0  0.1023
13  123.5  0.0879
14  138.5  0.0724
15  147.5  0.0564
16  163.0  0.0495
17  171.0  0.0325
18  189.0  0.0182
19  211.0  0.0047
20  212.5      NA
21  815.5 -0.2112
22  816.5 -0.1896
23  817.5 -0.0783
24  818.5  0.2963
25  819.5  0.1448
26  839.5  0.0936
27  864.0  0.0560
28  888.0  0.0310
29  960.5  0.0056
30 1009.0 -0.0163

The values in column brom$abs, measured at 30 successive points in time, need
to be converted to Bromide concentrations, using the previously established
relationship “linear1”.
At first, I thought it could be done by:


predict.lm(linear1, brom$abs)

Error in eval(predvars, data, env) :
   numeric 'envir' arg not of length one

But, R gives the above error message. Then, after some searching around on
different fora and R-communities (including this one), I learned that the
“newdata” in “predict.lm” actually needs to be coerced into a separate
dataframe. Thus:


mabs <- data.frame(Abs = brom$abs)
predict.lm(linear1, mabs)

Error in eval(expr, envir, enclos) : object 'conc' not found

Again, R gives an error... probably because I made a mistake, but I truly fail
to see where. I hope somebody can explain to me clearly what I'm doing wrong
and what I should do instead.
Any help is greatly appreciated, thanks !
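
A minimal sketch of the two directions involved, assuming the first
calibration row reads abs = -0.0021, conc = 0 (which agrees with the printed
residuals). predict() on this model takes a data frame with a column named
'conc' and returns absorbance. Getting concentration from a measured
absorbance means inverting the fitted line instead. The names new_abs and
conc_hat are made up for this illustration, and plain inversion is only one
possible reading of what the poster wanted.

```r
# Calibration standards from the post
stds <- data.frame(abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
                   conc = c(0, 200, 500, 800))
linear1 <- lm(abs ~ conc, data = stds)

# Forward direction: newdata must be a data frame with a column named 'conc'
predict(linear1, newdata = data.frame(conc = c(100, 400)))

# Reverse direction: solve abs = b0 + b1 * conc for conc
new_abs  <- c(0.0633, 0.2686, 0.2446)   # first three brom$abs values
b        <- coef(linear1)
conc_hat <- (new_abs - b[["(Intercept)"]]) / b[["conc"]]
conc_hat
```

This ignores the uncertainty of the inverted estimate; a proper calibration
interval would need more than the point estimate shown here.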

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-on-predict-lm-tp4509586p4509586.html
Sent from the R help mailing list archive at Nabble.com.


[R] ZAGA predictions in GAMLSS

2012-03-27 Thread seefledermaus
Hello, 
I am modelling positive continuous data (including zeros) using the ZAGA 
distribution in GAMLSS and want to use the model for predictions. My final 
model includes smoothers (pb()) for the mu and nu parameter. 
First, I blindly used the default options for predictions but noticed that I 
do not have any zero values (or close to). Knowing this cannot be true, I 
learned that I also need the predictions for the other parameters (and not only 
mu as done by default), which I can extract e.g. with predictAll. 
My question is, how to combine all parameter values to calculate the expected 
value for one observation. 
 
It seems the function 'meanZAGA' does what I want, however not for new data. I 
tried to calculate the values I received with meanZAGA by hand in order to 
repeat it for predictions with new data but do not understand how to do it.
I would appreciate any advice.

Thank you very very much!
Cheers, 
Astrid 
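
A small sketch of how the parameters might be combined, assuming the usual
ZAGA reading in which nu is the probability of an exact zero and mu is the
mean of the gamma part, so that the expected value is (1 - nu) * mu. The
numbers below are invented; in practice mu and nu would be the corresponding
components of the predictions for the new data.

```r
# Hypothetical predicted parameters for three new observations
mu <- c(2.0, 5.0, 0.8)   # mean of the gamma (non-zero) part
nu <- c(0.1, 0.5, 0.9)   # probability of an exact zero

# Expected value of the zero-adjusted gamma response
expected <- (1 - nu) * mu
expected
```

The sigma parameter does not enter the mean; it would only matter for the
variance or for quantiles.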





Re: [R] read.octave fails with data from Octave 3.2.X

2012-03-27 Thread Helios de Rosario
I wrote in my previous message the following Octave code:

[Octave code:]
octave:1> x=1;
octave:2> save -ascii testdata.mat x

Forget the -ascii. It should be -text or nothing (-text is the
default).

By the way, read.octave() does not really fail (it does return a
value), but the result is somewhat corrupted: it contains the exported
x variable, plus other empty elements corresponding to the blank
lines, I think.

Helios
INSTITUTO DE BIOMECÁNICA DE VALENCIA
Universidad Politécnica de Valencia • Edificio 9C
Camino de Vera s/n • 46022 VALENCIA (ESPAÑA)
Tel. +34 96 387 91 60 • Fax +34 96 387 91 69
www.ibv.org


Re: [R] Plotting patient drug timelines using ggplot2 (or some other means) -- Help!!!

2012-03-27 Thread Paul Miller
Hello Dr. Winsemius,

Not sure how or if the use of NAs you describe applies to my case. I'll go back 
to this again when the ggplot2 book arrives. It may be that this will provide a 
helpful insight then.

Thanks,

Paul

--- On Fri, 3/23/12, David Winsemius dwinsem...@comcast.net wrote:

 On Mar 23, 2012, at 2:15 PM, Paul Miller wrote:
 
  Hi Michael and Petr,
  
  Apologize for my failure to grasp what you were saying.
 My code is up and running now.
  
  Noticed what might be a shortcoming of my ggplot code.
 I have some instances where a drug starts and stops and then
 starts and stops again. It looks like my graphs show just a
 single unbroken line segment though.
 
 Put in NA entries at times you do not want plotted. Not sure
 exactly how that gets handled in ggplot but since plotting
 nothing was the usual behavior in base and lattice
 graphics, I would think that would have gotten carried
 over.
 
 
  I ordered Hadley Wickham's ggplot2 book earlier today.
 So hopefully I'll be able to figure that out myself once the
 book arrives.
  
  Thank you Michael, Petr, and Bert for your help with
 this. Thanks especially to Michael for patiently answering
 all my questions over the last day or so.
  
  Paul
 
 
 David Winsemius, MD
 West Hartford, CT
 




Re: [R] Convert day of year back into a date format.

2012-03-27 Thread Prof Brian Ripley

On 27/03/2012 19:30, Justin Haynes wrote:

There may very well be a better solution, but this works.

format(strptime(dayofyear, format="%j"), format="%m-%d")


The answer depends on the year (think leap years), so I think you need

strptime(paste(2008, dayofyear), format="%Y %j")

Probably a better idea is

as.Date(dayofyear - 1, origin = "2008-01-01")

(as Jan 1 is day 1).
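
A short sketch of the as.Date() route with a few made-up day numbers, to show
why the year matters: 2008 is a leap year, so day 60 is 29 February. The
variable names doy, dates and labels are chosen for illustration only.

```r
# Day-of-year values, counted with 1 January as day 1
doy <- c(1, 32, 60, 366)

# Subtract 1 because the origin itself is day 1
dates <- as.Date(doy - 1, origin = "2008-01-01")

# Month-day labels, e.g. for axis annotation
labels <- format(dates, "%m-%d")
labels
# [1] "01-01" "02-01" "02-29" "12-31"
```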


On Tue, Mar 27, 2012 at 11:12 AM, Sam Albers tonightstheni...@gmail.com wrote:


Hello,

I am having trouble figuring out how to convert a Day of Year integer
back into a Date format. For example I have the following:

date <-
c('2008-01-01','2008-01-02','2008-01-03','2008-01-04','2008-01-05','2008-01-06','2008-01-07',
'2008-01-08','2008-01-09','2008-01-10','2008-01-11','2008-01-12','2008-01-13','2008-01-14','2008-01-15',
'2008-01-16','2008-01-17','2008-01-18','2008-01-19','2008-01-20','2008-01-21','2008-01-22','2008-01-23')

## this is then converted into a number corresponding to the day of
## the year like so:

dayofyear <- strptime(date, format="%Y-%m-%d")$yday + 1

## Now my question is how do I get back to a date format (obviously
## omitting the year).
## The end result is that I'd like to be able to have axis labels as
## something like Month-Day or just Month
## instead of just integers, which aren't always intuitive for people,
## but I can't seem to figure out how to tell R
## to recognize an integer as a date.

Any suggestions?

Many thanks in advance!

Sam




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[R] survplot function

2012-03-27 Thread Thorsten Raff
Dear R-helpers

I am wondering if there is an option to the survplot function in the design 
package that allows for drawing Kaplan-Meier plots starting from 0 instead of 
1, similar to fun = 'event' in the standard plotting function used on a 
survfit object.
I apologize in advance if I have missed any obvious sources of information, but 
I really didn't find anything in the documentation.

Best regards

Thorsten Raff

-- 
Thorsten Raff
2nd Medical Department,
University Hospital Schleswig-Holstein, Campus Kiel
Chemnitzstraße 33
24116 Kiel
GERMANY

phone: +49 431 1697-5234
fax:   +49 431 1697-1264

email: t.raffatmed2.uni-kiel.de
web:   www.uk-sh.de



[R] How to change the color of tcltk widget background color

2012-03-27 Thread mrzung
Hi, I'm a beginner with the tcltk package.
I'm making a GUI for some functions and want to change the background
color, which is grey by default.
If anybody knows how to change it, please show me how to
do that.

Furthermore, is there a good manual for tcltk?

Thanks.

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-change-the-color-of-tcltk-widget-background-color-tp4509989p4509989.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] What error distribution should I use?

2012-03-27 Thread Ben Bolker
Lívia Dorneles Audino livia.audino at gmail.com writes:

 
 I'm trying to fit a GLMM to identify the relationship between insect
 species richness and fragment size, isolation and time (different years).
 I already tried to analyse it using a Poisson error distribution, but I
 always get the following warning:
 *glm.fit: fitted probabilities numerically 0 or 1 occurred *
 
 This is probably happening because my dataset has a lot of zeros. So, what
 error distribution should I use?
 

  I know you haven't gotten a lot of help on r-sig-mixed-models (sorry),
but it would probably be better to post this question there.  The answer
is that this is a warning, not an error, so it indicates a need for
caution but not necessarily that anything is wrong.  In this case,
an internal call to glm.fit() has difficulty when it tries to fit
a subset of that data that are all-zero or all-one.  It's quite possibly
OK, provided that you've looked at your results, plotted predicted values,
etc., and everything seems to make sense.



Re: [R] Year of data collection for 'diamonds' dataset in ggplot2

2012-03-27 Thread Hadley Wickham
I believe it was 2008.
Hadley

On Mon, Mar 26, 2012 at 11:46 AM, Marina Doucerain
marinadoucer...@gmail.com wrote:
 Hello,

 I'm wondering what was the year (or year range) of collection for the data
 included in the 'diamonds' dataset in ggplot2.
 This information would be very helpful in interpreting the 'price' variable.

 Thank you!

 Marina Doucerain




-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [R] calling java from R and using java time series double precision array

2012-03-27 Thread Hurr
I solved some of my problems, but the one that remains is 
that reading the two-dimensional arrays into R transposes the matrix. 
The arrays I want to read are multiple time series at unequal intervals, with
the first column being the times, which are converted in Java from 
calendar CnYrMoDaHrMnScDCMQ or CnYrMoDa (and formats in between) to linear. 
Which R functions should I use to plot and analyse this kind of time series?
How can I prevent the transpose?
THIS IS THE JAVA CODE: 
public class Transf2R { 
  Transf2R transf2R; 
  public static void main(String[] args) { 
    Transf2R transf2R = new Transf2R(); transf2R.transf2R = transf2R;
    transf2R.transf2R.main2(); 
  } 
  public static void main2() { double[][] arRet = arReturnMethod(); 
    for (int i = 0; i < 9; i++) {
      for (int j = 0; j < 3; j++) System.out.print((int) con2Arr[i][j] + ",");
      System.out.println();
    } 
    for (int i = 0; i < 9; i++) {
      for (int j = 0; j < 3; j++) System.out.print((int) arRet[i][j] + ",");
      System.out.println();
    } 
  } 
  public final static double con0dbl = 10001; 
  public final static double[] con1Vec = new double[] { 10001,10002,10003,10004,10005,10006 }; 
  public final static double[][] con2Arr = new double[][] { 
    { 10001,10002,10003 },{ 20001,20002,20003 },{ 30001,30002,30003 },{ 40001,40002,40003 } 
   ,{ 50001,50002,50003 },{ 60001,60002,60003 },{ 70001,70002,70003 },{ 80001,80002,80003 } 
   ,{ 90001,90002,90003 } 
  }; 
  public final static double[][] arReturnMethod() { 
    double[][] retArr = new double[9][3];
    for (int i = 0; i < 9; i++) for (int j = 0; j < 3; j++) retArr[i][j] = (i+1)*1000 + j + 1;
    return(retArr); 
  } 
  public final static double[][] dbl2DimArRet4R(double[][] dbl2DimAr4R) { return(dbl2DimAr4R); } 
  public final static double[] dbl1DimVcRet4R(double[] dbl1DimVc4R) { return(dbl1DimVc4R); } 
  public final static double dblRet4R(double dbl4R) { return(dbl4R); } 
  public final static double dblNum4R = Math.PI; 
}
WHICH PRODUCES THIS UPON RUNNING MAIN(): 
10001,10002,10003,
20001,20002,20003,
30001,30002,30003,
40001,40002,40003,
50001,50002,50003,
60001,60002,60003,
70001,70002,70003,
80001,80002,80003,
90001,90002,90003,
1001,1002,1003,
2001,2002,2003,
3001,3002,3003,
4001,4002,4003,
5001,5002,5003,
6001,6002,6003,
7001,7002,7003,
8001,8002,8003,
9001,9002,9003,
I FINALLY FIGURED OUT SOME R CODE THAT DEMONSTRATES WHAT I WANT TO DO: 
 library(rJava) # loads package 
 .jinit()   # starts JVM 
[1] 0
 .jaddClassPath("C:/ad/j")
 print(.jclassPath())
[1] "C:\\Users\\ENVY17\\Documents\\R\\win-library\\2.13\\rJava\\java"
[2] "C:\\ad\\j"
 trnsfer2R <- .jnew("Transf2R") # creates link to java class 
 arj9x3Ret <-
 sapply(.jcall(trnsfer2R,returnSig="[[D","arReturnMethod"),.jevalArray)
 print(arj9x3Ret) # note: row and column indices get interchanged 
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1001 2001 3001 4001 5001 6001 7001 8001 9001
[2,] 1002 2002 3002 4002 5002 6002 7002 8002 9002
[3,] 1003 2003 3003 4003 5003 6003 7003 8003 9003
 dblNum <- .jcall(trnsfer2R,returnSig="D","dblRet4R",trnsfer2R$dblNum4R)
 print(dblNum,digits=20) 
[1] 3.141592653589793116
 # con1Vec is java one dim array of double precision constants 
 conn1Vec <-
 .jcall(trnsfer2R,returnSig="[D","dbl1DimVcRet4R",trnsfer2R$con1Vec)
 print(conn1Vec) 
[1] 10001 10002 10003 10004 10005 10006
 conn2Arr <-
 sapply(.jcall(trnsfer2R,returnSig="[[D","dbl2DimArRet4R",.jfield(trnsfer2R,
 "[[D", "con2Arr", convert=F)),.jevalArray) 
 print(conn2Arr) 
      [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9]
[1,] 10001 20001 30001 40001 50001 60001 70001 80001 90001
[2,] 10002 20002 30002 40002 50002 60002 70002 80002 90002
[3,] 10003 20003 30003 40003 50003 60003 70003 80003 90003
 
BUT THE TWO-DIMENSIONAL ARRAYS SEEM TO BE TRANSPOSED.
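
The transposition most likely comes from sapply() rather than from Java:
sapply() turns each list element (here, each Java row) into a column of the
result. The sketch below uses a plain R list as a stand-in for what .jcall()
returns, so it runs without a JVM; the names java_rows, as_cols, as_rows and
bound are made up for illustration.

```r
# Stand-in for the list of Java rows handed back to R
java_rows <- list(c(1001, 1002, 1003),
                  c(2001, 2002, 2003),
                  c(3001, 3002, 3003),
                  c(4001, 4002, 4003))

as_cols <- sapply(java_rows, identity)   # each Java row becomes a column: 3 x 4
as_rows <- t(as_cols)                    # transpose back to the Java layout: 4 x 3
bound   <- do.call(rbind, java_rows)     # or bind the rows directly: 4 x 3

dim(as_cols)
dim(as_rows)
all(as_rows == bound)
```

So wrapping the existing sapply(...) call in t(), or replacing it with
do.call(rbind, lapply(...)), should give the row/column layout of the Java
array.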



--
View this message in context: 
http://r.789695.n4.nabble.com/calling-java-from-R-and-using-java-time-series-double-precision-array-tp4494581p4510410.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] What error distribution should I use?

2012-03-27 Thread Peter Ehlers

On 2012-03-27 15:11, Ben Bolker wrote:

Lívia Dorneles Audino livia.audino at gmail.com writes:



I'm trying to fit a GLMM to identify the relationship between insect
species richness and fragment size, isolation and time (different years).
I already tried to analyse it using a Poisson error distribution, but I
always get the following warning:
*glm.fit: fitted probabilities numerically 0 or 1 occurred *

This is probably happening because my dataset has a lot of zeros. So, what
error distribution should I use?



   I know you haven't gotten a lot of help on r-sig-mixed-models (sorry),
but it would probably be better to post this question there.  The answer
is that this is a warning, not an error, so it indicates a need for
caution but not necessarily that anything is wrong.  In this case,
an internal call to glm.fit() has difficulty when it tries to fit
a subset of that data that are all-zero or all-one.  It's quite possibly
OK, provided that you've looked at your results, plotted predicted values,
etc., and everything seems to make sense.



Livia:
 You might also find this quite extensive recent post from Ted Harding
informative:

  https://stat.ethz.ch/pipermail/r-help/2012-March/307352.html

Peter Ehlers



Re: [R] Rgdal package - get information

2012-03-27 Thread Michael Sumner
There is a mailing list R-Sig-Geo which is more appropriate for
questions about the rgdal and related packages.

If by "read the information GDType" you mean to get that "Int16"
description you can get it by delving into the attributes of the
GDALinfo return value, for example:

f <- system.file("pictures/erdas_spnad83.tif", package = "rgdal")[1]
attr(GDALinfo(f), "df")["GDType"]
  GDType
1   Byte

In your case that would be

attr(GDALinfo("MOD13Q1.A2001049.h13v11.005.2007002215512.250m_16_days_EVI.tif"),
"df")["GDType"]

If you just mean to read the data into R, then use readGDAL from the
rgdal package.  Extensions to this support that simplify some matters
are available in the raster package.

Cheers, Mike.

On Wed, Mar 28, 2012 at 3:40 AM, julio cesar oliveira oliveir...@ufv.br wrote:

 Hi,

 I used
 GDALinfo("MOD13Q1.A2001049.h13v11.005.2007002215512.250m_16_days_EVI.tif")
 and
 got the results:

 rows        10
 columns     11
 bands       1
 origin.x        150701.4
 origin.y        7744897
 res.x       250
 res.y       250
 ysign       -1
 oblique.x   0
 oblique.y   0
 driver      GTiff
 projection  +proj=utm +zone=23 +south +datum=WGS84 +units=m +no_defs
 file
  /MOD13Q1.A2001049.h13v11.005.2007002215512.250m_16_days_EVI.tif
 apparent band summary:
   *GDType*   Bmin  Bmax Bmean Bsd hasNoDataValue NoDataValue
 1  *Int16* -32768 32767     0   0          FALSE           0
 Metadata:
 AREA_OR_POINT=Point
 TIFFTAG_SOFTWARE=MODIS Reprojection Tool  v4.1 March 2009



 *How to read the information GDType?*


 Thanks,

 julio



-- 
Michael Sumner
Institute for Marine and Antarctic Studies, University of Tasmania
Hobart, Australia
e-mail: mdsum...@gmail.com



Re: [R] assigning vector or matrix sparsely (for use with mclapply)

2012-03-27 Thread ilai
It is (at least for me) really unclear what the problem is, or how
it's related to mclapply.
You say
 this works fine, except that what I want to get NA's in the return
 positions that were not recalculated.  then, I can write

  newdata$y <- ifelse ( is.na(olddata$y), mc.byselectrows( olddata,
 is.na(olddata$y), fun.calc.y ), olddata$y )

Why ???
Are you applying the function twice ?  Then why not simply
v1.1 <- mc.byselectrows( d, loc1, function(x) x[,2]^2 )
the second time ?

If the problem is in keeping track of which rows got calculated, why
not rename with the row.names omitted after mclapply (probably a good
idea anyway):

FUN.ON.ROWS <- function(.index, ...)
as.matrix(FUN(data.notdone[.index,], ...))
  soln <- mclapply( as.list(1:nrow(data.notdone)) , FUN.ON.ROWS, ... )
  rv <- do.call(rbind, soln)  ## omits naming.
  if (ncol(rv)==1){ rv <- as.vector(rv) ; names(rv) <- row.names(data.notdone) }
  else rownames(rv) <- row.names(data.notdone)
 rv
}

And finally, you don't even need row.names for c(v1,d[loc1,2])

Or am I missing something here ?

BTW your code uses cat.stderr (which is local ? ) instead of cat, and
has no call to multicore.

Cheers




On Mon, Mar 26, 2012 at 4:28 PM, ivo welch ivo.we...@gmail.com wrote:
 Dear R wizards---

 I have a wrapper on mclapply() that makes it a little easier for me to
 do multiprocessing.  (Posting this may make life easier for other
 googlers.)  I pass a data frame, a vector that tells me what rows
 should be recomputed, and the function; and I get back a vector or
 matrix of answers.

   d <- data.frame( id=1:6, val=11:16 )
   loc <- c(TRUE,TRUE,FALSE,TRUE,FALSE,TRUE)
   v1 <- mc.byselectrows( d, loc, function(x) x[,2]^2 )
   v2 <- mc.byselectrows(d, loc, function(x) cbind(x[,2]^2,x[,2]^3))

 mc.byselectrows <- function(data.in, recalclist, FUN, ...) {

   data.notdone <- data.in[recalclist,]
   cat.stderr("[mc.byselectrows: ", nrow(data.notdone), "rows to be
 recomputed out of", nrow(data.in), "]\n")

   FUN.ON.ROWS <- function(.index, ...)
 as.matrix(FUN(data.notdone[.index,], ...))
   soln <- mclapply( as.list(1:nrow(data.notdone)) , FUN.ON.ROWS, ... )
   rv <- do.call(rbind, soln)  ## omits naming.
   if (ncol(rv)==1) rv <- as.vector(rv)
   rv
 }

 this works fine, except that what I want to get NA's in the return
 positions that were not recalculated.  then, I can write

  newdata$y <- ifelse ( is.na(olddata$y), mc.byselectrows( olddata,
 is.na(olddata$y), fun.calc.y ), olddata$y )

 I can do this very inelegantly, of course.  I can merge recalclist
 into data.in and then write a loop that substitutes for the do.call to
 rbind.  yikes.  or I could do the recalclist contingency inside the
 FUN.ON.ROWS, but this is costly in terms of execution time.  are there
 obvious solutions?  advice appreciated.

 regards,

 /iaw
 
 Ivo Welch (ivo.we...@gmail.com)



[R] Is it possible to de-select with sqlQuery from the RODBC library?

2012-03-27 Thread Eric Fail
Dear R-list,

I'm querying an MS Access database with the sqlQuery function from the RODBC 
library. As I cannot make a working example with a database, here is an 
illustrative example:

library(RODBC)
mdbConnect <- odbcConnectAccess("S:/data/ ... /databse.mdb")
data <- sqlQuery(mdbConnect, "select id, DOB, V1, V2, ..., V1009, V1011, V1013 from someTable")

I want everything in the table (someTable) except 'V1010' and 'V1012', but I 
can't figure out how to make a negative or reverse SQL select statement. I have 
a lot of tables like someTable, and each has two or three variables that I do 
not want R to fetch.

Is there a way to define a reverse select in SQL? One would imagine it would 
look something like this:

data <- sqlQuery(mdbConnect, "deselect V1010, V1012 from someTable")

Thanks,
Eric
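
As far as I know, Access SQL has no such "deselect", so one workaround is to
build the column list in R and paste it into an ordinary select. The sketch
below shows only the string handling, with a hard-coded, shortened column
vector as an assumption; in a real session the names could come from the
database itself, for example via RODBC's sqlColumns(). The names all_cols,
unwanted, keep and query are invented for illustration.

```r
# Hypothetical, shortened list of the columns in someTable
all_cols <- c("id", "DOB", "V1009", "V1010", "V1011", "V1012", "V1013")

# Columns that should not be fetched
unwanted <- c("V1010", "V1012")

# Keep everything else, preserving the original order
keep <- setdiff(all_cols, unwanted)

# Assemble a plain select statement to pass to sqlQuery()
query <- paste("select", paste(keep, collapse = ", "), "from someTable")
query
# [1] "select id, DOB, V1009, V1011, V1013 from someTable"
```

The alternative is to fetch the whole table and drop the unwanted columns in
R afterwards, which is simpler but transfers more data.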



Re: [R] assigning vector or matrix sparsely (for use with mclapply)

2012-03-27 Thread ivo welch
I wasn't thinking straight.

 old.data= 11:20
 recalc.please= (old.data%%2==0)
 old.data[recalc.please]
[1] 12 14 16 18 20
 new.data[recalc.please]= old.data[recalc.please]^2
Error in new.data[recalc.please] = old.data[recalc.please]^2 :
  object 'new.data' not found

# this is where I had given up, but the following works:

 new.data=old.data
 new.data[recalc.please]= old.data[recalc.please]^2
 new.data
 [1]  11 144  13 196  15 256  17 324  19 400

sorry, guys.
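
The same idea, applied to the wrapper from the original question, might look
like the sketch below: pre-fill the result with NA and assign only into the
selected positions. The function name recalc_selected is made up, it handles
only the vector-valued case, and it uses lapply() so that it runs anywhere;
swapping in mclapply() on a platform that supports forking is assumed to work
the same way, but that is not tested here.

```r
# Return a vector as long as nrow(data), NA where sel is FALSE
recalc_selected <- function(data, sel, FUN, ...) {
  out <- rep(NA_real_, nrow(data))
  # Apply FUN to each selected row, one row at a time
  res <- lapply(which(sel), function(i) FUN(data[i, , drop = FALSE], ...))
  out[sel] <- unlist(res)
  out
}

d   <- data.frame(id = 1:6, val = 11:16)
loc <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)
v1  <- recalc_selected(d, loc, function(x) x[, 2]^2)
v1
# [1] 121 144  NA 196  NA 256
```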

/iaw

Ivo Welch (ivo.we...@gmail.com)


On Tue, Mar 27, 2012 at 7:27 PM, ilai ke...@math.montana.edu wrote:
 [...]


[R] One last thing

2012-03-27 Thread Fretheim, Alexander H
Dear R,

 Thanks for helping me locate the source for the StructTS method from 
stats, but I've run into a roadblock in reverse engineering it to locate a 
formula for its forecasting, because it calls some compiled C code, a function 
called KalmanLike. I've looked through the R library in which the StructTS method 
code was located and could not find it.

 Sincerely,

  Alexander Fretheim



Re: [R] Zero inflated GAMM

2012-03-27 Thread Neil Collier
Bert,

Try posting on the R-sig-ME list for help with mixed models.

Cheers,

Neil


On Wed, Mar 28, 2012 at 1:16 AM, Bert Harris aramidop...@gmail.com wrote:

 Hi all,

 I am planning to get Zuur et al.'s new book when it comes out, but until
 then I was wondering if anyone could suggest examples of zero inflated or
 hurdle GAMMs. I have count data with many zeros, non-linear relationships,
 and site as a random effect.

 Thank you!
 Bert Harris, University of Adelaide



Re: [R] R extract parts

2012-03-27 Thread Rui Barradas
Hello,


 my idea is to get results like this:
 user, sector, source, destine, count, average
 7 1  22  22 4  186.25 # (109+100+214+322)
 7 2  161   97  1  68
 7 2  97   97  1  196
 7 2  97   22  1  427
 7 2  22   22  2  383 
 

Your second column, 'sector', comes from where? What is it?

Without it, try this.


text="
user pos communications source v_destine
7   1     109   22    22
7   2     100   22    22
7   3     214   22    22
7   4     322   22    22
7   5   69920   22   161
7   6      68  161    97
7   7     196   97    97
7   8     427   97    22
7   9     460   22    22
7  10     307   22    22
7  11    9582   22    22
7  12   55428   22    22
7  13    9192   22    22
7  14      19   22    22
"

df1 <- read.table(textConnection(text), header=TRUE)

inx <- df1$comm > 1000
comm1000 <- cumsum(inx)

result <- split(df1[!inx, ], list(comm1000[!inx], df1$source[!inx],
df1$v_destine[!inx]))
result <- sapply(result, function(x) c(x$user[1], x$source[1],
x$v_destine[1], nrow(x), mean(x$comm)))
result <- na.exclude(t(result))

rownames(result) <- 1:nrow(result)
colnames(result) <- c("user", "source", "v_destine", "count", "average")
attr(result, "na.action") <- NULL
attr(result, "class") <- NULL

result
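
The grouping step above rests on cumsum() of a logical vector: every value over the threshold bumps the running total, so rows between two large values share one block number. A stripped-down sketch of just that idea follows; the vector and the names big, block and means are made up for illustration.

```r
## Made-up values; anything above 1000 acts as a separator.
comm <- c(109, 100, 69920, 68, 9582, 19)

big   <- comm > 1000     # TRUE at the separator rows
block <- cumsum(big)     # running count of separators seen so far
print(block)

## Average the non-separator values within each block.
means <- tapply(comm[!big], block[!big], mean)
print(means)
```

Splitting additionally by source and v_destine, as in the code above, then separates runs that fall in the same block but differ in those columns.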


Hope this helps,

Rui Barradas


--
View this message in context: 
http://r.789695.n4.nabble.com/R-extract-parts-tp4509042p4510566.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] What error distribution should I use?

2012-03-27 Thread chuck.01
Could you please post a small example of your data and the code which gives
you this error?  Your assumed error distribution sounds reasonable.  I am
interested as to why you have zeros... do you have sites with species
richness == 0?
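
In the meantime, one quick check of whether the zeros are a problem for a Poisson model is to compare the observed share of zeros with the share the fitted model expects. This is only a rough sketch on made-up counts, with an intercept-only glm() standing in for the real model; the names richness, obs.zero and exp.zero are hypothetical.

```r
## Made-up species richness counts, for illustration only.
richness <- c(0, 0, 0, 0, 0, 0, 1, 2, 3, 5, 0, 4)

## Intercept-only Poisson fit as a stand-in for the real model.
fit <- glm(richness ~ 1, family = poisson)

obs.zero <- mean(richness == 0)             # observed share of zeros
exp.zero <- mean(dpois(0, fitted(fit)))     # share expected under the fit
print(c(observed = obs.zero, expected = exp.zero))
```

If the observed share is well above the expected one, that points towards zero inflation or overdispersion rather than a plain Poisson error.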



Lívia Dorneles Audino wrote
 
 I'm trying to fit a GLMM to identify the relationship between insect
 species richness and fragment size, isolation and time (different years).
 I already tried to analyse it using a Poisson error distribution, but I
 am always faced with the following warning:
 *glm.fit: fitted probabilities numerically 0 or 1 occurred*
 
 This is probably happening because my dataset has a lot of zeros. So, what
 error distribution should I use?
 
 -- 
 *Lívia *
 
 


--
View this message in context: 
http://r.789695.n4.nabble.com/What-error-distribution-should-I-use-tp4509479p4510351.html
Sent from the R help mailing list archive at Nabble.com.
