Re: [R] For Loop

2018-09-23 Thread Wensui Liu
Very insightful. Thanks, Duncan

In your opinion, is there any benefit to using parallelism in a
corporate computing environment where the data run to far more than a
million rows and the server has multiple cores?

Actually, whether or not to go concurrent matters for my production
tasks; it is not just an academic question.

Really appreciate your thoughts.

On Sun, Sep 23, 2018 at 2:42 PM Duncan Murdoch 
wrote:

> On 23/09/2018 3:31 PM, Jeff Newmiller wrote:
>
> [lots of good stuff deleted]
>
> > Vectorize is
> > syntactic sugar with a performance penalty.
>
> [More deletions.]
>
> I would say Vectorize isn't just "syntactic sugar".  When I use that
> term, I mean something that looks nice but is functionally equivalent.
>
> However, Vectorize() really does something useful:  some functions (e.g.
> outer()) take other functions as arguments, but they assume the argument
> is a vectorized function.  If it is not, they fail, or generate garbage
> results.  Vectorize() is designed to modify the interface to a function
> so it acts as if it is vectorized.
>
> The "performance penalty" part of your statement is true.  It will
> generally save some computing cycles to write a new function using a for
> loop instead of using Vectorize().  But that may waste some programmer
> time.
>
> Duncan Murdoch
> (writing as one of the authors of Vectorize())
>
> P.S. I'd give an example of syntactic sugar, but I don't want to bruise
> some other author's feelings :-).
>
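A minimal sketch of the point about outer() in the quoted message, assuming a made-up scalar-only function `bigger` (not from the thread). It is only meant to show why a non-vectorized function breaks outer() and how Vectorize() changes the interface:

```r
# A scalar-only function: `if` needs a single TRUE/FALSE,
# so it cannot take whole vectors of x and y at once.
bigger <- function(x, y) if (x > y) x else y

# outer() calls FUN once with long vectors, so the scalar version fails
# (an error in R >= 4.2.0, a warning plus wrong values in older versions).
res <- try(outer(1:3, 1:3, bigger), silent = TRUE)

# Vectorize() wraps the function in mapply(), so outer() gets what it expects.
vbigger <- Vectorize(bigger)
m <- outer(1:3, 1:3, vbigger)
m
```

For this particular function the built-in pmax() would do the same job faster, which is the "performance penalty" side of the trade-off.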


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] For Loop

2018-09-23 Thread Wensui Liu
What you measured is the "elapsed" time in the default setting. You
might need to take a closer look at the beautiful benchmark() function
and see which time I am talking about.

I just provided a tentative solution for the person asking for it and
believe he has enough wisdom to decide what's best. Why bother to
judge others subjectively?
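The disagreement here is about which timing column to compare. As a hedged illustration (the vector size below is an assumption chosen only to make the timing visible), base R's system.time() reports the same five quantities that rbenchmark prints:

```r
# system.time() returns a "proc_time" vector with five named entries.
# print() collapses them to user/system/elapsed; unclass() shows them all.
c1 <- seq_len(1e5)          # assumed size, chosen only for illustration
len <- length(c1)
tm <- system.time(for (k in 1:20) s <- log(c1[-1] / c1[-len]))
unclass(tm)

# user.self / sys.self:   CPU time used by this R process.
# user.child / sys.child: CPU time used by child processes, e.g. forked
#                         workers started by parallel::pvec on Unix-alikes
#                         (may be NA on Windows).
# elapsed:                wall-clock time, which is what a waiting user sees.
```

With forked workers, a low user.self can coexist with a high user.child and a longer elapsed time, so the columns need to be read together.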
On Sun, Sep 23, 2018 at 1:18 PM Ista Zahn  wrote:
>
> On Sun, Sep 23, 2018 at 1:46 PM Wensui Liu  wrote:
> >
> > actually, by the parallel pvec, the user time is a lot shorter. or did
> > I somewhere miss your invaluable insight?
> >
> > > c1 <- 1:100
> > > len <- length(c1)
> > > rbenchmark::benchmark(log(c1[-1]/c1[-len]), replications = 100)
> >                   test replications elapsed relative user.self sys.self
> > 1 log(c1[-1]/c1[-len])          100   4.617        1     4.484    0.133
> >   user.child sys.child
> > 1          0         0
> > > rbenchmark::benchmark(pvec(1:(len - 1), mc.cores = 4, function(i) 
> > > log(c1[i + 1] / c1[i])), replications = 100)
> >                                                                test
> > 1 pvec(1:(len - 1), mc.cores = 4, function(i) log(c1[i + 1]/c1[i]))
> >   replications elapsed relative user.self sys.self user.child sys.child
> > 1          100   9.079        1     2.571    4.138      9.736     8.046
>
> Your output is mangled in my email, but on my system your pvec
> approach takes more than twice as long:
>
> c1 <- 1:100
> len <- length(c1)
> library(parallel)
> library(rbenchmark)
>
> regular <- function() log(c1[-1]/c1[-len])
> iterate.parallel <- function() {
>   pvec(1:(len - 1), mc.cores = 4,
>        function(i) log(c1[i + 1] / c1[i]))
> }
>
> benchmark(regular(), iterate.parallel(),
>   replications = 100,
>   columns = c("test", "elapsed", "relative"))
> ##                 test elapsed relative
> ## 2 iterate.parallel()   7.517    2.482
> ## 1          regular()   3.028    1.000
>
> Honestly, just use log(c1[-1]/c1[-len]). The code is simple and easy
> to understand and it runs pretty fast. There is usually no reason to
> make it more complicated.
> --Ista
>
> > On Sun, Sep 23, 2018 at 12:33 PM Ista Zahn  wrote:
> > >
> > > On Sun, Sep 23, 2018 at 10:09 AM Wensui Liu  wrote:
> > > >
> > > > Why?
> > >
> > > The operations required for this algorithm are vectorized, as are most
> > > operations in R. There is no need to iterate through each element.
> > > Using Vectorize to achieve the iteration is no better than using
> > > *apply or a for-loop, and betrays the same basic lack of insight into
> > > basic principles of programming in R.
> > >
> > > And/or, if you want a more practical reason:
> > >
> > > > c1 <- 1:100
> > > > len <- 100
> > > > system.time( s1 <- log(c1[-1]/c1[-len]))
> > >    user  system elapsed
> > >   0.031   0.004   0.035
> > > > system.time(s2 <- Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))
> > >    user  system elapsed
> > >   1.258   0.022   1.282
> > >
> > > Best,
> > > Ista
> > >
> > > >
> > > > On Sun, Sep 23, 2018 at 7:54 AM Ista Zahn  wrote:
> > > >>
> > > >> On Sat, Sep 22, 2018 at 9:06 PM Wensui Liu  wrote:
> > > >> >
> > > >> > or this one:
> > > >> >
> > > >> > (Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))
> > > >>
> > > >> Oh dear god no.
> > > >>
> > > >> >
> > > >> > On Sat, Sep 22, 2018 at 4:16 PM rsherry8  
> > > >> > wrote:
> > > >> > >
> > > >> > >
> > > >> > > It is my impression that good R programmers make very little use 
> > > >> > > of the
> > > >> > > for statement. Please consider  the following
> > > >> > > R statement:
> > > >> > >  for( i in 1:(len-1) )  s[i] = log(c1[i+1]/c1[i], base = 
> > > >> > > exp(1) )
> > > >> > > One problem I have found with this statement is that s must exist 
> > > >> > > before
> > > >> > > the statement is run. Can it be written without using a for
> > > >> > > loop? Would that be better?
> > > >> > >
> > > >> > > Thanks,
> > > > >> > > Bob

Re: [R] For Loop

2018-09-23 Thread Wensui Liu
Actually, with the parallel pvec, the user time is a lot shorter. Or did
I somehow miss your invaluable insight?

> c1 <- 1:100
> len <- length(c1)
> rbenchmark::benchmark(log(c1[-1]/c1[-len]), replications = 100)
                  test replications elapsed relative user.self sys.self
1 log(c1[-1]/c1[-len])          100   4.617        1     4.484    0.133
  user.child sys.child
1          0         0
> rbenchmark::benchmark(pvec(1:(len - 1), mc.cores = 4,
+     function(i) log(c1[i + 1] / c1[i])), replications = 100)
                                                               test
1 pvec(1:(len - 1), mc.cores = 4, function(i) log(c1[i + 1]/c1[i]))
  replications elapsed relative user.self sys.self user.child sys.child
1          100   9.079        1     2.571    4.138      9.736     8.046
On Sun, Sep 23, 2018 at 12:33 PM Ista Zahn  wrote:
>
> On Sun, Sep 23, 2018 at 10:09 AM Wensui Liu  wrote:
> >
> > Why?
>
> The operations required for this algorithm are vectorized, as are most
> operations in R. There is no need to iterate through each element.
> Using Vectorize to achieve the iteration is no better than using
> *apply or a for-loop, and betrays the same basic lack of insight into
> basic principles of programming in R.
>
> And/or, if you want a more practical reason:
>
> > c1 <- 1:100
> > len <- 100
> > system.time( s1 <- log(c1[-1]/c1[-len]))
>    user  system elapsed
>   0.031   0.004   0.035
> > system.time(s2 <- Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))
>    user  system elapsed
>   1.258   0.022   1.282
>
> Best,
> Ista
>
> >
> > On Sun, Sep 23, 2018 at 7:54 AM Ista Zahn  wrote:
> >>
> >> On Sat, Sep 22, 2018 at 9:06 PM Wensui Liu  wrote:
> >> >
> >> > or this one:
> >> >
> >> > (Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))
> >>
> >> Oh dear god no.
> >>
> >> >
> >> > On Sat, Sep 22, 2018 at 4:16 PM rsherry8  wrote:
> >> > >
> >> > >
> >> > > It is my impression that good R programmers make very little use of the
> >> > > for statement. Please consider  the following
> >> > > R statement:
> >> > >  for( i in 1:(len-1) )  s[i] = log(c1[i+1]/c1[i], base = 
> >> > > exp(1) )
> >> > > One problem I have found with this statement is that s must exist 
> >> > > before
> >> > > the statement is run. Can it be written without using a for
> >> > > loop? Would that be better?
> >> > >
> >> > > Thanks,
> >> > > Bob
> >> > >


Re: [R] For Loop

2018-09-22 Thread Wensui Liu
or this one:

(Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))
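For comparison, a small sketch of the loop from the question next to the Vectorize() form and the plain vectorized forms. The data are made up; note also that indexing over 1:len, as in the one-liner above, reaches c1[len + 1], which is NA, so the sketch uses 1:(len - 1):

```r
c1 <- c(100, 102, 101, 105, 110)   # made-up example data
len <- length(c1)

# Original loop: s must be allocated before the loop runs.
s_loop <- numeric(len - 1)
for (i in 1:(len - 1)) s_loop[i] <- log(c1[i + 1] / c1[i])

# Vectorize() version, indexed over 1:(len - 1) rather than 1:len,
# because c1[len + 1] is NA and would add a trailing NA.
s_vec <- Vectorize(function(i) log(c1[i + 1] / c1[i]))(1:(len - 1))

# Plain vectorized forms: no loop and no preallocation.
s_plain <- log(c1[-1] / c1[-len])
s_diff  <- diff(log(c1))
```

All four give the same numbers; the last two rely only on R's built-in vectorized arithmetic.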

On Sat, Sep 22, 2018 at 4:16 PM rsherry8  wrote:
>
>
> It is my impression that good R programmers make very little use of the
> for statement. Please consider  the following
> R statement:
>  for( i in 1:(len-1) )  s[i] = log(c1[i+1]/c1[i], base = exp(1) )
> One problem I have found with this statement is that s must exist before
> the statement is run. Can it be written without using a for
> loop? Would that be better?
>
> Thanks,
> Bob
>


Re: [R] For Loop

2018-09-22 Thread Wensui Liu
another version just for fun

s <- parallel::pvec(1:len, function(i) log(c1[i + 1] / c1[i]))
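A hedged, runnable variant of the pvec() call, assuming made-up data and a core count picked only for illustration. It indexes over 1:(len - 1), since with 1:len the last element would be computed from c1[len + 1], which is NA:

```r
library(parallel)

c1 <- c(100, 102, 101, 105, 110)   # made-up example data
len <- length(c1)

# pvec() relies on forking, so more than one core only works on Unix-alikes.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Each worker receives a chunk of the index vector and must return a
# vector of the same length as its chunk.
s <- pvec(1:(len - 1), function(i) log(c1[i + 1] / c1[i]), mc.cores = cores)
```

For a computation this cheap, the cost of forking and collecting results can easily exceed the work itself.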
On Sat, Sep 22, 2018 at 4:16 PM rsherry8  wrote:
>
>
> It is my impression that good R programmers make very little use of the
> for statement. Please consider  the following
> R statement:
>  for( i in 1:(len-1) )  s[i] = log(c1[i+1]/c1[i], base = exp(1) )
> One problem I have found with this statement is that s must exist before
> the statement is run. Can it be written without using a for
> loop? Would that be better?
>
> Thanks,
> Bob
>


Re: [R] Best R GUIs

2017-12-13 Thread Wensui Liu
How could you miss Emacs + ESS?

On Wed, Dec 13, 2017 at 5:04 AM, Juan Telleria  wrote:

> Dear R Community Members,
>
> I would like to add to one article I have written the best Graphical User
> Interfaces the R programming language has.
>
> For the moment I know:
> A) Rstudio.
> B) R Tools for Visual Studio.
> C) Open Analytics Architect.
>
> Are there others worth to mention?
>
> Thank you.
>
> Kind regards,
> Juan Telleria
>


Re: [R] M-x R gives no choice of starting dir

2017-09-11 Thread Wensui Liu
I am using Emacs on Ubuntu and have no such issue.

On Mon, Sep 11, 2017 at 10:31 AM, Christian  wrote:

> Hi,
>
> I experienced a sudden change in the behavior of M-x R in not giving me
> the choice where to start R. May be that I botched my preferences. I am
> using Aquamacs 3.3 on MacOS 10.12.6
>
> Christian
> --
> Christian Hoffmann
> Rigiblickstrasse 15b
> 
> CH-8915 Hausen am Albis
> Switzerland
> Telefon +41-(0)44-7640853
>


Re: [R] Using R and Python together

2017-03-31 Thread Wensui Liu
In https://statcompute.wordpress.com/?s=rpy2, you can find examples of rpy2.

In https://statcompute.wordpress.com/?s=pyper, you can find examples of pyper.

On Fri, Mar 31, 2017 at 11:38 AM, Kankana Shukla  wrote:
> I'm not great at rpy2.  Are there any good examples I could see to learn
> how to do that?  My R code is very long and complicated.
>
> On Fri, Mar 31, 2017 at 7:08 AM, Stefan Evert 
> wrote:
>
>>
>> > On 30 Mar 2017, at 23:37, Kankana Shukla  wrote:
>> >
>> > I have searched for examples using R and Python together, and rpy2 seems
>> > like the way to go, but is there another (easier) way to do it?
>>
>> Rpy2 would seem to be a very easy and convenient solution.  What do you
>> need that can't easily be done with rpy2?
>>
>> Best regards,
>> Stefan
>


Re: [R] Using R and Python together

2017-03-30 Thread Wensui Liu
How about pyper?

On Thu, Mar 30, 2017 at 10:42 PM Kankana Shukla 
wrote:

> Hello,
>
> I am running a deep neural network in Python.  The input to the NN is the
> output from my R code. I am currently running the python script and calling
> the R code using a subprocess call, but this does not allow me to
> recursively change (increment) parameters used in the R code that would be
> the inputs to the python code.  So in short, I would like to follow this
> automated process:
>
>    1. Parameters used in R code generate output
>    2. This output is input to Python code
>    3. If output of Python code > x,  stop
>    4. Else, increment parameters used as input in R code (step 1) and
>       repeat all steps
>
> I have searched for examples using R and Python together, and rpy2 seems
> like the way to go, but is there another (easier) way to do it?  I would
> highly appreciate the help.
>
> Thanks in advance,
>
> Kankana
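The four steps in the question can be sketched as a plain loop driven from R. This is only an outline under stated assumptions: `run_r_model`, `score_in_python` and `search_params` are hypothetical names, and both model functions are trivial stand-ins so the loop logic can be run as is.

```r
# Step 1 (hypothetical stand-in for the real R code).
run_r_model <- function(param) param^2

# Step 2 (hypothetical stand-in for the Python network). In practice this
# function could call Python, e.g. through system2() or an interface package.
score_in_python <- function(r_output) sqrt(r_output) / 10

search_params <- function(start, step, x, max_iter = 100) {
  param <- start
  for (iter in seq_len(max_iter)) {
    r_output <- run_r_model(param)          # step 1
    score <- score_in_python(r_output)      # step 2
    if (score > x) {                        # step 3: stop
      return(list(param = param, score = score, iterations = iter))
    }
    param <- param + step                   # step 4: increment and repeat
  }
  stop("no parameter value exceeded the threshold within max_iter iterations")
}

result <- search_params(start = 1, step = 1, x = 0.45)
```

Keeping the Python call behind a single function means the loop does not change whether that call ends up going through a subprocess, rpy2 in the other direction, or pyper.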
>


Re: [R] Run a Python code from R

2016-11-16 Thread Wensui Liu
Take a look at the rPython or rPithon package.

On Wed, Nov 16, 2016 at 4:53 PM, Nelly Reduan  wrote:
> Hello,
>
>
> How can I run this Python code from R ?
>
>
> import nlmpy
> nlm = nlmpy.mpd(nRow=50, nCol=50, h=0.75)
> nlmpy.exportASCIIGrid("raster.asc", nlm)
>
>
> Nlmpy is a Python package to build neutral landscape models
>
> https://pypi.python.org/pypi/nlmpy . The example comes from this website. I 
> tried to use the function system2 but I don't know how to use it.
>
>
> path_script_python <- "C:/Users/Anaconda2/Lib/site-packages/nlmpy/nlmpy.py"
>
> test <- system2("python", args = c(path_script_python, as.character(nRow), 
> as.character(nCol), as.character(h)))
>
> Thanks a lot for your help.
> Nell
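On the system2() attempt in the question: the call there passes the package's own module file as the script, whereas the three Python lines need to live in a script of their own. A minimal sketch, assuming a Python interpreter is on the PATH and nlmpy is installed in it; `run_python` is a hypothetical helper, not part of any package:

```r
# The three Python lines from the question, saved as their own script.
py_code <- c(
  "import nlmpy",
  "nlm = nlmpy.mpd(nRow=50, nCol=50, h=0.75)",
  "nlmpy.exportASCIIGrid('raster.asc', nlm)"
)

# Hypothetical helper: write Python code to a temporary file and run it.
run_python <- function(code, python = Sys.which("python")) {
  if (!nzchar(python)) stop("no Python interpreter found on the PATH")
  script <- tempfile(fileext = ".py")
  writeLines(code, script)
  on.exit(unlink(script))
  # stdout = TRUE captures what the script prints as a character vector.
  system2(python, args = shQuote(script), stdout = TRUE)
}

# run_python(py_code)   # needs Python with nlmpy installed
```

The output file raster.asc is written relative to R's current working directory, since the Python process inherits it.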
>
>
>
>
>


Re: [R] Logistic Regression

2016-01-25 Thread Wensui Liu
But beta regression can only model responses in the open interval between zero and one.
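A small illustration of the boundary issue, using the beta density from base R with arbitrarily chosen shape parameters (an assumption for the example only):

```r
# Beta log-density at interior and boundary values (shapes chosen arbitrarily).
y <- c(0, 0.25, 0.5, 1)
ll <- dbeta(y, shape1 = 2, shape2 = 3, log = TRUE)
ll
# The entries for y = 0 and y = 1 are -Inf, so a likelihood-based fit
# cannot use observations that sit exactly on the boundary.
```

Proportions of exactly 0 or 1 therefore need separate handling before a beta model can be fitted.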

On Monday, January 25, 2016, Greg Snow <538...@gmail.com> wrote:

> Do you have the sample sizes that the sample proportions were computed
> from (e.g. 0.5 could be 1 out of 2 or 100 out of 200)?
>
> If you do then you can specify the model with the proportions as the y
> variable and the corresponding sample sizes as the weights argument to
> glm.
>
> If you only have proportions without an integer sample size then you
> may want to switch to using beta regression instead of logistic
> regression.
>
> On Sat, Jan 23, 2016 at 1:41 PM, pari hesabi <statistic...@hotmail.com> wrote:
> > Hello everybody,
> >
> > I am trying to fit a logistic regression model by using glm() function
> > in R. My response variable is a sample proportion NOT binary numbers(0,1).
> >
> > Regarding glm() function, I receive this error:  non integer # successes
> > in a binomial glm!
> >
> > I would appreciate if anybody conducts me.
> >
> >
> > Regards,
> >
> > Pari
> >
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
>


-- 
WenSui Liu
https://statcompute.wordpress.com/



Re: [R] Logistic Regression

2016-01-23 Thread Wensui Liu
With glm(), you might try the quasibinomial family.
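A hedged sketch of the suggestion on simulated data (all values below are made up). It also shows the alternative for the case where the sample sizes behind the proportions are known, which goes through the weights argument:

```r
set.seed(1)
# Simulated data (an assumption, for illustration only).
n_trials <- sample(20:60, 50, replace = TRUE)
x <- runif(50)
successes <- rbinom(50, size = n_trials, prob = plogis(-1 + 2 * x))
prop <- successes / n_trials

# Proportions only: quasibinomial accepts non-integer responses in [0, 1]
# and estimates a dispersion parameter.
fit_quasi <- glm(prop ~ x, family = quasibinomial())

# Proportions plus known sample sizes: binomial family with weights.
fit_binom <- glm(prop ~ x, family = binomial(), weights = n_trials)
```

The quasibinomial fit gives the same kind of coefficients as a logistic regression, but its standard errors are scaled by the estimated dispersion and no AIC is available.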

On Saturday, January 23, 2016, pari hesabi <statistic...@hotmail.com> wrote:

> Hello everybody,
>
> I am trying to fit a logistic regression model by using glm() function in
> R. My response variable is a sample proportion NOT binary numbers(0,1).
>
> Regarding glm() function, I receive this error:  non integer # successes
> in a binomial glm!
>
> I would appreciate if anybody conducts me.
>
>
> Regards,
>
> Pari
>
>


-- 
WenSui Liu
https://statcompute.wordpress.com/



Re: [R] Use SQL in R environment

2016-01-15 Thread Wensui Liu
Check sqldf
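A minimal sketch of what that looks like, assuming the sqldf package from CRAN is installed and using the built-in mtcars data as the table:

```r
# sqldf is a CRAN package; install.packages("sqldf") if it is missing.
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  # Data frames in the workspace can be queried as if they were tables.
  avg_mpg <- sqldf(
    "SELECT cyl, AVG(mpg) AS avg_mpg FROM mtcars GROUP BY cyl ORDER BY cyl"
  )
  print(avg_mpg)
}
```

The result is an ordinary data frame, so it can be passed straight to other R functions, much as a PROC SQL step creates a data set.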

On Friday, January 15, 2016, Amoy Yang via R-help <r-help@r-project.org>
wrote:

>  Hi All,
> I am new here and a beginner for R. Can I use SQL procedure in R
> environment as it can be done in SAS starting with PROC SQL;
> Thanks for helps!
>
> Amoy



-- 
WenSui Liu
https://statcompute.wordpress.com/



[R] How to calculate the prediction interval without knowing the functional form

2016-01-03 Thread Wensui Liu
If I have predictions derived empirically without knowing the functional
form, is there a way to calculate the prediction interval?

Thanks


-- 
WenSui Liu
https://statcompute.wordpress.com/



Re: [R] R wont accept my zero count values in the GLM with quasi_poisson dsitribution

2015-09-08 Thread Wensui Liu
Based on your code "fit <-
glm(abundance~Gender,data=teminfest,family=binomial())", I don't see
anything related to quasi-Poisson. Are you sure this is the model you
meant to fit?
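For reference, a sketch of the call with the quasi-Poisson family actually specified. The data frame below is simulated (the column names mirror the question, the values are made up), so it only shows that zero counts are accepted:

```r
set.seed(42)
# Simulated overdispersed counts with many zeros (illustration only).
teminfest <- data.frame(
  Gender = factor(rep(c("F", "M"), each = 40)),
  abundance = rnbinom(80, size = 0.5, mu = rep(c(2, 5), each = 40))
)

# quasipoisson handles zero counts; binomial() requires 0 <= y <= 1,
# which is what triggered the error in the question.
fit <- glm(abundance ~ Gender, data = teminfest, family = quasipoisson())
summary(fit)$dispersion
```

A dispersion estimate well above 1 is the usual sign that a plain Poisson model would understate the standard errors.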

On Tue, Jul 28, 2015 at 1:33 AM, Charlotte
<charlotte.hu...@griffithuni.edu.au> wrote:
> Hello
>
> I have count values for abundance which follow a pattern of over-dispersal
> with many zero values.  I have read a number of documents which suggest that
> I don't use data transforming methods but rather than I run the GLM with the
> quasi poisson distribution.  So I have written my script and R is telling me
> that Y should be more than 0.
>
> Everything I read tells me to do it this way but I can't get R to agree.
> Did I need to add something else to my script to get it to work and keep my
> data untransformed? The script I wrote is as follows:
>
>> fit <- glm(abundance~Gender,data=teminfest,family=binomial())
>
> then I get this error
> Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1
>
> I don't use R a lot so I am having trouble figuring out what to do next.
>
> I would appreciate some help
>
> Many Thanks
> Charlotte
>
>
>
>
>
>



-- 
WenSui Liu
https://statcompute.wordpress.com/



[R] VIF threshold implying multicollinearity

2015-07-26 Thread Wensui Liu
Dear All
I have a general question about VIF.
While there are multiple rules of thumb about the threshold value of
VIF, e.g. 4 or 10, implying multicollinearity, I am wondering if
anyone can point me to some literature supporting these rules of
thumb.

Thank you so much!
wensui
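For context on what the thresholds mean numerically: the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors, so a VIF of 4 corresponds to R_j^2 = 0.75 and a VIF of 10 to R_j^2 = 0.90. A sketch computing it by hand; `vif_by_hand` is a hypothetical helper and the choice of mtcars columns is arbitrary:

```r
# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j
# on all the other predictors.
vif_by_hand <- function(X) {
  X <- as.data.frame(X)
  sapply(names(X), function(v) {
    r2 <- summary(lm(reformulate(setdiff(names(X), v), response = v),
                     data = X))$r.squared
    1 / (1 - r2)
  })
}

# Example with built-in data; the choice of predictors is arbitrary.
vifs <- vif_by_hand(mtcars[, c("disp", "hp", "wt", "drat")])
vifs
```

This does not settle which cutoff is justified; it only makes explicit what a given cutoff implies about the shared variance among predictors.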



Re: [R] model non-integer count outcomes

2015-07-22 Thread Wensui Liu
Thanks Thierry
What if I don't know the n in the offset term?
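A hedged sketch of both situations on simulated data (all values are made up). One detail worth noting about the quoted glm() call: with the Poisson family's default log link, the offset normally enters on the log scale, i.e. offset(log(n)) rather than offset(n). When n is unknown, one possible option, offered here as an assumption rather than as advice from the thread, is to model the averages directly with a quasi-Poisson family:

```r
set.seed(7)
# Simulated data (illustration only): n observations behind each average.
n <- sample(5:30, 60, replace = TRUE)
x <- runif(60)
total <- rpois(60, lambda = n * exp(0.3 + 0.8 * x))
average <- total / n

# n known: model the totals; with the log link the offset is log(n).
fit_offset <- glm(total ~ x + offset(log(n)), family = poisson())

# n unknown: one option is to model the averages directly with
# quasipoisson, which accepts non-integer responses.
fit_quasi <- glm(average ~ x, family = quasipoisson())
```

Without n, the second fit cannot weight averages by how many counts they summarize, so it is less efficient than the offset model.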

On Wednesday, July 22, 2015, Thierry Onkelinx thierry.onkel...@inbo.be
wrote:

 If you know the number of counts (n) used to calculate the average then you
 can still use a poisson distribution.

 Total = average * n
 glm(total ~ offset(n), family = poisson)

 ​
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
 Forest
 team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
 Kliniekstraat 25
 1070 Anderlecht
 Belgium

 To call in the statistician after the experiment is done may be no more
 than asking him to perform a post-mortem examination: he may be able to say
 what the experiment died of. ~ Sir Ronald Aylmer Fisher
 The plural of anecdote is not data. ~ Roger Brinner
 The combination of some data and an aching desire for an answer does not
 ensure that a reasonable answer can be extracted from a given body of data.
 ~ John Tukey
 On 22 Jul 2015 08:38, Don McKenzie d...@u.washington.edu wrote:

  Or if there are enough averages of enough counts, the CLT provides
 another
  option.
 
   On Jul 21, 2015, at 8:38 PM, David Winsemius dwinsem...@comcast.net wrote:
  
  
   On Jul 21, 2015, at 8:21 PM, Wensui Liu wrote:
  
   Dear Lister
   When the count outcomes are integers, we could use either Poisson or
   NB regression to model them. However, there are cases that the count
   outcomes are non-integers, e.g. average counts.
   I am wondering if it still makes sense to use Poisson or NB regression
   to model these non-integer outcomes.
  
   There is a quasi-binomial error model that accepts non-integer
 outcomes.
  
   --
  
   David Winsemius
   Alameda, CA, USA
  
 




-- 
WenSui Liu
https://statcompute.wordpress.com/


[R] model non-integer count outcomes

2015-07-21 Thread Wensui Liu
Dear Lister
When the count outcomes are integers, we could use either Poisson or
NB regression to model them. However, there are cases that the count
outcomes are non-integers, e.g. average counts.
I am wondering if it still makes sense to use Poisson or NB regression
to model these non-integer outcomes.

Truly appreciate your attention and insight!



Re: [R] alternatives to KS test applicable to K-samples

2015-05-30 Thread Wensui Liu
thanks for your comment, Bert
as pointed out by Brian, mrpp suits my need.

On Sat, May 30, 2015 at 2:02 PM, Bert Gunter bgunter.4...@gmail.com wrote:
 ... or in not testing at all. The distributions are not the same, period. So
 what. Testing for equality is useless -- the real question is: what issues
 are you trying to address/ what questions are you trying to answer/ can they
 be answered with the data you have or plan to get?

 In any case, this does not seem the proper venue for such matters, as it has
 nothing to do with R -- for now, anyways (mea culpa). I would suggest you
 post to a statistics list like stats.stackexchange.com to figure out what
 you want to do and then maybe come back here as necessary (after searching)
 to get any help you might need with R tools to do it.

 Cheers,
 Bert

 Bert Gunter

 Data is not information. Information is not knowledge. And knowledge is
 certainly not wisdom.
-- Clifford Stoll

 On Sat, May 30, 2015 at 10:42 AM, David Winsemius dwinsem...@comcast.net
 wrote:


 On May 30, 2015, at 7:09 AM, Wensui Liu wrote:

  Thanks for your insight, David
  But I am not interested in comparing means among multiple groups.
  Instead, I want to compare empirical distributions. In this case, I am
  not sure if wilcoxon should be still applicable.
 
  still appreciate it.

 The Wilcoxon Rank Sum is not comparing means (or medians as I mistakenly
 thought in the past) but is a more general test of location. You are correct
 in thinking that the KS test is implicitly testing a wider range of
 hypotheses, although it remains fairly weak against specific tests. I wasn't
 suggesting the coin package simply because of its capacity to generalize the
 WRS test but because of its capacity to support permutation tests of many
 sorts.

 If you are testing at all, then there would seem to be a likelihood (in
 the vague sense of consideration of possible goals of your testing process)
 that you really would be interested in departures from equality of
 distribution that might have a more specific description, and might
 therefore be interested in testing strategies with more power, perhaps a
 compound test for differences in location and spread.

 --
 David.


 
  On Fri, May 29, 2015 at 1:32 PM, David Winsemius
  dwinsem...@comcast.net wrote:
 
  On May 29, 2015, at 9:31 AM, Wensui Liu wrote:
 
  Good morning, All
  I have a stat question not specifically related to the programming
  language.
  To compare distributional consistency / discrepancy between two
  samples, we usually use the Kolmogorov-Smirnov test, which is implemented
  in R with ks.test() or in SAS with proc npar1way edf.
  I am wondering if there is any alternative to KS test that could be
  generalized to K-samples.
 
  The 'coin' package (Hothorn, Hornik, van de Wiel, and Zeileis)
  presents a variety of permutation and rank-based tests that would probably
  be more powerful than any multi-group variant of the KS test. The
  multi-group variant of the Wilcoxon Rank Sum Test presented in the 
  examples
  for the help page: ?wilcox_test is the Nemenyi-Damico-Wolfe-Dunn test.
 
  --
 
  David Winsemius
  Alameda, CA, USA
 
 
 
 
  --
  ==
  WenSui Liu
  Credit Risk Manager, 53 Bancorp
  wensui@53.com
  513-295-4370
  ==

 David Winsemius
 Alameda, CA, USA






-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370



Re: [R] alternatives to KS test applicable to K-samples

2015-05-30 Thread Wensui Liu
Thanks for your insight, David
But I am not interested in comparing means among multiple groups.
Instead, I want to compare empirical distributions. In this case, I am
not sure whether the Wilcoxon test would still be applicable.

still appreciate it.
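
To make the "compare whole empirical distributions across K groups" idea concrete, here is a minimal base-R sketch of a permutation test. It is not the coin package's implementation and not a named published test; the statistic (largest gap between any group's ECDF and the pooled ECDF) and the function names ks_k_stat and ks_k_perm_test are assumptions chosen for illustration.

```r
# Largest vertical distance between any group's ECDF and the pooled ECDF.
ks_k_stat <- function(y, g) {
  grid <- sort(unique(y))
  pooled <- ecdf(y)(grid)
  max(vapply(split(y, g),
             function(v) max(abs(ecdf(v)(grid) - pooled)),
             numeric(1)))
}

# Permutation p-value: shuffle the group labels and recompute the statistic.
ks_k_perm_test <- function(y, g, n_perm = 999) {
  observed <- ks_k_stat(y, g)
  permuted <- replicate(n_perm, ks_k_stat(y, sample(g)))
  list(statistic = observed,
       p.value = (1 + sum(permuted >= observed)) / (n_perm + 1))
}

# Simulated example: the third group is shifted, the first two are not.
set.seed(1)
y <- c(rnorm(50), rnorm(50), rnorm(50, mean = 3))
g <- rep(c("a", "b", "c"), each = 50)
res <- ks_k_perm_test(y, g)
print(res)
```

Because the null distribution comes from relabelling, the sketch is sensitive to any kind of distributional difference, at the cost of less power against a specific alternative such as a pure location shift.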

On Fri, May 29, 2015 at 1:32 PM, David Winsemius dwinsem...@comcast.net wrote:

 On May 29, 2015, at 9:31 AM, Wensui Liu wrote:

 Good morning, All
 I have a stat question not specifically related to the programming
 language.
 To compare distributional consistency / discrepancy between two
 samples, we usually use the Kolmogorov-Smirnov test, which is implemented
 in R with ks.test() or in SAS with proc npar1way edf.
 I am wondering if there is any alternative to KS test that could be
 generalized to K-samples.

 The 'coin' package (Hothorn, Hornik, van de Wiel, and Zeileis) presents a
 variety of permutation and rank-based tests that would probably be more 
 powerful than any multi-group variant of the KS test. The multi-group variant 
 of the Wilcoxon Rank Sum Test presented in the examples for the help page: 
 ?wilcox_test is the Nemenyi-Damico-Wolfe-Dunn test.

 --

 David Winsemius
 Alameda, CA, USA




-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



[R] alternatives to KS test applicable to K-samples

2015-05-29 Thread Wensui Liu
Good morning, All
I have a stat question not specifically related to the programming language.
To compare distributional consistency / discrepancy between two
samples, we usually use the Kolmogorov-Smirnov test, which is implemented
in R with ks.test() or in SAS with proc npar1way edf.
I am wondering if there is any alternative to KS test that could be
generalized to K-samples.

Thanks and have a nice weekend.

wensui



Re: [R] alternatives to KS test applicable to K-samples

2015-05-29 Thread Wensui Liu
Very nice, Brian

Sincerely appreciate your assistance!

On Friday, May 29, 2015, Cade, Brian ca...@usgs.gov wrote:

 Wensui:  There are the multi-response permutation procedures (MRPP) that
 readily test the omnibus hypothesis of no distributional differences among
 multiple samples for univariate or multivariate responses.  There also are
 empirical coverage tests that test a similar hypothesis among multiple
 samples but only for univariate responses.  Both are included in the USGS
 Blossom package for R linked here:
 https://www.fort.usgs.gov/products/23735 (not yet distributed via CRAN).
 The MRPP may also be available in other R packages on CRAN (vegan ?).

 Brian

 Brian S. Cade, PhD

 U. S. Geological Survey
 Fort Collins Science Center
 2150 Centre Ave., Bldg. C
 Fort Collins, CO  80526-8818

 email:  ca...@usgs.gov
 tel:  970 226-9326


 On Fri, May 29, 2015 at 10:31 AM, Wensui Liu liuwen...@gmail.com wrote:

 Good morning, All
 I have a stat question not specifically related to the programming
 language.
 To compare distributional consistency / discrepancy between two
 samples, we usually use the Kolmogorov-Smirnov test, which is implemented
 in R with ks.test() or in SAS with proc npar1way edf.
 I am wondering if there is any alternative to KS test that could be
 generalized to K-samples.

 Thanks and have a nice weekend.

 wensui





-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



Re: [R] Reading access file

2015-05-14 Thread Wensui Liu
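library(RODBC)  # odbcConnectAccess() and sqlFetch() come from the RODBC package
mdbConnect <- odbcConnectAccess("C:\\temp\\demo.mdb")
sqlTables(mdbConnect)                     # list the tables in the database
demo <- sqlFetch(mdbConnect, "tblDemo")   # read one table into a data frame
odbcClose(mdbConnect)
rm(demo)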

On Thu, May 14, 2015 at 6:31 AM, silvano silv...@uel.br wrote:

 Hello everybody.

 I have an Access file to read in R, but I can't do this.

 I used Hmisc package, but it doesn’t work.

 Someone has the commands to read this kind of file?

 I attached the access file.

 Thanks.

 Silvano.




-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==


[R] any way to write sas7bdat with R

2015-04-14 Thread Wensui Liu
I know R can read / write SAS data in xpt format and can also read SAS data
in sas7bdat format.

However, I am wondering if I can write sas7bdat with R.

thanks.



Re: [R] R and Python

2015-03-01 Thread Wensui Liu
it depends on what you want.
if you'd like to run r within python, there are 2 solutions as far as i
know, either rpy2 or pyper.
here is a brief comparison i did before
https://statcompute.wordpress.com/2012/12/10/a-brief-comparison-between-rpy2-and-pyper/


On Sun, Mar 1, 2015 at 8:41 AM, linda.s samrobertsm...@gmail.com wrote:

 Are there any good code examples of integrating R and Python?
 Thanks.
 Linda





-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



Re: [R] cbind in a loop...better way? | summary

2014-10-09 Thread Wensui Liu
How about foreach() run in parallel?
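
Before reaching for a parallel backend, it may be worth checking what the non-loop solutions from the summary quoted below actually produce. The sketch uses the two example matrices from the original post; the wrapper name vec_cbind and its input checks are assumptions added for illustration. For a handful of small matrices the cost of starting workers would probably outweigh any gain, though that is a guess rather than a measurement.

```r
# The two 3x3 example matrices from the original post.
a <- matrix(c(0, 20, 50, 0.05, 0, 0, 0, 0.1, 0), 3, 3, byrow = TRUE)
b <- matrix(c(0, 15, 45, 0.15, 0, 0, 0, 0.2, 0), 3, 3, byrow = TRUE)
mat_list <- list(a, b)

# Stack the columns of each matrix, then bind the resulting vectors side by side.
vec_cbind <- function(mat_list) {
  stopifnot(is.list(mat_list), length(mat_list) > 0)
  lens <- vapply(mat_list, length, integer(1))
  if (length(unique(lens)) != 1) {
    stop("all matrices must have the same number of elements")
  }
  do.call(cbind, lapply(mat_list, as.vector))
}

result <- vec_cbind(mat_list)
print(result)
```

The explicit length check addresses the "stability" concern in the original question: matrices of different sizes fail with a clear message instead of being recycled silently.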
On Oct 9, 2014 1:54 PM, David L Carlson dcarl...@tamu.edu wrote:

 Actually Jeff Laake's can be made even shorter with

 sapply(mat_list, as.vector)

 David C

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Evan Cooch
 Sent: Thursday, October 9, 2014 7:37 AM
 To: Evan Cooch; r-help@r-project.org
 Subject: Re: [R] cbind in a loop...better way? | summary

 Two solutions proposed -- not entirely orthogonal, but both do the
 trick. Instead of nesting cbind in a loop (as I did originally -- OP,
 below),

 1\   do.call(cbind, lapply(mat_list, as.vector))

 or

 2\   sapply(mat_list,function(x) as.vector(x))


 Both work fine. Thanks to Jeff Laake (2) + David Carlson (1) for their
 suggestions.


 On 10/8/2014 3:12 PM, Evan Cooch wrote:
  ...or some such. I'm trying to work up a function wherein the user
  passes a list of matrices to the function, which then (1) takes each
  matrix, (2) performs an operation to 'vectorize' the matrix (i.e.,
  given an (m x n) matrix x, this produces the vector Y of length  m*n
  that contains the columns of the matrix x, stacked below each other),
  and then (3) cbinds them together.
 
  Here is an example using the case where I know how many matrices I
  need to cbind together. For this example, 2 square (3x3) matrices:
 
   a <- matrix(c(0,20,50,0.05,0,0,0,0.1,0),3,3,byrow=T)
   b <- matrix(c(0,15,45,0.15,0,0,0,0.2,0),3,3,byrow=T)
 
  I want to vec them, and then cbind them together. So,
 
  result <- cbind(matrix(a,nr=9), matrix(b,nr=9))
 
  which yields the following:
 
         [,1]  [,2]
   [1,]  0.00  0.00
   [2,]  0.05  0.15
   [3,]  0.00  0.00
   [4,] 20.00 15.00
   [5,]  0.00  0.00
   [6,]  0.10  0.20
   [7,] 50.00 45.00
   [8,]  0.00  0.00
   [9,]  0.00  0.00
 
  Easy enough. But, I want to put it in a function, where the number and
  dimensions  of the matrices is not specified. Something like
 
  Using matrices (a) and (b) from above, let
 
   env <- list(a,b)
 
  Now, a function (or attempt at same) to perform the desired operations:
 
   vec = function(matlist) {

     n_mat = length(matlist)
     size_mat = dim(matlist[[1]])[1]

     result = cbind()

     for (i in 1:n_mat) {
       result = cbind(result, matrix(matlist[[i]], nr = size_mat^2))
     }

     return(result)

   }
 
 
  When I run vec(env), I get the *right answer*, but I am wondering if
  there is a *better* way to get there from here than the approach I use
  (above). I'm not so much interested in 'computational efficiency' as I
  am in stability, and flexibility.
 
  Thanks...
 
  .
 





Re: [R] Is there an ID3 implementation in R?

2014-09-02 Thread Wensui Liu
RWeka
On Sep 2, 2014 11:04 AM, Tal Galili tal.gal...@gmail.com wrote:

 Dear R help mailing list,

 I am looking for an ID3 implementation in R. I know that there are many
 other decision tree algorithms already implemented (via rpart, tree, caret,
 C50, etc., etc.), but for research purposes I would like to reproduce the
 result of running ID3.

 I was not able to find such an implementation when searching in any of the
 following:
 http://rseek.org/
 http://finzi.psych.upenn.edu/search.html
 http://cran.r-project.org/web/views/MachineLearning.html

 Any suggestions?

 Thanks,
 Tal


 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
 www.r-statistics.com (English)

 --





Re: [R] rpy2 and user defined functions from R

2013-10-30 Thread Wensui Liu
if you don't need to exchange big data between r and python, pyper might be
better than rpy2.
On Oct 30, 2013 12:08 AM, Erin Hodgess erinm.hodg...@gmail.com wrote:

 Hello again!

 I'm using python with a module rpy2 to call functions from R.

 It works fine on built in R functions like rnorm.

 However, I would like to access user-defined functions as well.  For those
 of you who use this, I have:

 import rpy2.robjects as R
 x = R.r.buzz(3)
 R object as no attribute buzz

 (user defined function of buzz)

 This is on a Centos 5 machine with R-3.0.2 and python of 2.7.5.

 Thanks for any help.
 Sincerely,
 Erin



 --
 Erin Hodgess
 Associate Professor
 Department of Computer and Mathematical Sciences
 University of Houston - Downtown
 mailto: erinm.hodg...@gmail.com





Re: [R] cut into groups of equal nr of elements...

2013-07-18 Thread Wensui Liu
my fault. qcut() is a python function in pandas ;-)

what i meant is cut2() in Hmisc.

sorry for messing up.
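
For the question as asked (fixing the size of the groups rather than the number of groups), a base-R sketch may be enough. The helper name group_by_size is an assumption for illustration; Hmisc::cut2() has an m argument for a minimum group size, but its exact cut points are not reproduced here.

```r
# Assign each element to a group of (at most) `size` elements, ordered by value.
group_by_size <- function(x, size) {
  stopifnot(length(size) == 1, size >= 1)
  ord <- order(x)
  grp <- numeric(length(x))
  grp[ord] <- ceiling(seq_along(x) / size)  # 1,1,1,2,2,2,... along the sorted values
  grp
}

x <- c(5, 1, 9, 3, 7, 2, 8)
grp <- group_by_size(x, 3)
print(split(x, grp))
```

The last group is smaller whenever length(x) is not a multiple of the group size.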
On Jul 18, 2013 11:51 AM, Greg Snow 538...@gmail.com wrote:

 Wensui,

 ?qcut on  my machine gives an error and ??qcut does not find anything in
 the installed packages.  Which package is qcut in?


 On Wed, Jul 17, 2013 at 4:43 PM, Wensui Liu liuwen...@gmail.com wrote:

 ?qcut
 On Jul 17, 2013 5:45 PM, Witold E Wolski wewol...@gmail.com wrote:

  I would like to cut a vector into groups of equal nr of elements.
  looking for a function on the lines of cut but where I can specify
  the size of the groups instead of the nr of groups.
 
 
 
 
  --
  Witold Eryk Wolski
 




 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com




Re: [R] cut into groups of equal nr of elements...

2013-07-17 Thread Wensui Liu
?qcut
On Jul 17, 2013 5:45 PM, Witold E Wolski wewol...@gmail.com wrote:

 I would like to cut a vector into groups of equal nr of elements.
 looking for a function on the lines of cut but where I can specify
 the size of the groups instead of the nr of groups.




 --
 Witold Eryk Wolski





Re: [R] Windows 8

2013-06-15 Thread Wensui Liu
confirmed.


On Sat, Jun 15, 2013 at 10:50 AM, Chet Seligman chet.selig...@gmail.com wrote:

 Can anyone confirm that R runs on Windows 8?

 Thanks,
 Chet Seligman





-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



Re: [R] Random Forest, Giving More Importance to Some Data

2013-03-24 Thread Wensui Liu
your question doesn't seem to be specifically related to either R or random
forest. instead, it is about how to assign weights to training
observations.
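
One generic way to weight training observations by recency, sketched under assumptions: weights decay exponentially with the age of the sale, and, if the fitting function offers no case-weight argument, they are used as sampling probabilities for a weighted bootstrap of the training rows. The function name recency_weights, the one-year half-life and the example dates are all illustrative choices, not part of the original question.

```r
# Exponential decay: a sale one half-life older than the newest one gets weight 0.5.
recency_weights <- function(sale_dates, half_life_days = 365) {
  age_days <- as.numeric(max(sale_dates) - sale_dates)
  0.5^(age_days / half_life_days)
}

sale_dates <- as.Date(c("2013-03-01", "2012-03-01", "2010-03-01"))
w <- recency_weights(sale_dates)
print(round(w, 3))

# Weighted bootstrap: recent rows are drawn more often than old ones.
set.seed(1)
boot_rows <- sample(seq_along(sale_dates), size = 1000, replace = TRUE, prob = w)
print(table(boot_rows))
```

The resampled row indices can then be used to subset the training predictors and prices before fitting. The half-life is a tuning choice and would normally be picked by validation on the most recent sales.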


On Sun, Mar 24, 2013 at 6:43 AM, Lorenzo Isella lorenzo.ise...@gmail.com wrote:

 Dear All,
 I am using randomForest to predict the final selling price of some items.
 As it often happens, I have a lot of (noisy) historical data, but the
 question is not so much about data cleaning.
 The dataset for which I need to carry out some predictions are fairly
 recent sales or even some sales that will take place in the near future.
 As a consequence, historical data should be somehow weighted: the older
 they are, the less they should matter for the prediction.
 Any idea about how this could be achieved?
 Please find below a snippet showing how I use the randomForest library (on
 a multi-core machine).
 Any suggestion is appreciated.
 Cheers

 Lorenzo

 ###################################################
 rf_model <- foreach(iteration = 1:cores,
                     ntree = rep(50, 4),
                     .combine = combine,
                     .packages = "randomForest") %dopar% {
   sink("log.txt", append = TRUE)
   cat(paste("Starting iteration", iteration, "\n"))
   randomForest(trainRF,
                prices_train,   ## mtry=20,
                nodesize = 5,
                ## maxnodes=140,
                importance = FALSE, do.trace = 10, ntree = ntree)
 }
 ###################################################





-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



Re: [R] Fractional logit in GLM?

2013-02-03 Thread Wensui Liu
glm() will handle fractional logit with some tweaks. below is copied from
my blog in a python example. however, you should be able to see the R code
from it.

In [12]: # Address the same type of model with R by Pyper

In [13]: import pyper as pr

In [14]: r = pr.R(use_pandas = True)

In [15]: r.r_data = data

In [16]: # Indirect Estimation of Discrete Dependent Variable Models

In [17]: r('data <- rbind(cbind(r_data, y = 1, wt = r_data$LEV_LT3),
cbind(r_data, y = 0, wt = 1 - r_data$LEV_LT3))')
Out[17]: 'try({data <- rbind(cbind(r_data, y = 1, wt =
r_data$LEV_LT3), cbind(r_data, y = 0, wt = 1 - r_data$LEV_LT3))})\n'

In [18]: r('mod <- glm(y ~ COLLAT1 + SIZE1 + PROF2 + LIQ + IND3A,
weights = wt, subset = (wt > 0), data = data, family = binomial)')
Out[18]: 'try({mod <- glm(y ~ COLLAT1 + SIZE1 + PROF2 + LIQ + IND3A,
weights = wt, subset = (wt > 0), data = data, family =
binomial)})\nWarning message:\nIn eval(expr, envir, enclos) :
non-integer #successes in a binomial glm!\n'

In [19]: print r('summary(mod)')
try({summary(mod)})

Call:
glm(formula = y ~ COLLAT1 + SIZE1 + PROF2 + LIQ + IND3A, family = binomial,
    data = data, weights = wt, subset = (wt > 0))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.0129  -0.4483  -0.3173  -0.1535   2.5379

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -7.24979    0.56734 -12.779  < 2e-16 ***
COLLAT1      1.23715    0.26012   4.756 1.97e-06 ***
SIZE1        0.35901    0.03746   9.584  < 2e-16 ***
PROF2       -3.14313    0.73895  -4.254 2.10e-05 ***
LIQ         -1.38249    0.35749  -3.867  0.00011 ***
IND3A        0.54658    0.14136   3.867  0.00011 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 2692.0  on 5536  degrees of freedom
Residual deviance: 2456.4  on 5531  degrees of freedom
AIC: 1995.4

Number of Fisher Scoring iterations: 6
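
The same stacking trick can be written directly in R without the Python wrapper. The sketch below uses simulated data (the variables frac and x are made up, not the LEV_LT3 data from the session above) and compares the stacked, weighted binomial fit with glm(..., family = quasibinomial), which accepts a fractional response including exact 0 and 1. The two fits should give the same coefficients; the standard errors are not expected to match, and robust standard errors are often recommended for fractional responses.

```r
set.seed(42)
n <- 500
x <- rnorm(n)
frac <- plogis(-0.5 + 0.8 * x + rnorm(n, sd = 0.5))
frac[sample(n, 20)] <- 0   # include exact zeros
frac[sample(n, 20)] <- 1   # and exact ones
d <- data.frame(frac = frac, x = x)

# Stack each row twice: as a "success" weighted by frac and a "failure" weighted by 1 - frac.
stacked <- rbind(cbind(d, y = 1, wt = d$frac),
                 cbind(d, y = 0, wt = 1 - d$frac))

# Non-integer weights trigger a harmless warning in the binomial family.
m_stack <- suppressWarnings(
  glm(y ~ x, weights = wt, subset = wt > 0, data = stacked, family = binomial)
)

# Direct fit on the fractional response.
m_quasi <- glm(frac ~ x, family = quasibinomial, data = d)

print(rbind(stacked = coef(m_stack), quasibinomial = coef(m_quasi)))
```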



On Sun, Feb 3, 2013 at 11:17 AM, Rachael Garrett
rachaeldgarr...@gmail.com wrote:

 Hi,

 Does anyone know of a function in R that can handle a fractional variable
 as the dependent variable?  The catch is that the function has to be
 inclusive of 0 and 1, which betareg() does not.

 It seems like GLM might be able to handle the fractional logit model, but
 I can't figure it out.  How do you format GLM to do so?

 Best,

 Rachael







-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



Re: [R] importing large datasets in R

2013-01-19 Thread Wensui Liu
take a look at ff package
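
Independently of ff, part of the slowness of read.table() on large files usually comes from guessing column types. A base-R sketch of the usual remedy is to read a small sample, record the classes, and pass them as colClasses for the full read. The temporary file and its columns are made up for illustration, and whether this is enough for a 2 GB file depends on available memory.

```r
# Create a small stand-in file; in practice this would be the large data file.
tmp <- tempfile(fileext = ".csv")
set.seed(1)
write.csv(data.frame(id = 1:1000,
                     x = rnorm(1000),
                     grp = sample(letters, 1000, replace = TRUE)),
          tmp, row.names = FALSE)

# Peek at a few rows to learn the column types.
head_rows <- read.csv(tmp, nrows = 50, stringsAsFactors = FALSE)
col_types <- vapply(head_rows, function(col) class(col)[1], character(1))

# Full read with the types fixed up front; nrows is an upper bound on the row count.
full <- read.csv(tmp, colClasses = col_types, nrows = 1000)
unlink(tmp)
str(full)
```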
On Jan 19, 2013 7:04 AM, gaurav singh gauravonlin...@gmail.com wrote:

 Hi Everyone,

 I am a little new to R and the first problem I am facing is the dilemma
 whether R is suitable for files of size 2 GB and slightly more than 2
 million rows. When I try importing the data using read.table, it seems to
 take forever and I have to cancel the command. Are there any special
 techniques or methods which i can use or some tricks of the game that I
 should keep in mind in order to be able to do data analysis on such large
 files using R?

 --
 Regards
 Gaurav Singh





Re: [R] Naming an object after another object...can it be done?

2013-01-17 Thread Wensui Liu
are you looking for assign()?
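
A short sketch of both readings of the question. assign() creates an object from a name held in a string. For the axis-label goal described in the quoted message, the usual idiom is deparse(substitute(arg)) inside the function, which recovers the expression the caller typed. The data frame and the helper names label_of and plot_with_labels are made up for illustration.

```r
dat <- data.frame(col = c(1, 2, 3), other = c(2, 4, 7))

# assign() creates a variable whose name is given as a character string.
assign("x", dat$col)

# deparse(substitute(arg)) returns the caller's expression as text.
label_of <- function(arg) deparse(substitute(arg))
print(label_of(dat$col))

# A plotting wrapper that labels the axes with whatever was passed in.
plot_with_labels <- function(x, y) {
  xlab <- deparse(substitute(x))
  ylab <- deparse(substitute(y))
  plot(x, y, xlab = xlab, ylab = ylab)
}
# Usage (not run here): plot_with_labels(dat$col, dat$other)
```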
On Jan 17, 2013 1:56 PM, mtb...@gmail.com wrote:

 Hello R-helpers,

 I have run the following line of code:

 x <- dat$col

 and now I would like to assign names(x) to be "dat$col" (e.g., a character
 string equal to the column name that I assigned to x).

 What I am trying to do is to assign columns in my dataframe to new objects
 called x and y. Then I will use x and y within a new function to make plots
 with informative axis labels (e.g., "dat$col" instead of "x"). So, for
 example, I would like to plot(y ~ x, xlab = names(x)) and have "dat$col"
 printed in the x-axis label. I can do this all manually, by typing

 names(x) <- "dat$col"

 but I'd like to do it with non-specific code within my function so I don't
 have to type the variable names manually each time.

 Many thanks,

 Mark Na





Re: [R] aggregate / collapse big data frame efficiently

2012-12-25 Thread Wensui Liu
aggregate() is not efficient. try by().
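
Whether by() is actually faster than aggregate() probably depends on the data. For plain group means over many numeric columns, another base-R option is rowsum() on a matrix, divided by the group counts. The sketch below uses the small example from the quoted question and checks the result against aggregate(); it has not been timed on the 100,000-column case.

```r
set.seed(1)
x <- data.frame(g = rep(letters, 2), matrix(rnorm(52 * 3), nrow = 52))

values <- as.matrix(x[, -1])
group_sums <- rowsum(values, group = x$g)               # one row per group, sorted by group
group_n <- as.vector(table(x$g)[rownames(group_sums)])  # counts in the same order
group_means <- group_sums / group_n                     # row i is divided by group_n[i]

# Cross-check against aggregate() on the same data.
ref <- aggregate(x[, -1], by = list(g = x$g), FUN = mean)
print(all.equal(unname(group_means), unname(as.matrix(ref[, -1]))))
```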


On Tue, Dec 25, 2012 at 11:34 AM, Martin Batholdy
batho...@googlemail.com wrote:

 Hi,


 I need to aggregate rows of a data.frame by computing the mean for rows
 with the same factor-level on one factor-variable;

 here is the sample code:


 x <- data.frame(rep(letters,2), rnorm(52), rnorm(52), rnorm(52))

 aggregate(x, list(x[,1]), mean)


 Now my problem is, that the actual data-set is much bigger (120 rows and
 approximately 100.000 columns) – and it takes very very long (actually at
 some point I just stopped it).

 Is there anything that can be done to make the aggregate routine more
 efficient?
 Or is there a different approach that would work faster?


 Thanks for any suggestions!





-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



Re: [R] ROC Curve: negative AUC

2012-11-22 Thread Wensui Liu
wrong direction for ranking
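
One reading of this short reply: if the score ranks cases in the reverse order, the curve is traced the other way round and the area comes out "flipped". The sketch below does not reproduce the internals of Epi::ROC(); it uses the rank-based (Mann-Whitney) formula for AUC on simulated data to show that negating the score turns an AUC of a into 1 - a. The function name auc_rank is an assumption for illustration.

```r
# AUC from ranks: the probability that a random positive outranks a random negative.
auc_rank <- function(score, label) {
  stopifnot(all(label %in% c(0, 1)))
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  r <- rank(score)  # ties receive average ranks
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

set.seed(1)
label <- rbinom(200, 1, 0.4)
score <- rnorm(200, mean = label)    # higher scores for the positives

auc_up <- auc_rank(score, label)     # score ranked in the intended direction
auc_down <- auc_rank(-score, label)  # same score, direction reversed
print(c(auc_up = auc_up, auc_down = auc_down))
```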

On Thu, Nov 22, 2012 at 1:58 PM, brunosm brunos...@gmail.com wrote:

 the area under the curve (AUC) is negative?

 I'm using ROC function with a logistic regression, package Epi.

 First time it happens...

 Thanks a lot!

 Bruno




-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



Re: [R] Poisson Regression: questions about tests of assumptions

2012-10-14 Thread Wensui Liu
just a side note for your 4th question.

for a small sample, the Clarke test might be more appropriate than the
Vuong test, and the calculation is so simple that even excel can
handle it :-)
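
To show how simple the calculation is, here is a sketch of the sign-test idea behind the Clarke test: compare the two models' log-likelihood contributions observation by observation and test whether one model wins more than half the time. The data and models are simulated stand-ins (two Poisson fits with different covariates), not the zero-inflated models from the question. Both models have the same number of parameters, so the complexity correction used in the published test is omitted here.

```r
set.seed(1)
n <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rpois(n, exp(0.3 + 0.8 * x1))   # data generated from the x1 model

m_a <- glm(y ~ x1, family = poisson)
m_b <- glm(y ~ x2, family = poisson)

# Pointwise log-likelihoods under each fitted model.
ll_a <- dpois(y, fitted(m_a), log = TRUE)
ll_b <- dpois(y, fitted(m_b), log = TRUE)
d <- ll_a - ll_b

# Sign test: under the null, each model "wins" an observation with probability 0.5.
n_a <- sum(d > 0)
n_used <- sum(d != 0)
clarke <- binom.test(n_a, n_used, p = 0.5)
print(c(wins_a = n_a, compared = n_used, p.value = clarke$p.value))
```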

On Sun, Oct 14, 2012 at 12:00 PM, Eiko Fried tor...@gmail.com wrote:
 I would like to test in R what regression fits my data best. My dependent
 variable is a count, and has a lot of zeros.

 And I would need some help to determine what model and family to use
 (poisson or quasipoisson, or zero-inflated poisson regression), and how to
 test the assumptions.

 1) Poisson Regression: as far as I understand, the strong assumption is
 that dependent variable mean = variance. How do you test this? How close
 together do they have to be? Are unconditional or conditional mean and
 variance used for this? What do I do if this assumption does not hold?

 2) I read that if variance is greater than mean we have overdispersion, and
 a potential way to deal with this is including more independent variables,
 or family=quasipoisson. Does this distribution have any other requirements
 or assumptions? What test do I use to see whether 1) or 2) fits better -
 simply anova(m1,m2)?

 3) I also read that negative-binomial distribution can be used when
 overdispersion appears. How do I do this in R? What is the difference to
 quasipoisson?

 4) Zero-inflated Poisson Regression: I read that using the vuong test
 checks what models fits better.
 vuong (model.poisson, model.zero.poisson)
 Is that correct?

 5) ats.ucla.edu has a section about zero-inflated Poisson Regressions, and
 test the zeroinflated model (a) against the standard poisson model (b):
 m.a <- zeroinfl(count ~ child + camper | persons, data = zinb)
 m.b <- glm(count ~ child + camper, family = poisson, data = zinb)
 vuong(m.a, m.b)
 I don't understand what the | persons part of the first model does, and
 why you can compare these models if. I had expected the regression to be
 the same and just use a different family.

 Thank you
 T




-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370



[R] input arguments in a file into rscript

2012-10-13 Thread Wensui Liu
dear listers,

with Rscript, i know how to feed arguments by command line directly,
e.g. Rscript test.r 10 20. however, if i saved all arguments in a
file, how do i make Rscript take the arguments from this file?  Rscript
test.r < input.txt doesn't seem to work for me.
(ps: i am using windows)
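
One possible approach, sketched under assumptions: let the script itself decide whether its single argument is the name of a file holding the real arguments. The helper name read_args and the whitespace-separated file layout are made up for illustration.

```r
# If the only argument names an existing file, read the arguments from it;
# otherwise use the command-line arguments as given.
read_args <- function(args) {
  if (length(args) == 1 && file.exists(args[1])) {
    scan(args[1], what = character(), quiet = TRUE)
  } else {
    args
  }
}

args <- read_args(commandArgs(trailingOnly = TRUE))
print(args)
```

With this at the top of test.r, both `Rscript test.r 10 20` and `Rscript test.r input.txt` (no redirection) should end up with the same character vector, which can then be converted with as.numeric() as needed.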

thanks so much!



[R] how to invoke vba by r?

2011-10-18 Thread Wensui Liu
dear listers,

right now, we are trying to use r to replicate what sas dde does, e.g.
interacting with excel. however, we can't find a way to call vba from r.

any insight is appreciated.



[R] a senior level statistician opening

2011-08-11 Thread Wensui Liu
it is in the consumer risk modeling team in Cincinnati, Ohio.

pls send resume to wensui@53.com if interested.

thx



[R] recommendation on r scripting tutorial?

2011-04-02 Thread Wensui Liu
Good morning, dear listers

I am wondering if you could recommend a good tutorial / book for r scripting.

thank you so much in advance!

WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370



[R] string evaluation

2011-03-12 Thread Wensui Liu
Good morning, dear listers

I am wondering how to do string evaluation such that

model <- glm(Y ~ [STRING], data = mydata) where STRING <- "x1 + x2 + x3"

It is very doable in other language such as SAS.

Thank you so much for your insight!



Re: [R] string evaluation

2011-03-12 Thread Wensui Liu
Thank you so much, David. Your solution exactly suits my need.

formula() seems the key.

appreciate your help!
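
A self-contained sketch of the same idea, using the built-in `mtcars` data in place of `mydata` (the variable names `rhs`, `form` and `form2` are illustrative assumptions, not from the thread). It shows both the `paste()` route and `reformulate()`, which builds the formula from a character vector of term names.

```r
# Right-hand side of the model held as a plain string
rhs <- "wt + hp"

# Route 1: paste the pieces together and convert to a formula
form <- as.formula(paste("mpg ~", rhs))

# Route 2: reformulate() assembles the formula from term names
form2 <- reformulate(c("wt", "hp"), response = "mpg")

# Either formula object can be handed to glm() (gaussian family by default)
model <- glm(form, data = mtcars)
print(coef(model))
```

`reformulate()` may be the more convenient choice when the predictor names already sit in a vector, since no string pasting is needed.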

On Sat, Mar 12, 2011 at 10:22 AM, David Winsemius
dwinsem...@comcast.net wrote:

 On Mar 12, 2011, at 10:10 AM, Wensui Liu wrote:

 Good morning, dear listers

 I am wondering how to do string evaluation such that

 model <- glm(Y ~ [STRING], data = mydata) where STRING <- "x1 + x2 + x3"

 It is very doable in other language such as SAS.

 Also very doable in R. You need to understand that R is a bit more
 structured than SAS, which is really just a macro-processor at least by
 heritage. Formulas are a language class in R and not just character vectors,
  so you need to construct them outside the regression functions.

 STRING <- "x1 + x2 + x3"
 form <- formula(paste("Y ~ ", STRING) )
  form
 # Y ~ x1 + x2 + x3
  class(form)
 #[1] "formula"

  model <- glm(form, data = mydata)

 --
 David Winsemius, MD
 West Hartford, CT





-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



[R] R on 32-bit ubuntu with PAE enabled

2010-10-26 Thread Wensui Liu
morning, all
right now, I have R installed on a 32-bit ubuntu with PAE enabled. And
I can see more than 4-g memory available in system monitor. my
question is: might this 32-bit R take advantage of the extra memory
and handle large data?

thank you so much!
wensui



[R] [OT] language for data munging

2010-10-16 Thread Wensui Liu
dear all

i think i am able to get an unbiased opinion from computing experts
here other than python or perl list.

the question is: which language, perl or python in particular, is
better for data munging (manage and manipulate large-size data /
interact with DB / pre-process data before statistical modeling per my
definition)?

thank you so much for your insight!
wensui



Re: [R] Decision Tree in Python or C++?

2010-09-04 Thread Wensui Liu
for python, please check
http://onlamp.com/pub/a/python/2006/02/09/ai_decision_trees.html

On Sat, Sep 4, 2010 at 11:21 AM, noclue_ tim@netzero.net wrote:


 Have anybody used Decision Tree in Python or C++?  (or written their own
 decision tree implementation in Python or C++)?  My goal is to run decision
 tree on 8 million obs as training set and score 7 million in test set.

 I am testing 'rpart' package on a 64-bit-Linux + 64-bit-R environment. But
 it seems that rpart is either not stable or running out of memory very
 quickly. (Is it because R is passing everything as copy instead of as object
 reference?)

 Any idea would be greatly appreciated!

 Have a nice weekend!
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Decision-Tree-in-Python-or-C-tp2526810p2526810.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
==
WenSui Liu
wens...@paypal.com
statcompute.spaces.live.com
==



Re: [R] Connecting to MS Access database

2010-07-18 Thread Wensui Liu
Hey Xin
this is a piece I copied from my blog.

library(RODBC)
mdbConnect <- odbcConnectAccess("C:\\temp\\demo.mdb")
sqlTables(mdbConnect)
demo <- sqlFetch(mdbConnect, "tblDemo")
odbcClose(mdbConnect)
rm(demo)


On Mon, Jul 19, 2010 at 12:05 AM, Xin Ge xingemaill...@gmail.com wrote:
 Hi All,

 Can anyone please suggest me from where should I start to learn about 'how
 to connect to access db' ?

 How if someone has some written code and I can go over that to understand
 and make necessary changes... any help would be highly appreciated,

 --
 Xin Ge.





-- 
==
WenSui Liu
wens...@paypal.com
statcompute.spaces.live.com
==



Re: [R] R vs SAS and Revolution R

2010-06-19 Thread Wensui Liu
it really depends on how you define large dataset.

In a corporate production environment, it is not unusual to do data
manipulation for x-G dataset. In this case, SAS might be preferred
from my personal experience.

On Sat, Jun 19, 2010 at 9:39 AM, skan juanp...@gmail.com wrote:

 Hello

 How do you compare R to SAS in terms of speed and management of large
 datasets?

 What about Revolution R?
 I've seen on their site, they claim that Revolution R is much faster than R
 and it's multithread...
 Can you really notice the difference? What disadvantage does it have?
 I think it's based on R 2.10.   but R  already issued the version 2.12


 Regards



 What alternative to R would you use in order to merge asynchronous time
 series? SAS, Stata, eViews...?
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/R-vs-SAS-and-Revolution-R-tp2261149p2261149.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
==
WenSui Liu
wens...@paypal.com
statcompute.spaces.live.com
==



Re: [R] Machine Learning and R

2010-05-08 Thread Wensui Liu
good question!
if there is such a book, i'd also like to read as well.

On Sun, May 9, 2010 at 7:41 AM, Ralf B ralf.bie...@gmail.com wrote:
 Hi all,

 I am looking for a good book that covers Machine Learning as a whole
 and provides examples in R while not over focusing on the math (such
 as in 'Elements of Statistical Learning') but rather on descriptions
 and examples. I am relatively new to R and ML and, while solving
 problems with R, I want to learn the main concepts, techniques and
 problem categories. Can anybody here recommend good books? Does
 anybody know a site that lists good books about R?

 Ralf





-- 
==
WenSui Liu
wens...@paypal.com
statcompute.spaces.live.com



Re: [R] sqlite and r

2010-04-18 Thread Wensui Liu
thanks so much for your reply, Gabor!
actually, my intention is to use rsqlite to submit sql into sqlite db from
r and utilize temp tables in sqlite to store the working tables. in
this way, there is not much computing burden and memory consumption in
r.
however, the functions natively supported in sqlite are so limited and
sometimes the data needs to be transferred between r and sqlite back
and forth to get the final job done.

again, appreciate your help, Gabor. by the way, i really like your
sqldf package, wonderful work!

On Sat, Apr 17, 2010 at 8:15 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 1. There is no off the shelf facility although SQLite itself allows
 you to write C functions and those presumably could call R but you
 would have to do it yourself.

 2. There are also some solutions discussed here which might be good
 enough and are a lot easier than #1:
 http://code.google.com/p/sqldf/#3._Why_does_sqldf(select_var(x)_from_DF)_not_work?

 3. Also you could use a different database.   For example, sqldf also
 allows you to use H2 or PostgreSQL in place of SQLite and they have
 many more functions than SQLite.   See:
 http://sqldf.googlecode.com


 On Sat, Apr 17, 2010 at 8:04 PM, Wensui Liu liuwen...@gmail.com wrote:
 have used both for a while and feel they are like pea and carrot together.
 it is extremely handy to use sqlite engine for heavy data management
 from r instead of using r directly.
 i am also wondering  if i could define and register sqlite functions
 within r in the way similar to how we do in python. if this is doable
 in r, that will be perfect. any insight?






-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



[R] sqlite and r

2010-04-17 Thread Wensui Liu
have used both for a while and feel they are like pea and carrot together.
it is extremely handy to use sqlite engine for heavy data management
from r instead of using r directly.
i am also wondering  if i could define and register sqlite functions
within r in the way similar to how we do in python. if this is doable
in r, that will be perfect. any insight?



Re: [R] Add header line to large text file

2010-04-15 Thread Wensui Liu
if i were you, i probably will use 1 line of sed to do such task
instead of R to insert headers in the file.
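
If the job has to stay inside R, one option is to stream the file through connections in chunks rather than loading all 30 million lines at once. This is only a sketch: the function name `add_header`, the chunk size and the temporary files are assumptions for illustration, and the file is still copied once, so it does not avoid reading altogether.

```r
# Copy `infile` to `outfile` with a header line in front, reading the
# input in fixed-size chunks so the whole file is never held in memory.
add_header <- function(infile, outfile, header, chunk_lines = 100000L) {
  inp <- file(infile, open = "r")
  on.exit(close(inp), add = TRUE)
  out <- file(outfile, open = "w")
  on.exit(close(out), add = TRUE)

  writeLines(header, out)
  repeat {
    chunk <- readLines(inp, n = chunk_lines)
    if (length(chunk) == 0L) break   # end of input reached
    writeLines(chunk, out)
  }
  invisible(outfile)
}

# Small demonstration with temporary files
src <- tempfile(fileext = ".csv")
dst <- tempfile(fileext = ".csv")
writeLines(c("1,2,3", "4,5,6", "7,8,9"), src)
add_header(src, dst, "a,b,c", chunk_lines = 2L)
result <- readLines(dst)
```

Memory use is bounded by `chunk_lines`, at the cost of one pass over the data.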

On Thu, Apr 15, 2010 at 10:34 AM, Zev Ross z...@zevross.com wrote:
 All,

 I have a 30 million record text file without header information. I would
 like to add a header to this file without reading it first. Is this
 possible? The code below does what I want except that the readLines portion
 takes quite a long time. Is there a way around reading the lines? I'm
 working on Windows XP.

 Zev

 input <- readLines("c:/junk/forR.csv")
 input <- c(c('a, b, c, d, e, f'), input)
 writeLines(input, "c:/junk/forRfix.csv")

 --
 Zev Ross
 ZevRoss Spatial Analysis
 120 N Aurora, Suite 3A
 Ithaca, NY 14850
 607-277-0004 (phone)
 866-877-3690 (fax, toll-free)
 z...@zevross.com





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller



Re: [R] Artificial Neural Networks

2010-01-03 Thread Wensui Liu
1) there are plenty of data for neural net testing in R. you might
check datasets package on CRAN.
2) which neural net are you talking about, BP, RBF, LVQ, or something
else. the world of neural nets is pretty much like a zoo. without
knowing which animal you are talking about, nobody can help you.

On Sun, Jan 3, 2010 at 3:54 PM, Alex Olafson alex.olaf...@yahoo.com wrote:
 Hi! I am studying to use some R libraries which are applied for working
 with artificial neural neworks (amore, nnet). Can you recommend some
 useful, reliable and easy to get example data to use in R for creating
 and testing a neural network? And what library will you advise?








-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] R and Finance - EAD, LGD, PD

2009-12-27 Thread Wensui Liu
i think rick's questions are more related to basel II instead of R and don't
think there is such a R package.
per my limited knowledge, there are many ways to calculate PD, EAD, and LGD,
either on portfolio level or on account level. So it really depends on how
you are going to estimate them. On the side of consumer credit risk, it
makes more sense to estimate 3 models on the account level, which should be
under the umbrella of GLM. While PD / LGD are well studied, EAD is not.
There are multiple ways to estimate EAD, such as LEQ/CCF/EADF, depending on
the characteristic of accounts.

2009/12/27 Cedrick W. Johnson cedr...@cedrickjohnson.com

 Howdy-

 You may want to check out the R-sig-finance list and search through the
 postings here:
 http://n4.nabble.com/Rmetrics-f925806.html

 There's quite a few packages in the CRAN taskviews as well:

 http://cran.r-project.org/web/views/Finance.html

 -cj



 Ricardo Gonçalves Silva wrote:

 Hi,

 I'm currently beginning to use R for financial analysis (mainly Basel II
 benchmarks) and I would like to know if any R-User can give me some initial
 directions on packages and tutorials which I can use to calculate capital
 requirements, default probabilities, and related stuff.

 Thanks in advance,

 Rick






-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



[R] something similar to %include() in sas?

2009-12-26 Thread Wensui Liu
i am just wondering if there is an effective way to include other external
codes into the program.

thanks.



Re: [R] something similar to %include() in sas?

2009-12-26 Thread Wensui Liu
thanks, all,
i think source() is the right one i am looking for.
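
A small sketch of how `source()` can play the role of `%include`. The helper file and the names `square`, `offset` and `load_helpers` are invented for the example; a temporary file stands in for a real external script.

```r
# Write a small helper script to a temporary file, standing in for an
# external program file that a SAS user would pull in with %include.
helper <- tempfile(fileext = ".R")
writeLines(c(
  "square <- function(x) x^2",
  "offset <- 10"
), helper)

# source() parses and evaluates the file; local = TRUE keeps the objects
# it defines inside the calling function instead of the global workspace.
load_helpers <- function(path) {
  source(path, local = TRUE)
  square(3) + offset
}

answer <- load_helpers(helper)
print(answer)
```

Calling `source(path)` at top level without `local = TRUE` would instead define `square` and `offset` in the global workspace, which is closer to what `%include` does in sas.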

On Sat, Dec 26, 2009 at 3:13 PM, David Winsemius dwinsem...@comcast.net wrote:


 On Dec 26, 2009, at 3:10 PM, Wensui Liu wrote:

  i am just wondering if there is an effective way to include other external
 codes into the program.


 ?source


  --

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT




-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] DROP and KEEP statements in R

2009-12-19 Thread Wensui Liu
drop example.
> data(iris)
> summary(iris)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300
 Median :5.800   Median :3.000   Median :4.350   Median :1.300
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
       Species
 setosa    :50
 versicolor:50
 virginica :50

> iris$Species <- NULL
> summary(iris)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300
 Median :5.800   Median :3.000   Median :4.350   Median :1.300
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
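
For dropping or keeping several columns at once without modifying the original data frame, a sketch along these lines may be closer to the sas DROP= / KEEP= options. The object names (`keep_df`, `drop_df`, and so on) are illustrative.

```r
# KEEP: select columns by name
keep_df <- iris[, c("Sepal.Length", "Species")]

# DROP: exclude columns by name
drop_df <- iris[, !(names(iris) %in% c("Species", "Petal.Width"))]

# The same two operations written with subset()
keep_df2 <- subset(iris, select = c(Sepal.Length, Species))
drop_df2 <- subset(iris, select = -c(Species, Petal.Width))

print(names(keep_df))
print(names(drop_df))
```

Both forms return a new data frame, so `iris` itself is left untouched.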

On Sat, Dec 19, 2009 at 3:21 PM, sarjin...@yahoo.com wrote:

 What is equivalent to DROP or KEEP statements of SAS in R?

 --
 This message was sent on behalf of sarjin...@yahoo.com at
 openSubscriber.com
 http://www.opensubscriber.com/messages/r-help@r-project.org/topic.html





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] Counting Frequencies

2009-12-09 Thread Wensui Liu
> x <- runif(10, 0, 1)
> x2 <- x > 0.5
> x2
 [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE FALSE
> table(x2)
x2
FALSE  TRUE
    5     5
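
For the three-way split asked about in the question quoted below (below -0.5, in the middle, above 0.5), `cut()` plus `table()` gives all three counts in one step. This is a sketch on simulated data; the vector `difference` is generated here and only stands in for the poster's column.

```r
set.seed(1)
# Simulated stand-in for the 600-row "difference" column
difference <- runif(600, -1, 1)

# Three bands; with the default right = TRUE a value exactly equal to
# -0.5 falls in the lowest band and exactly 0.5 falls in the middle one
band <- cut(difference,
            breaks = c(-Inf, -0.5, 0.5, Inf),
            labels = c("below -0.5", "middle", "above 0.5"))
counts <- table(band)
print(counts)

# The two tail counts can also be computed directly
n_low  <- sum(difference < -0.5)
n_high <- sum(difference > 0.5)
```

`sum()` over a logical vector counts the TRUE values, which is often the quickest answer when only one or two counts are needed.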


On Wed, Dec 9, 2009 at 6:36 PM, BIGBEEF martin.beze...@gmail.com wrote:


 Hi - I'm having difficulty with frequencies in R. I have a table with a
 variable (column) called difference 600 observations (rows). I would like
 to know how many values are < -0.5 as well as how many are > 0.5. The rest
 are obviously in the middle.

 In SAS I could this immediately but am unable to do it in R.

 Thanks for your help.
 --
 View this message in context:
 http://n4.nabble.com/Counting-Frequencies-tp956556p956556.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] SAS datalines or cards statement equivalent in R?

2009-12-06 Thread Wensui Liu
Gary,
if i were you, i would use scan().
here is a piece of code.

# DO DATA INPUT IN R CONSOLE WITH SCAN()   #
#--#
# COMPARABLE SAS CODE: #
#--#
# data test;   #
#   input x1 x2 x3 y $ @@; #
# cards;   #
# 71 . 3 0 158 14 3 0  #
# 128 5 4 1#
# ;#
# run; #


test <- data.frame(scan(file = "",
what = list(x1 = 0, x2 = 0, x3 = 0, y = "")))
71 NA 3 0 158 14 3 0
128 5 4 1
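
When the data should live inside the script rather than be typed at the console, the `text` argument of `scan()` and `read.table()` may be a closer match to DATALINES. The sketch below reuses the values from this thread; the object names `test2` and `survey` are illustrative.

```r
# scan() with inline text: same layout as the SAS "@@" example above,
# several records per line, NA for the missing value
test2 <- as.data.frame(
  scan(text = "71 NA 3 0 158 14 3 0
               128 5 4 1",
       what = list(x1 = 0, x2 = 0, x3 = 0, y = ""),
       quiet = TRUE),
  stringsAsFactors = FALSE
)

# read.table() with inline text: one record per line, as in DATALINES.
# colClasses is given explicitly so that sex stays a character column.
survey <- read.table(text = "
 1  F  35 17  7 2 2
17  M  50 14  5 5 3
33  F  45  6  7 2 7
", col.names = c("id", "sex", "age", "inc", "r1", "r2", "r3"),
   colClasses = c("integer", "character", rep("integer", 5)))

print(test2)
print(survey)
```

Setting `colClasses` avoids surprises from automatic type conversion, for example a column holding only "F" and "T" being read as logical.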

On Sat, Dec 5, 2009 at 8:11 PM, Gary Miller mail2garymil...@gmail.com wrote:

 Hi R Users,

 Is there a equivalent command in R where I can read in raw data? For
 example
 I'm looking for equivalent R code for following SAS code:

 DATA survey;
   INPUT id sex $ age inc r1 r2 r3 ;
   DATALINES;
  1  F  35 17  7 2 2
 17  M  50 14  5 5 3
 33  F  45  6  7 2 7
 49  M  24 14  7 5 7
 65  F  52  9  4 7 7
 81  M  44 11  7 7 7
 2   F  34 17  6 5 3
 18  M  40 14  7 5 2
 34  F  47  6  6 5 6
 50  M  35 17  5 7 5
 ;

 Any help would be highly appreciated,
 Gary





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



[R] r package for Probabilistic neural networks?

2009-10-11 Thread Wensui Liu
I am wondering if there is an implementation of PNN by Specht in R.
thank you so much!

-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller



Re: [R] Read time series

2009-09-20 Thread Wensui Liu
zoo()

On Sun, Sep 20, 2009 at 12:24 PM, Alexis Maluendas
avmaluend...@gmail.com wrote:
 Hi R experts,

 How can I get a ts object from a data frame object which contains a daily
 time series in order to apply it time series functions?

 Thanks

 Aleto





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] running many different regressions

2009-09-20 Thread Wensui Liu
apologize,
there is a typo in the glm() :-)
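
For the question quoted below (regressing one variable separately on each of the others), here is a minimal sketch with `lapply()` and `reformulate()`. It is not the code from the earlier post, which is not reproduced in this message; it uses the built-in `mtcars` data with `mpg` standing in for "variable 1", and it leaves the workspace alone.

```r
response   <- "mpg"
predictors <- setdiff(names(mtcars), response)

# Fit one simple regression per predictor: mpg ~ cyl, mpg ~ disp, ...
fits <- lapply(predictors, function(p) {
  lm(reformulate(p, response = response), data = mtcars)
})
names(fits) <- predictors

# Collect the slope of each single-predictor model
slopes <- sapply(fits, function(f) coef(f)[[2]])
print(round(slopes, 3))
```

Each element of `fits` is an ordinary `lm` object, so `summary(fits[["wt"]])` and similar calls work as usual.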

On Sun, Sep 20, 2009 at 2:05 PM, Georg Ehret georgeh...@gmail.com wrote:
 Dear R community,
   I have a dataframe with say 100 different variables. I wish to regress
 variable 1 separately on every other variable (2-100) in a linear regression
 using lm. There must be an easy way to do this without loops, but I have
 difficulties figuring this out... Can you please help?
 Thank you and best regards, Georg.
 *
 Georg Ehret
 Johns Hopkins University
 Institute of Genetic Medicine





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] running many different regressions

2009-09-20 Thread Wensui Liu
well, i assume you understand what my code does.
please don't use if you don't know what you are using.

On Sun, Sep 20, 2009 at 2:44 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 On Sun, Sep 20, 2009 at 2:38 PM, Wensui Liu liuwen...@gmail.com wrote:
 I just quickly draft one with boston housing data. and it should be
 close to what you need.

 # REMOVE ALL OBJECTS
 rm...

 WARNING!!!

 Running the code in this post could wipe out your entire workspace

 Please do NOT post such code.




-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller



Re: [R] running many different regressions

2009-09-20 Thread Wensui Liu
should the chicken be blamed by people who are allergic to eggs?

On Sun, Sep 20, 2009 at 3:00 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 Not everyone carefully examines the code from r-help posts
 prior to pasting it in.  Posting code is very dangerous and
 should not be done.

 Quite the contrary they often try to understand the
 code by running it.

 Code like this should never be posted.



 On Sun, Sep 20, 2009 at 2:47 PM, Wensui Liu liuwen...@gmail.com wrote:
 well, i assume you understand what my code does.
 please don't use if you don't know what you are using.

 On Sun, Sep 20, 2009 at 2:44 PM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:
 On Sun, Sep 20, 2009 at 2:38 PM, Wensui Liu liuwen...@gmail.com wrote:
 I just quickly draft one with boston housing data. and it should be
 close to what you need.

 # REMOVE ALL OBJECTS
 rm...

 WARNING!!!

 Running the code in this post could wipe out your entire workspace

 Please do NOT post such code.




 --
 ==
 WenSui Liu
 Blog   : statcompute.spaces.live.com
 Tough Times Never Last. But Tough People Do.  - Robert Schuller
 ==





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] how to categorize continuous variable when useing regression

2009-09-16 Thread Wensui Liu
there are so many ways to categorize the continuous variable in practice,
such as knowledge-based, percentile-based, or model-based (i.e. regression
tree).
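
As an illustration of the percentile-based option only, the sketch below bins a simulated x at its quartiles with `cut()` and `quantile()` and then uses the resulting factor in a regression. The data and the names `x_cat` and `fit` are invented for the example; a knowledge-based split would simply replace the `breaks` vector with hand-picked cut points.

```r
set.seed(42)
# Simulated continuous predictor and response
x <- rnorm(200)
y <- 2 * x + rnorm(200)

# Percentile-based: cut points at the quartiles of x
breaks <- quantile(x, probs = seq(0, 1, by = 0.25))
x_cat  <- cut(x, breaks = breaks, include.lowest = TRUE)
print(table(x_cat))

# The binned variable enters the model as a factor
fit <- lm(y ~ x_cat)
print(coef(fit))
```

`include.lowest = TRUE` is needed so that the minimum of x is not dropped into NA.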

On Wed, Sep 16, 2009 at 9:41 PM, Manli Yan manliyanrh...@gmail.com wrote:

  Assume a dependent variable y (continuous) and an independent variable x
 (continuous). I am trying to categorize x into intervals such that those
 intervals have the most significantly different effects on y.
   Does anyone know which method I should apply? I know it will cause a loss
 of information, but can I really do that? Or which method would keep
 the loss minimal? All I want is just some key words, thanks in advance~





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] Best option for exporting data frame to SPSS?

2009-09-02 Thread Wensui Liu
if i were you, i will use sqlite, as mentioned in your email already.
spss should be able to read sqlite data easily through odbc.

On Tue, Sep 1, 2009 at 11:11 AM, Fredrik Karlsson dargo...@gmail.com wrote:

 Dear list,

 I am leaving my old position and now need to convert my R data frames
 into a format that can be used by an SPSS user replacing me, without
 running into conversion problems.
 The data set consists of strings in UTF8 encoding and values in double
 precision floats. The data set is not terribly large, but I had bit
 problems getting it into R due to the large number of unfortunate
 characters in the strings (', #,  and so on) so I was just wondering
 if there is any way to get the data into a SPSS friendly format (other
 than tab-separated files) so that a minimum of conversion is done in
 between the two systems.
 A data base file (SQLite) would be ideal, but unfortunately, I don't
 think the receiving end would be able to handle it, i.e. get the data
 into SPSS.

 Sorry for asking this on the list, but I have found lots of
 information about getting data safely _into_ R in the archive, but
 far less about exporting data out of R.

 Please give me your best tip.

 /Fredrik

  --
 Life is like a trumpet - if you don't put anything into it, you don't
 get anything out of it.





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] Best R text editors?

2009-08-30 Thread Wensui Liu
emacs + ess in windows is just as powerful as in linux.
emacs is the only programming editor i would ever need.

On Sun, Aug 30, 2009 at 11:37 AM, Rodrigo Aluizio r.alui...@gmail.com wrote:

 Well, on Linux = Emacs+ Ess
 On Windows = Tinn-R

 -
 MSc. Rodrigo Aluizio
 Centro de Estudos do Mar/UFPR
 Laboratório de Micropaleontologia





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] CHAID in R

2009-08-28 Thread Wensui Liu
Well, are you sure that party can implement CHAID?
I doubt that CHAID is implemented in any R package.

On Fri, Aug 28, 2009 at 7:50 AM, Arup arup.pramani...@gmail.com wrote:


 Hi, I am trying to run CHAID in R. I have installed the party package and
 am trying to use the function ctree() to carry out the analysis, but I am
 getting the following message:
 Error in terms.default(formula, data = data) :
 no terms component
 I have some Likert-scale variables: Overall satisfaction (dependent
 variable) and Product quality, Brand image, Warranty, etc. (independent
 variables). Can anyone tell me how to run CHAID in this case? What would
 the formula be? Thanks in advance.
 --
 View this message in context:
 http://www.nabble.com/CHAID-in-R-tp25188573p25188573.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] CHAID in R

2009-08-28 Thread Wensui Liu
i couldn't find it on CRAN.

On Fri, Aug 28, 2009 at 5:16 PM, Max Kuhn mxk...@gmail.com wrote:

  well, are you sure if party() can implement chaid?
  i doubt if chaid is being implemented in any R package.

 https://r-forge.r-project.org/projects/chaid/


 --

 Max




-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] good and bad ways to import fixed column data (rpy)

2009-08-16 Thread Wensui Liu
Gabor made a good point.
Here is an example I copied from my blog.

##############################################
# READ FIXED-WIDTH DATA FILE WITH read.fwf() #
# ------------------------------------------ #
# EQUIVALENT SAS CODE:                       #
# filename data 'E:\sas\fixed.txt';          #
# data test;                                 #
#   infile data truncover;                   #
#   input @1 city $ 1 - 22 @23 population;   #
# run;                                       #
##############################################

# OPEN A CONNECTION TO THE DATA FILE
data <- file(description = "e:\\sas\\fixed.txt", open = "r")

# widths = c(...)     == SPECIFIES COLUMN WIDTHS
# col.names = c(...)  == GIVES COLUMN NAMES
# colClasses = c(...) == DEFINES COLUMN CLASSES
test <- read.fwf(data, header = FALSE, widths = c(22, 10),
                 col.names = c("city", "population"),
                 colClasses = c("character", "numeric"))

close(data)
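
For anyone who wants to try this without the original file, here is a minimal self-contained sketch. The file contents, the temporary path, and the use of strip.white = TRUE are assumptions made for illustration only; they are not part of the blog example.

```r
# Write a tiny fixed-width file: city in columns 1-22, population in 23-32.
# The two records are made-up illustration data.
tmp <- tempfile(fileext = ".txt")
writeLines(c(sprintf("%-22s%10d", "Cincinnati", 333336L),
             sprintf("%-22s%10d", "New York", 8363710L)),
           tmp)

# Read it back with the same widths / names / classes as above.
# strip.white = TRUE drops the padding around the city names.
test <- read.fwf(tmp, widths = c(22, 10), header = FALSE,
                 col.names = c("city", "population"),
                 colClasses = c("character", "numeric"),
                 strip.white = TRUE)
unlink(tmp)
str(test)
```

Passing a file name instead of an open connection means there is nothing to close afterwards.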

On Sun, Aug 16, 2009 at 6:36 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
 Check out ?read.fwf

 On Sun, Aug 16, 2009 at 4:49 PM, Ross Boylan r...@biostat.ucsf.edu wrote:
 Recorded here so others may avoid my mistakes.

 I have a bunch of files containing fixed width data.  The R Data guide
 suggests that one pre-process them with a script if they are large.
 They were 50 MB and up, and I needed to process another file that gave
 the layout of the lines anyway.

 I tried rpy to not only preprocess but create the R data object in one
 go.  It seemed like a good idea; it wasn't.  The core operation was to
 build up a string for each line that looked like data.frame(var1=val1,
 var2=val2, [etc]) and then rbind this to the data.frame so far.  I did
 this with r(mycommand string). Almost all the values were numeric.

 This was incredibly slow, being unable to complete after running
 overnight.

 So, the lesson is, don't do that!

 I switched to preprocessing that created a csv file, and then read.csv
 from R.  This worked in under a minute.  The result had dimension 150913
 x 129.

 The good news in rpy was that I found objects persisted across calls to
 the r object.

 Exactly why this was so slow I don't know.  The two obvious suspects are
 the speed of rbind, which I think is pretty inefficient, and the overhead
 of crossing the python/R boundary.
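
The rbind suspect can be checked in plain R, without rpy. The sketch below is only an illustration with made-up columns (id, value) and an arbitrary n; it compares growing a data frame row by row against filling preallocated vectors and building the data frame once. Timings will differ by machine, so none are quoted here.

```r
n <- 1000

# Slow pattern: grow the data frame one row at a time with rbind().
grow_by_rbind <- function(n) {
  out <- data.frame()
  for (i in seq_len(n)) {
    out <- rbind(out, data.frame(id = i, value = i^2))
  }
  out
}

# Faster pattern: fill preallocated vectors, then build the data frame once.
build_once <- function(n) {
  id <- integer(n)
  value <- numeric(n)
  for (i in seq_len(n)) {
    id[i] <- i
    value[i] <- i^2
  }
  data.frame(id = id, value = value)
}

slow <- system.time(a <- grow_by_rbind(n))["elapsed"]
fast <- system.time(b <- build_once(n))["elapsed"]
c(rbind_each_row = unname(slow), build_once = unname(fast))
```

Each rbind() call copies everything accumulated so far, so the row-by-row version does work that grows roughly with the square of the number of rows.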

 This was on Debian Lenny:
 python-rpy1.0.3-2
 Python 2.5.2
 R 2.7.1

 rpy2 is not available in Lenny, though it is in development versions of
 Debian.

 Ross Boylan





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] Compare lm() to glm(family=poisson)

2009-07-31 Thread Wensui Liu
I don't understand how you can fit a Poisson model with the lm() function.
Otherwise, how could you compare lm() with glm(..., family = poisson)?
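
One way to put the two fits on a common footing, offered only as a sketch and not as the answer to the original question, is to compare their predictions on held-out data. The simulated data, the 400/100 split, and the rmse() helper below are all assumptions for illustration.

```r
set.seed(1)
n <- 500
d <- data.frame(x = runif(n))
d$y <- rpois(n, lambda = exp(0.5 + 1.5 * d$x))   # simulated counts

train <- d[1:400, ]
test  <- d[401:500, ]

# Same formula, two different models.
fit_lm  <- lm(y ~ x, data = train)
fit_glm <- glm(y ~ x, family = poisson, data = train)

# Root mean squared prediction error on the held-out rows.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
rmse_lm  <- rmse(test$y, predict(fit_lm,  newdata = test))
rmse_glm <- rmse(test$y, predict(fit_glm, newdata = test, type = "response"))
c(lm = rmse_lm, poisson = rmse_glm)
```

Note the type = "response" argument for the Poisson fit; without it predict() returns values on the log scale, which are not comparable with the lm() predictions.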

On Fri, Jul 31, 2009 at 7:41 PM, Mark Na mtb...@gmail.com wrote:
 Dear R-helpers,
 I would like to compare the fit of two models, one of which I fit using lm()
 and the other using glm(family=poisson). The latter doesn't provide
 r-squared, so I wonder how to go about comparing these
 models (they have the same formula).

 Thanks very much,

 Mark Na





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



[R] useR in cincy area?

2009-07-04 Thread Wensui Liu
Dear Folks,
Jim Holtman and I are wondering if there is a useR group in cincy area
(OH in USA) or not. If not, how many on this list in cincy area would
like to have a useR group in cincy area?
Thank you so much!



[R] batch submit

2009-06-19 Thread Wensui Liu
With Emacs + ESS, I can batch-submit SAS code using m-x submit sas.
I am wondering whether I can do the same for R.
-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller



Re: [R] Recursive partitioning algorithms in R vs. alia

2009-06-19 Thread Wensui Liu
In terms of richness of features and ability to handle large
data (which is normal in a bank), SAS EM should be on top of the others.
However, it is not cheap.
In terms of algorithms, the split procedure in SAS EM can do
CHAID/CART/C4.5, if I remember correctly.

On Fri, Jun 19, 2009 at 2:35 PM, Carlos J. Gil Bellosta
c...@datanalytics.com wrote:
 Dear R-helpers,

 I had a conversation with a guy working in a business intelligence
 department at a major Spanish bank. They rely on recursive partitioning
 methods to rank customers according to certain criteria.

 They use both SAS EM and Salford Systems' CART. I have used package
 rpart in the past, but I could not provide any kind of feature comparison
 or the like as I have no access to any installation of the first two
 proprietary products.

 Has anybody experience with them? Is there any public benchmark
 available? Is there any very good --although solely technical-- reason
 to pay hefty software licences? How would the algorithms implemented in
 rpart compare to those in SAS and/or CART?

 Best regards,

 Carlos J. Gil Bellosta
 http://www.datanalytics.com





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller



Re: [R] Recursive partitioning algorithms in R vs. alia

2009-06-19 Thread Wensui Liu
Well, how difficult is it to code random forest with a SAS macro + proc split?
If you lack SAS programming skill, then you are correct that
you have to wait for 8 years :-)
I don't know how much SAS experience you have. As far as I know, both
bagging and boosting have been implemented in SAS EM for a while,
together with other cutting-edge modeling tools such as SVM / nnet.


On Fri, Jun 19, 2009 at 4:18 PM, Tobias Verbeke
tobias.verb...@openanalytics.be wrote:
 Wensui Liu wrote:

 in terms of the richness of features and ability to handle large
 data(which is normal in bank), SAS EM should be on top of others.

 Should be ? That is not at all my experience.
 SAS EM is very much lagging behind current
 research. You will find variants of random forests
 in R that will not be in SAS for the next 8 years,
 to give just one example.

 however, it is not cheap.
 in terms of algorithm, split procedure in sas em can do
 chaid/cart/c4.5, if i remember correctly.

 These are techniques of the 80s and 90s
 (which proves my point). CART is in rpart and
 an implementation of C4.5 can be accessed
 through RWeka. For the oldest one (CHAID, 1980),
 there might be an implementation soon:

 http://r-forge.r-project.org/projects/chaid/

 but again there have been quite some improvements
 in the last decade as well:

 http://cran.r-project.org/web/views/MachineLearning.html

 HTH,
 Tobias










-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller



Re: [R] I'm offering $300 for someone who know R-programming to do the assignments for me.

2009-05-09 Thread Wensui Liu
My guess is that he might be asking for production code but just didn't
want to tell the truth here.
In some software forums, this kind of thing happens all the time :-)

On Fri, May 8, 2009 at 12:36 PM, Wacek Kusnierczyk
waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
 Simon Pickett wrote:
 I bet at least a few people offered their services! It might be an
 undercover sting operation to weed out the unethical amongst us :-)


 ... written by some of the r core developers?

 vQ





-- 
==
WenSui Liu
Acquisition Risk, Chase
Blog   : statcompute.spaces.live.com

Tough Times Never Last. But Tough People Do.  - Robert Schuller



[R] is there a way to read a specific column from a txt file

2009-05-03 Thread Wensui Liu
Sometimes, it is too costly to read the whole data file into R.
I looked for a solution in scan() and readLines(), but they don't seem to work.
Thank you so much!
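
A possible approach, sketched here with a made-up three-column tab-separated file: read.table() skips any column whose entry in colClasses is the string "NULL". The file still has to be scanned, but the skipped columns are never stored.

```r
# Build a small tab-separated file to experiment with (illustration data).
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:3,
                       name = c("a", "b", "c"),
                       score = c(0.5, 1.5, 2.5)),
            tmp, sep = "\t", row.names = FALSE, quote = FALSE)

# "NULL" (as a string) tells read.table() to drop that column entirely;
# here only the third column is kept.
score_only <- read.table(tmp, header = TRUE, sep = "\t",
                         colClasses = c("NULL", "NULL", "numeric"))
unlink(tmp)
score_only
```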
-- 
==
WenSui Liu
Acquisition Risk, Chase
Blog   : statcompute.spaces.live.com

Tough Times Never Last. But Tough People Do.  - Robert Schuller



[R] R on Nokia N810

2009-04-24 Thread Wensui Liu
Just out of curiosity.
Has anyone had R installed on N810 successfully?
Have a nice weekend.

-- 
==
WenSui Liu
Acquisition Risk, Chase
Blog   : statcompute.spaces.live.com

Tough Times Never Last. But Tough People Do.  - Robert Schuller



Re: [R] Pruning trees in a Random Forest

2009-03-20 Thread Wensui Liu
I don't think it is necessary to prune trees in RF, per Breiman's paper.

On 3/20/09, Liaw, Andy andy_l...@merck.com wrote:
 The way the trees are structured in randomForest, there's no way to stop
 tree growth by depth (what you called level).

 (If anyone has ideas, I'm all ears.)

 Andy

 From: Anirudh Kondaveeti

 Hi all!

 The randomForest in R enables us to prune the trees using the nodesize
 feature, where we can stop splitting a node if it contains fewer than the
 specified number of records/entities at that node.

 However, is there a way to stop the tree growing after a specified number
 of levels? To be more clear on what I mean by a level: Level 0 is the
 parent node, Level 1 has 2 daughter nodes, Level 2 has 4 daughter nodes,
 Level 3 has 8 daughter nodes, etc.

 Thanks in advance!

 Anirudh Kondaveeti
 




-- 
==
WenSui Liu
Acquisition Risk, Chase
Blog   : statcompute.spaces.live.com

Tough Times Never Last. But Tough People Do.  - Robert Schuller



Re: [R] portable R editor

2009-03-02 Thread Wensui Liu
I feel emacs is portable enough for me.

On 3/2/09, Werner Wernersen pensterfuz...@yahoo.de wrote:

 Hi,

 I have been dreaming about a complete R environment on my USB stick for a
 long time. Now I finally want to realize it but what I am missing is a good,
 portable editor for R which has tabs and syntax highlighting, can execute
 code, has bookmarks and a little project file management facility pretty
 much like Tinn-R has those. I like Tinn-R but it seems like there is only a
 very old version of Tinn-R which works standalone.

 Can anyone recommend an adequate editor?

 Many thanks and all the best,
   Werner








-- 
===
WenSui Liu
Acquisition Risk, Chase
Blog   : statcompute.spaces.live.com

I can calculate the motion of heavenly bodies, but not the madness of people.
--  Isaac Newton
===



Re: [R] Inefficiency of SAS Programming

2009-02-26 Thread Wensui Liu
Frank,
I couldn't locate the program you mentioned. Do you mind being more
specific? Could you please point me to the file? I am just curious.
Thanks.

On Thu, Feb 26, 2009 at 5:57 PM, Frank E Harrell Jr
f.harr...@vanderbilt.edu wrote:
 If anyone wants to see a prime example of how inefficient it is to program
 in SAS, take a look at the SAS programs provided by the US Agency for
 Healthcare Research and Quality for risk adjusting and reporting for
 hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm .
  The PSSASP3.SAS program is a prime example.  Look at how you do a vector
 product in the SAS macro language to evaluate predictions from a logistic
 regression model.  I estimate that using R would easily cut the programming
 time of this set of programs by a factor of 4.
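
For comparison, the R side of that claim can be sketched in a few lines: evaluating logistic-regression predictions is a matrix product followed by the inverse logit. The coefficients and covariate values below are hypothetical and have nothing to do with the AHRQ models.

```r
# Hypothetical coefficients: intercept plus two covariates.
beta <- c(-2.0, 0.8, 0.03)

# Design matrix with an intercept column; three made-up patients.
X <- cbind(1, c(0, 1, 1), c(40, 55, 70))

# Linear predictor via a matrix product, then the inverse logit.
p <- plogis(drop(X %*% beta))
p
```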

 Frank
 --
 Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University





-- 
===
WenSui Liu
Acquisition Risk, Chase
Blog   : statcompute.spaces.live.com

I can calculate the motion of heavenly bodies, but not the madness of people.
--  Isaac Newton
===



Re: [R] Problems in Recommending R

2009-02-12 Thread Wensui Liu
My personal feeling about the R website is that it is as good as it should be.
I don't have any problem navigating around and know exactly where I
can find the thing that I need.
In contrast, the knime.org website looks too fancy and is all about marketing.
I don't think it is necessary for the R team to waste limited resources on
something unnecessary. Plus, the R website is not bad at all, even
compared with those of other open source languages.

On Sun, Feb 1, 2009 at 9:52 PM, Ajay ohri ohri2...@gmail.com wrote:
 Dear List,
 One persistent piece of feedback I am getting from people who are newly
 introduced to R (especially in this cost-cutting recession) is:

 1) The website looks a bit old.

 While the current website does have a lot of hard work behind it, shouldn't
 a world-class statistics package have a better website?

 You can check out www.knime.org, which is open source software, free,
 and supports R, and notice the change in perception.

 Best Regards,

 Ajay Ohri

 www.decisionstats.com





-- 
===
WenSui Liu
Acquisition Risk, Chase
Blog   : statcompute.spaces.live.com

I can calculate the motion of heavenly bodies, but not the madness of people.
--  Isaac Newton



[R] [OT] propensity score implementation

2008-11-08 Thread Wensui Liu
Dear All,

My question is more a statistical question than an R question. The reason I
am posting here is that there are lots of excellent statisticians on this
list, who can always give me good advice.

Per my understanding, the purpose of the propensity score is to reduce the
bias while estimating the treatment effect, and its implementation is a
2-stage model.

1) First of all, we assume that T = 1 if an individual belongs to the
treatment group and T = 0 otherwise. We further assume that X is a covariate
matrix to explain the assignment of treatment. Then the propensity score
should be the probability of treatment exposure T = 1 and can be formulated
as

PPscore = Prob(T=1|X) = exp(A * X) / [1 + exp(A * X)] in the range between 0
and 1.

2) At the second stage, let Y = 1 / 0 be a binary outcome variable and Z the
covariate matrix to explain the outcome. In order to balance the probability
of an individual being assigned to the treatment group such that Prob(Y = 1)
_|_ Prob(T = 1|X), we should model the outcome as

Prob(Y = 1|Z) = exp(B * Z) / [1 + exp(B * Z)] weighting or matching by
Prob(T=1|X)

The above is just my general understanding of the propensity score. However,
I was criticized that my understanding is wrong and was also told that the
response variable should be Y instead of T in the propensity model at the
1st stage. I am very confused and would like to have the opinion of experts
like you guys.
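
The two stages described above can be sketched in R as follows. This is only an illustration on simulated data; the variable names (x, treat, y), the true coefficients, and the choice of inverse-probability weighting in stage 2 are assumptions, and matching on the score would be an equally valid second stage. In the standard formulation the stage-1 response is the treatment indicator T, as in the description above.

```r
set.seed(42)
n <- 2000
x <- rnorm(n)
treat <- rbinom(n, 1, plogis(0.8 * x))                    # assignment depends on x
y <- rbinom(n, 1, plogis(-0.5 + 0.7 * treat + 1.0 * x))   # outcome depends on both
d <- data.frame(x, treat, y)

# Stage 1: propensity model. The response is the treatment indicator.
ps_fit <- glm(treat ~ x, family = binomial, data = d)
d$ps <- fitted(ps_fit)                                    # Prob(T = 1 | X)

# Stage 2: outcome model, weighted by the inverse probability of the
# treatment actually received. quasibinomial avoids the non-integer
# weights warning that family = binomial would give.
d$w <- ifelse(d$treat == 1, 1 / d$ps, 1 / (1 - d$ps))
out_fit <- glm(y ~ treat, family = quasibinomial, data = d, weights = w)
coef(out_fit)
```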

Any insight will be appreciated.

Have a nice weekend!

wensui



Re: [R] Bias in sample - Logistic Regression

2008-10-01 Thread Wensui Liu
Hi, Shiva,

The idea of reject inference is very simple. Let's assume a credit card
environment. There are 100 applicants, out of which 50 will be approved and
booked in. Therefore, we can only observe the adverse behavior, such as
default and delinquency, of 50 booked accounts. Again, let's assume out of
50 booked cards, 5 are bad(default / delinquency). A normal thought is to
build a model to cherry pick bad guys and then apply the same model to all
applicants.

However, we can only observe the behavior of the applicants booked, which
is 50, but not all applicants, which is 100. Therefore, the model result
looks better than what it is supposed to be. This is the so-called 'sample
bias'. The same thing can happen in healthcare or direct marketing as well.

Luckily enough, many people have done some excellent work on this problem.
Please do some reading of Heckman. Greene at NYU has a paper in this area as
well. And I believe there is also an implementation in R. If you use SAS
(widely used in industry), take a look at proc qlim.
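
The bias itself (not the Heckman correction) can be illustrated with a small simulation. Everything below is assumed for illustration: x is a score available to the modeler, u is information that drives approval but is not in the modeling data, and the approval rule and coefficients are arbitrary.

```r
set.seed(7)
n <- 10000
x <- rnorm(n)      # observed score
u <- rnorm(n)      # drives approval, but is not in the modeling data
bad <- rbinom(n, 1, plogis(-2 - 1.0 * x - 1.0 * u))   # default / delinquency
booked <- (x + u) > 0                                 # about half are approved

# Bad rate among all applicants versus among booked accounts only.
rate_all    <- mean(bad)
rate_booked <- mean(bad[booked])
c(all_applicants = rate_all, booked_only = rate_booked)
```

Because approval is related to the outcome through u, the booked accounts look better than the applicant population, which is the sense in which a model built on booked accounts alone "looks better than what it is supposed to be".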

HTH.

-- 
===
WenSui Liu
Acquisition Risk, Chase
Email : [EMAIL PROTECTED]
Blog   : statcompute.spaces.live.com
===



[R] how to keep up with R?

2008-09-18 Thread Wensui Liu
Dear Listers,

I've been a big fan of R since graduate school. After working in
industry for years, I haven't had many opportunities to use R and am mainly
using SAS. However, I am still forcing myself really hard to stay close to R
by reading R-help and books and writing R code by myself for fun. But by and
by, I have started to realize that I have a hard time keeping up with R and
am afraid that I will totally forget how to program in R.

I really like it and am very unwilling to give it up. Is there any idea how
I might keep in touch with R without using it at work on a daily basis? I
would really appreciate it.



Re: [R] how to keep up with R?

2008-09-18 Thread Wensui Liu
First of all, thanks so much for your insight.

Regardless of your bike skills, it is very inappropriate to compare riding
a bike with using R. In addition, simply knowing R is very different from
using it skillfully in a production environment.


On Fri, Sep 19, 2008 at 1:24 AM, Rolf Turner [EMAIL PROTECTED] wrote:


 On 19/09/2008, at 5:01 PM, Wensui Liu wrote:

  Dear Listers,

 I've been a big fan of R since graduate school. After working in the
 industry for years, I haven't had many opportunities to use R and am
 mainly
 using SAS.


My most extreme sympathies and condolences!

  However, I am still forcing myself really hard to stay close to R


You have to *force* yourself???

  by reading R-help and books and writing R code by myself for fun. But by
 and
 by, I start realizing I have hard time to keep up with R and am afraid
 that
 I would totally forget how to program in R.

 I really like it and am very unwilling to give it up. Is there any idea
 how
 I might keep touch with R without using it in work on daily basis? I
 really
 appreciate it.


 To me, using R is like riding a bicycle.  Once you learn, you never
 forget!

 Actually that comparison is inappropriate in my case; such are my
 bicycling skills that I am much more likely to forget how to ride a
 bicycle than I am to forget how to use R.

 Of course one forgets *details*.  But those are just details.  And
 help.search() + RSiteSearch() will almost always recover those details
 for you.  If they don't, just ask R-help(), perhaps after putting on
 your asbestos suit.

cheers,

Rolf Turner





-- 
===
WenSui Liu
Acquisition Risk, Chase
Email : [EMAIL PROTECTED]
Blog   : statcompute.spaces.live.com
===



[R] any package to do generalized linear mixed model?

2008-09-14 Thread Wensui Liu
I checked the glmmML package. However, it can only handle the binomial and
Poisson distributions. How about others such as gamma or negative binomial?
Thank you so much!
wensui
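
One option for the gamma case, sketched here under assumptions: glmmPQL() in the MASS package accepts any glm family, so a Gamma mixed model with a random intercept can be fitted as below. The simulated data and variable names (y, x, g) are made up, and PQL is an approximate method, so this is a starting point and not a recommendation over other packages.

```r
library(MASS)   # provides glmmPQL(), which fits via nlme

set.seed(3)
g <- factor(rep(1:20, each = 10))              # 20 groups of 10 observations
b <- rnorm(20, sd = 0.3)[as.integer(g)]        # random intercept per group
x <- runif(200)
mu <- exp(1 + 0.5 * x + b)                     # log link
y <- rgamma(200, shape = 2, rate = 2 / mu)     # gamma response with mean mu
d <- data.frame(y, x, g)

fit <- glmmPQL(y ~ x, random = ~ 1 | g, family = Gamma(link = "log"),
               data = d, verbose = FALSE)
tt <- summary(fit)$tTable                      # fixed-effect estimates
tt
```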



Re: [R] I don't know how to run a r-code written in emacs

2008-09-06 Thread Wensui Liu
did you install ess for emacs?

On Sat, Sep 6, 2008 at 11:52 AM, Luisa [EMAIL PROTECTED] wrote:

 Hi,
 I just installed R. I work in Ubuntu and I have no idea how to
 run R code written in Emacs
 from the shell.
 I am in a shell, and obviously I can run simple commands there.
 Must I compile the program? If yes, how do I do that?
 What is the file extension?

 I really appreciate your help

 --
 View this message in context: 
 http://www.nabble.com/I-don%27t-know-how-to-run-a-r-code-written-in-emacs-tp19348030p19348030.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
===
WenSui Liu
Acquisition Risk, Chase
Email : [EMAIL PROTECTED]
Blog : statcompute.spaces.live.com



Re: [R] Simulating Gaussian Mixture Models

2008-06-22 Thread Wensui Liu
Hi, Peng,
I had a piece of SAS code for 2-class gaussian mixture from my blog.
You might convert it to R code.

2-Class Gaussian Mixture Model in SAS
data d1;
  do i = 1 to 100;
    x = ranuni(1);
    e = rannor(1);
    y = 5 * x + e;
    output;
  end;
run;

data d2;
  do i = 1 to 100;
    x = ranuni(2);
    e = rannor(2);
    y = 15 + 10 * x - 5 * x ** 2 + e;
    output;
  end;
run;

data data;
  set d1 d2;
run;

proc nlmixed data = data tech = quanew maxiter = 1000;
  parms b10 = 0 b11 = 5 b12 = 0 b20 = 15 b21 = 10 b22 = -5
        prior = 0.1 to 0.9 by 0.01 sigma = 1;

  mu1 = b10 + b11 * x + b12 * x * x;
  mu2 = b20 + b21 * x + b22 * x * x;
  pi = constant('pi');

  P1 = 1 / (((2 * pi) ** 0.5) * sigma) * exp(-0.5 * ((y - mu1) / sigma) ** 2);
  P2 = 1 / (((2 * pi) ** 0.5) * sigma) * exp(-0.5 * ((y - mu2) / sigma) ** 2);

  LH = P1 * prior + P2 * (1 - prior);
  LL = log(LH);

  model y ~ general(LL);
run;

/*
                         Parameter Estimates

                          Standard
 Parameter    Estimate       Error     DF   t Value   Pr > |t|   Alpha
 b10           -0.1744      0.3450    200     -0.51     0.6137    0.05
 b11            5.3426      1.5040    200      3.55     0.0005    0.05
 b12          -0.06454      1.4334    200     -0.05     0.9641    0.05
 b20           15.3652      0.3099    200     49.57     <.0001    0.05
 b21            9.6297      1.4970    200      6.43     <.0001    0.05
 b22           -5.4795      1.4776    200     -3.71     0.0003    0.05
 prior          0.5000     0.03536    200     14.14     <.0001    0.05
 sigma          1.0049     0.05025    200     20.00     <.0001    0.05
*/
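
A rough R translation of the SAS steps above, as a sketch: the same two regression components are simulated and the mixture likelihood is maximized with optim(). The random numbers differ from SAS, so the estimates will not match the table exactly. The logit/log reparameterization of prior and sigma and the log-sum-exp form of the likelihood are choices made here for numerical stability, not part of the SAS code.

```r
set.seed(1)
x1 <- runif(100); y1 <- 5 * x1 + rnorm(100)
x2 <- runif(100); y2 <- 15 + 10 * x2 - 5 * x2^2 + rnorm(100)
d <- data.frame(x = c(x1, x2), y = c(y1, y2))

# Negative log-likelihood of the two-component mixture of regressions.
# par = (b10, b11, b12, b20, b21, b22, logit(prior), log(sigma))
negll <- function(par, d) {
  mu1 <- par[1] + par[2] * d$x + par[3] * d$x^2
  mu2 <- par[4] + par[5] * d$x + par[6] * d$x^2
  sigma <- exp(par[8])
  # log(prior * density1) and log((1 - prior) * density2)
  l1 <- plogis(par[7], log.p = TRUE)  + dnorm(d$y, mu1, sigma, log = TRUE)
  l2 <- plogis(-par[7], log.p = TRUE) + dnorm(d$y, mu2, sigma, log = TRUE)
  m <- pmax(l1, l2)                      # log-sum-exp to avoid underflow
  -sum(m + log(exp(l1 - m) + exp(l2 - m)))
}

# Same starting values as the parms statement (prior = 0.5, sigma = 1).
start <- c(0, 5, 0, 15, 10, -5, 0, 0)
fit <- optim(start, negll, d = d, method = "BFGS",
             control = list(maxit = 1000))

est <- setNames(c(fit$par[1:6], plogis(fit$par[7]), exp(fit$par[8])),
                c("b10", "b11", "b12", "b20", "b21", "b22", "prior", "sigma"))
round(est, 4)
```

To simulate from a plain Gaussian mixture without covariates, the same idea reduces to drawing a component label with rbinom() and then calling rnorm() with that component's mean.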
On 6/22/08, Peng Jiang [EMAIL PROTECTED] wrote:


  Hi,
   Is there any package that I can use to simulate a Gaussian mixture model,
 a mixture modeling method that is widely used in statistical learning
 theory?
   I know about mclust; however, I think it is a little different from my
 problem.
  Thanks very much.

   regards.








  --
  Peng Jiang
  江鹏
  Ph.D. Candidate

  Antai College of Economics & Management
  安泰经济管理学院
  Department of Mathematics
  数学系
  Shanghai Jiaotong University (Minhang Campus)
  800 Dongchuan Road
  200240 Shanghai
  P. R. China




-- 
===
WenSui Liu
Acquisition Risk, Chase
Email : [EMAIL PROTECTED]
Blog   : statcompute.spaces.live.com
===


Re: [R] Pros and Cons of R

2008-05-22 Thread Wensui Liu
Agreed.
I think R is more like a standard programming language than SAS is. However,
SAS programmers might not find R intuitive to pick up.

On Thu, May 22, 2008 at 12:14 PM, Kevin E. Thorpe
[EMAIL PROTECTED] wrote:
 Monica Pisica wrote:

 Cons:

 - R has a very steep learning curve.

 I don't think the learning curve is any steeper than SAS programming,
 it is just a different kind of curve.


 --
 Kevin E. Thorpe
 Biostatistician/Trialist, Knowledge Translation Program
 Assistant Professor, Department of Public Health Sciences
 Faculty of Medicine, University of Toronto
 email: [EMAIL PROTECTED]  Tel: 416.864.5776  Fax: 416.864.6057





-- 
===
WenSui Liu
Call for Donations for the China Earthquake!
Blog : statcompute.spaces.live.com



[R] [OT] xemacs on windows vista

2008-05-12 Thread Wensui Liu
Hi, dear all,
I have just switched to Vista (Ultimate) and have heard there are some
problems with installing XEmacs on Vista. Is there any insight or
experience that you could share? I would really appreciate any input.
Thank you so much!



Re: [R] Importing data

2008-05-07 Thread Wensui Liu
Importing Stata data should be straightforward. Take a look at the foreign package.
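
As a minimal sketch of that suggestion: read.dta() and write.dta() in the foreign package read and write Stata files. The data frame and the temporary file names below are invented for the example, and the Excel part assumes the sheet has first been saved as CSV from Excel, which is only one of several possible routes.

```r
library(foreign)

# A small made-up data set, written out so the example is self-contained
df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

# Stata: write a .dta file, then read it back with read.dta()
dta_file <- tempfile(fileext = ".dta")
write.dta(df, dta_file)
stata_data <- read.dta(dta_file)

# Excel: assuming the sheet was saved as CSV, read.csv() is enough
csv_file <- tempfile(fileext = ".csv")
write.csv(df, csv_file, row.names = FALSE)
excel_data <- read.csv(csv_file)

str(stata_data)
str(excel_data)
```

Note that read.dta() only understands the older Stata file formats, so files saved by a recent Stata release may need to be re-saved in an older format first.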


On Wed, May 7, 2008 at 9:30 AM, Yemi Oyeyemi [EMAIL PROTECTED] wrote:
 Hi everyone, I'm having problems importing data from Stata and Excel.
 Please help me out.
   Thanks






-- 
===
WenSui Liu
ChoicePoint Precision Marketing
Phone: 678-893-9457
Email : [EMAIL PROTECTED]
Blog : statcompute.spaces.live.com



Re: [R] Why R is 200 times slower than Matlab ?

2008-04-30 Thread Wensui Liu
Hi, ZD,
Your comment about speed is too general. Here is a benchmark
comparison among several languages; I hope it helps.
http://www.sciviews.org/benchmark/index.html

On Wed, Apr 30, 2008 at 4:15 PM, Zhandong Liu
[EMAIL PROTECTED] wrote:
 I am switching from Matlab to R, but I found that R is 200 times slower than
  Matlab.

  Since I am a newbie to R, I must be missing some important programming tips.

  Please help me out on this.

  Here is the function:
  ## make the full pair-wise permutation of a vector
  ## input_fc = c(1,2,3)
  ## output_fc =
  ##   1 1 1 2 2 2 3 3 3
  ##   1 2 3 1 2 3 1 2 3

  grw_permute = function(input_fc){
    fc_vector = input_fc
    index = 1
    k = length(fc_vector)
    ## 2 x k^2 result matrix, filled column by column
    fc_matrix = matrix(0,2,k^2)
    for(i in 1:k){
      for(j in 1:k){
        fc_matrix[index]  =  fc_vector[i]
        fc_matrix[index+1]  =  fc_vector[j]
        index = index+2
      }
    }
    return(fc_matrix)
  }

  For an input vector of size 300, it took R 2.17 seconds to run.

  But the same code in Matlab only needs 0.01 seconds to run.

  Am I missing something in R? Is there a way to optimize this?
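
  One common way to speed up a double loop like the one in grw_permute is to let rep() build both rows at once. The sketch below is a suggested alternative, not code from this thread, and the name grw_permute_vec is invented for the example.

```r
# Vectorized version: row 1 repeats each element k times in a row,
# row 2 cycles through the whole vector k times.
grw_permute_vec <- function(input_fc) {
  k <- length(input_fc)
  rbind(rep(input_fc, each = k),
        rep(input_fc, times = k))
}

grw_permute_vec(c(1, 2, 3))

# Timing for an input of size 300, as in the question
system.time(grw_permute_vec(1:300))
```

  The result has the same 2 x k^2 layout as the loop version, but the work is done inside rep() and rbind() rather than in k^2 interpreted assignments.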

  Thanks

  --
  Zhandong Liu

  Genomics and Computational Biology
  University of Pennsylvania

  616 BRB II/III, 421 Curie Boulevard
  University of Pennsylvania School of Medicine
  Philadelphia, PA 19104-6160





-- 
===
WenSui Liu
ChoicePoint Precision Marketing
Phone: 678-893-9457
Email : [EMAIL PROTECTED]
Blog : statcompute.spaces.live.com



Re: [R] How to ask for *fixed* number of distributions under parameterized Gaussian mixture model.

2008-04-03 Thread Wensui Liu
Hi, Chen,
I don't know how you are doing it.
However, to my limited knowledge, it is easy with the flexmix package.
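
To show what fixing the number of components in advance means, here is a small base R sketch of EM for a one-dimensional Gaussian mixture in which k is an argument chosen by the user and never changed by the algorithm. It does not use flexmix or mclust; the function name em_gmm, the simulated data and the starting values are all assumptions made for the illustration. In the packages themselves the analogous choice is, as far as I know, the k argument of flexmix() or the G argument of Mclust().

```r
# Simulated data from 3 known components (made up for the example)
set.seed(42)
y <- c(rnorm(100, mean = 0), rnorm(100, mean = 5), rnorm(100, mean = 10))

# EM for a 1-D Gaussian mixture with a fixed number of components k
em_gmm <- function(y, k, iters = 200) {
  # Crude starting values: means spread over the quantiles of y
  mu    <- as.numeric(quantile(y, probs = (seq_len(k) - 0.5) / k))
  sigma <- rep(sd(y), k)
  w     <- rep(1 / k, k)
  for (it in seq_len(iters)) {
    # E-step: responsibility of each component for each observation
    dens <- sapply(seq_len(k), function(j) w[j] * dnorm(y, mu[j], sigma[j]))
    resp <- dens / rowSums(dens)
    # M-step: update weights, means and standard deviations
    nk    <- colSums(resp)
    w     <- nk / length(y)
    mu    <- colSums(resp * y) / nk
    sigma <- sqrt(colSums(resp * outer(y, mu, "-")^2) / nk)
  }
  list(weights = w, means = mu, sds = sigma)
}

fit <- em_gmm(y, k = 3)
print(fit)
```

Because k only sets the size of the parameter vectors, the fit always reports exactly three components here, whatever the data look like.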



On Thu, Apr 3, 2008 at 6:26 AM, Hung-Hsuan Chen (Sean)
[EMAIL PROTECTED] wrote:
 Dear R users:
  I am wondering how to ask for a *fixed* number of distributions under a
  parameterized Gaussian mixture model.

  I know that em() and some related functions can estimate the
  parameterized Gaussian mixture model. However, there seems to be no
  parameter to set the number of distributions to be mixed (if we know
  the value in advance).

  That is, assume I know the (mixed) data is from 3 different
  distributions. The output, however, may indicate that the number of
  distributions that form the data is 4. How can I set the number of
  distributions to 3 in advance?

  Thanks a lot for your help.





-- 
===
WenSui Liu
ChoicePoint Precision Marketing
Phone: 678-893-9457
Email : [EMAIL PROTECTED]
Blog : statcompute.spaces.live.com



Re: [R] gam - Extraction of nonparametric component

2008-03-10 Thread Wensui Liu
I can't remember exactly, but it is something like:
components <- predict(gam.object, type = "terms")
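
A fuller sketch of that idea, assuming the gam() in question is the one from the mgcv package (the gam package also has a predict method with type = "terms", but its output may be laid out differently). The simulated data and the object names d, fit and f1_hat are invented for the example.

```r
library(mgcv)

# Simulated data with two smooth effects and one linear effect
set.seed(1)
n <- 200
d <- data.frame(x1 = runif(n), x2 = runif(n), X = rnorm(n))
d$y <- sin(2 * pi * d$x1) + d$x2^2 + 0.5 * d$X + rnorm(n, sd = 0.2)

fit <- gam(y ~ X + s(x1) + s(x2), data = d)

# One column per model term; each smooth is centered around zero
components <- predict(fit, type = "terms")
colnames(components)

# Fitted values of the f1(x1) component only
f1_hat <- components[, "s(x1)"]
head(f1_hat)
```

Calling plot(fit, select = 1) would draw the same estimated smooth for x1 if a picture is all that is needed.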

On Mon, Mar 10, 2008 at 1:36 PM, Michael A. Milligan [EMAIL PROTECTED] wrote:
 Hello,

  I am estimating a semiparametric partial linear model
  using gam of the form

  y=f1(x1)+f2(x2)+beta*X

  where y is the dependent variable, f1(x1) and f2(x2)
  are nonparametric functions of the independent
  variables x1 and x2, respectively, and beta and X are
  vectors of coefficients and independent variables.
  The R code is

  EqGamAS <- gam(y ~ X+s(x1)+s(x2))

  My question is, how can I extract the fitted values
  of, say, f1(x1)?  Of course fitted(EqGamAS) returns
  the fitted values of the entire regression function,
  but is there a way to view only one component of the
  nonparametric part of the estimation?  I have looked
  through documentation and help archives and have not
  found the answer.  I appreciate very much any help
  anyone can give me.

  Michael Milligan
  Doctoral Candidate
  University of New Mexico


   
 




-- 
===
WenSui Liu
ChoicePoint Precision Marketing
Phone: 678-893-9457
Email : [EMAIL PROTECTED]
Blog   : statcompute.spaces.live.com



Re: [R] emacs and R

2008-03-02 Thread Wensui Liu
Hi, John,
You don't have to switch to Linux in order to use ESS + Emacs with R.
Just follow the installation instructions for ESS; it will take you 5
minutes at most. I also feel that XEmacs is friendlier than
GNU Emacs for Windows users.

On Sun, Mar 2, 2008 at 3:40 PM, John Sorkin [EMAIL PROTECTED] wrote:
 At the suggestion of many people, I have installed emacs on my linux (Fedora 
 8.0) computer with the intention of using emacs as window interface to R 
 (2.6.0). I have gone though the emacs tutorial and don't see any information 
 about how I should use emacs to run R. Can anyone suggest a document that I 
 might read? In the past I have used R on a Windows XP system and used the 
 built-in windowing interface.
  Thank you,
  John

  John Sorkin M.D., Ph.D.
  Chief, Biostatistics and Informatics
  Baltimore VA Medical Center GRECC,
  University of Maryland School of Medicine Claude D. Pepper OAIC,
  University of Maryland Clinical Nutrition Research Unit, and
  Baltimore VA Center Stroke of Excellence

  University of Maryland School of Medicine
  Division of Gerontology
  Baltimore VA Medical Center
  10 North Greene Street
  GRECC (BT/18/GR)
  Baltimore, MD 21201-1524

  (Phone) 410-605-7119
  (Fax) 410-605-7913 (Please call phone number above prior to faxing)
  [EMAIL PROTECTED]




-- 
===
WenSui Liu
ChoicePoint Precision Marketing
Phone: 678-893-9457
Email : [EMAIL PROTECTED]
Blog   : statcompute.spaces.live.com



Re: [R] R data Export to Excel

2008-03-02 Thread Wensui Liu
Hi,
Did you try write.xls in the xlsReadWrite package?
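
If installing an extra package is not an option, a base R route for the export problem in the quoted message is write.csv(), which writes comma-separated fields that Excel splits into columns. This is a sketch with a tiny made-up data frame standing in for the poster's data; the column names s1 to s3 and the temporary file are assumptions for the example.

```r
# Stand-in for the poster's data: a probe ID column plus numeric columns
data <- data.frame(ProbeID = c("224588_at", "224589_at"),
                   s1 = c(1.2, 3.4), s2 = c(5.6, 7.8), s3 = c(9.1, 2.3))
y <- 2:4  # the numeric columns

# Variance across the numeric columns for each probe set
out <- data.frame(ProbeID  = data[[1]],
                  Variance = apply(data[, y], 1, var))

# write.csv() uses "," as separator, so Excel sees three columns:
# row number, ProbeID and Variance
csv_file <- tempfile(fileext = ".csv")
write.csv(out, file = csv_file, row.names = TRUE)

readLines(csv_file)
```

write.table() defaults to a space as separator, which is why the original output landed in a single Excel column; passing sep = "," to write.table() would be another way to get the same file.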


On Sun, Mar 2, 2008 at 9:59 PM, Keizer_71 [EMAIL PROTECTED] wrote:

  Here is my R Code

  x<-1:2
  y<-2:141
  data.matrix<-data.matrix(data[,y])#create data.matrix
  variableprobe<-apply(data.matrix[x,],1,var)
  variableprobe #output variance across probesets
  hist(variableprobe) #displaying histogram of variableprobe
  write.table(cbind(data[1],
  Variance=apply(data[,y],1,var)),file='c://variance.csv')
  #export as a .csv file.

  Output in Excel
  all in 1 column.

  ProbeID Variance
  1 224588_at 21.5825745738848

  How do i separate them so that i can have three columns

  ProbeID  Variance
  1   224588_at   21.582.

  thanks,
  Kei






-- 
===
WenSui Liu
ChoicePoint Precision Marketing
Phone: 678-893-9457
Email : [EMAIL PROTECTED]
Blog   : statcompute.spaces.live.com



Re: [R] R data Export to Excel

2008-03-02 Thread Wensui Liu
I think you simply install it the same way you install other R packages.

On Sun, Mar 2, 2008 at 10:40 PM, Christophe Lo [EMAIL PROTECTED] wrote:
 Thanks for your response. How do I install it? I tried looking at the manual,
 but it doesn't seem to give any installation instructions. I also downloaded a
 Windows version, but it doesn't have an exe file.

 http://cran.r-project.org/web/packages/xlsReadWrite/index.html

 Newbie,
 Kei




 



 --
 Christophe Lo
 (078) 8275 7029
 [EMAIL PROTECTED]



-- 
===
WenSui Liu
ChoicePoint Precision Marketing
Phone: 678-893-9457
Email : [EMAIL PROTECTED]
Blog   : statcompute.spaces.live.com



Re: [R] Read.xport function in package foreign

2008-02-25 Thread Wensui Liu
Hi, Gary,
I exchange data between SAS and R all the time using the SASxport
package and have found no problems at all.
It might be worth a try.

On Mon, Feb 25, 2008 at 10:12 AM, Nelson, Gary (FWE)
[EMAIL PROTECTED] wrote:
 Hi All,

  Sorry that I didn't provide enough information.

  I've been trying to import SAS xport files that contain multiple files
  using package foreign's read.xport.  I first attempted this back in 2005
  and had problems. Some of files that were present in the SAS xport file
  weren't being created in R.  I submitted my problem to the community:
  http://finzi.psych.upenn.edu/R/Rhelp02a/archive/57864.html

  and Dr. Dalgaard confirmed the issue:
  http://finzi.psych.upenn.edu/R/Rhelp02a/archive/57868.html

  From what I've read, this problem was identified before 2005:
  http://tolstoy.newcastle.edu.au/R/help/03a/0527.html

  After updating to version 2.6.1 recently, I decided to try it again, but
  the problem still exists.  I spent time trying to determine the issue
  but I don't understand the IEEE coding.  I did discover that the data
  from the missing files are actually being included in one of the data
  frames but the data were read incorrectly.
  The XPORT files come from ftp://cusk.nmfs.noaa.gov/mrfss/intercept/ag/
  if anyone wants to try it.
  I used the .xpt in int82ag.zip (all years except 1985,1988 and 1989
  are read incorrectly) and it appears they were created using SASV5XPT.
  Once imported, look in data frame I3_19822 which should have only 3650
  records, but there are 76,187 records.


  I am using version:
  platform       i386-pc-mingw32
  arch           i386
  os             mingw32
  system         i386, mingw32
  status
  major          2
  minor          6.1
  year           2007
  month          11
  day            26
  svn rev        43537
  language       R
  version.string R version 2.6.1 (2007-11-26)

  And package foreign is version 0.8-23.

  I am wondering if anyone has created a fix that isn't available yet?


  Thanks,
  Gary Nelson.





-- 
===
WenSui Liu
ChoicePoint Precision Marketing
Phone: 678-893-9457
Email : [EMAIL PROTECTED]
Blog   : statcompute.spaces.live.com
