Re: [R] Scaling Matrix in qda() function in MASS package

2017-08-24 Thread William Dunlap via R-help
If you multiply the data for a certain group by the scaling matrix for that
group, the variance matrix will be the identity.  E.g.,

> library(MASS)
> z <- qda(iris[-5], grouping=iris$Species)
> zapsmall(var( as.matrix(subset(iris, Species=="virginica", 1:4)) %*%
+ z$scaling[,,"virginica"] ))
  1 2 3 4
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
> zapsmall(var( as.matrix(subset(iris, Species=="versicolor", 1:4)) %*%
+ z$scaling[,,"versicolor"] ))
  1 2 3 4
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
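
For anyone who wants the construction rather than just the property, here is a sketch. It rests on an assumption: my reading of MASS:::qda.default is that, for each group, the centered data are divided by sqrt(n_k - 1) (sqrt(n_k) for method = "mle"), a QR decomposition is taken, and the scaling matrix is the inverse of the upper-triangular factor R. Check the source in your installed version before relying on this. Since t(R) %*% R equals the group covariance, multiplying by the inverse of R makes the covariance the identity. The names Xc, R and S are mine.

```r
library(MASS)

z <- qda(iris[-5], grouping = iris$Species)

# Data for one group, centered on its own column means and rescaled
# so that crossprod(Xc) equals var(X)
X  <- as.matrix(subset(iris, Species == "virginica", 1:4))
Xc <- scale(X, center = TRUE, scale = FALSE) / sqrt(nrow(X) - 1)

# Upper-triangular factor R from the QR decomposition of Xc
R <- qr.R(qr(Xc))

# Candidate scaling matrix: the inverse of R (also upper triangular)
S <- backsolve(R, diag(ncol(X)))

# Whitening property: t(S) %*% var(X) %*% S should be the identity
whitened <- zapsmall(t(S) %*% var(X) %*% S)

# Comparison with what qda() returned, ignoring possible sign flips
same_as_qda <- all.equal(abs(S), abs(z$scaling[, , "virginica"]),
                         check.attributes = FALSE)
```

With method = "mle" the divisor would be sqrt(nrow(X)) instead, which may be why a naive QR attempt on the banknote example did not match.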


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Aug 24, 2017 at 12:54 PM, Ranjan Maitra  wrote:

> I guess the question that is being asked here is what is the scaling
> matrix that is being returned in the qda object. The help file on qda()
> says:
> ...
> scaling: for each group ‘i’, ‘scaling[,,i]’ is an array which transforms
> observations so that within-groups covariance matrix is spherical.
> ...
>
> This is a bit ambiguous. I tried a few cases (spectral, QR decomposition,
> especially given that it is an upper triangular matrix) but was unable to
> match the result.
>
> Unless someone knows, there is no recourse but to muck through source code.
>
> Btw, I think the following will give the necessary source code:
>
> MASS:::qda.default
>
>
> Hope this helps!
>
> Best wishes,
> Ranjan
>
>
> On Wed, 23 Aug 2017 15:58:30 -0700 Bert Gunter 
> wrote:
>
> > You need to learn how to access code for nonexported methods.  See ? "::"
> >
> > > methods(qda)
> > [1] qda.data.frame* qda.default*    qda.formula*    qda.matrix*
> > see '?methods' for accessing help and source code
> >
> > Shows you that the methods are not exported from the namespace. Hence
> > you need to use the triple colon operator to see their code:
> >
> > > MASS:::qda
> >
> > Once you have the code, I presume this will answer your question.
> >
> > Cheers,
> > Bert
> >
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Wed, Aug 23, 2017 at 2:44 PM, Souradeep Chattapadhyay
> >  wrote:
> > > Hello,
> > > >I am Souradeep Chattopadhyay and I am a graduate student at Iowa
> > > > State University Department of Statistics.
> > > >
> > > > Can anyone please explain the mathematical formulation behind the scaling
> > > > matrix returned by the qda function in MASS package. I want to understand
> > > > how this scaling matrix is derived from the inputs given to the qda
> > > > function.
> > >
> > > Example Code
> > >
> > > The following example is using the banknote data in the MCLUST package.
> > >
> > > *Code*
> > >
> > > require(MASS)
> > > require(mclust)
> > > data(banknote)
> > > quad<-qda(banknote[,-1], grouping=banknote$Status, method="mle")
> > > quad$scaling
> > >
> > >
> > > Scaling matrix returned by qda for this data is
> > >
> > > > , , counterfeit
> > > >
> > > >                 1         2           3            4          5          6
> > > > Length   2.853988  1.069414 -0.05279774  0.750531723 -0.2053821  0.6986088
> > > > Left     0.000000 -4.208108 -3.04707132 -0.026804815 -0.8644062 -1.1088947
> > > > Right    0.000000  0.000000  4.27383763  0.003205759  0.3313675  1.3865888
> > > > Bottom   0.000000  0.000000  0.00000000  0.917596063 -0.8707772  0.7274894
> > > > Top      0.000000  0.000000  0.00000000  0.000000000 -2.2041415  0.6956074
> > > > Diagonal 0.000000  0.000000  0.00000000  0.000000000  0.0000000 -2.1879157
> > > >
> > > > , , genuine
> > > >
> > > >                 1         2          3          4          5          6
> > > > Length   2.592911 -1.169164  0.6105339 -0.3614352 -0.2520496 -0.5281743
> > > > Left     0.000000  3.027882  2.2392994 -0.2842368 -1.2092325  0.6927868
> > > > Right    0.000000  0.000000 -3.8684746 -0.3972362 -0.4177546 -0.1062555
> > > > Bottom   0.000000  0.000000  0.0000000  1.6376150  1.7274240  0.3969998
> > > > Top      0.000000  0.000000  0.0000000  0.0000000  2.3022115  0.6318543
> > > > Diagonal 0.000000  0.000000  0.0000000  0.0000000  0.0000000  2.4516680
> > >
> > >
> > >
> > > Thanks and Regards
> > >
> > > Souradeep
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
>
> --

Re: [R] rmutil parameters for Pareto distribution

2017-08-24 Thread Jeff Newmiller
You found that not all deviates are larger than m.

Comparing the formulas quoted in Wikipedia and in your email it is obvious that 
m is not the same as x_m.

If you don't find the rmutil formulation useful you can try another such as the 
one in PtProcess, or write your own. 
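
To make the difference concrete: the density quoted from the rmutil documentation is that of a Pareto type II (Lomax) distribution on y >= 0 with shape s and scale m (s - 1), whose mean is m when s > 1. The Wikipedia form with minimum x_m and shape a is the same distribution shifted by x_m, with s = a and m = x_m / (a - 1). The sketch below uses only base R; dlomax_ms and rlomax_ms are hypothetical helper names written for this illustration, not rmutil functions.

```r
# Density as quoted from the rmutil documentation (needs s > 1)
dlomax_ms <- function(y, m, s) {
  s * (1 + y / (m * (s - 1)))^(-s - 1) / (m * (s - 1))
}

# Inverse-CDF sampler for that density, using
# F(y) = 1 - (1 + y / lambda)^(-s) with lambda = m * (s - 1)
rlomax_ms <- function(n, m, s) {
  lambda <- m * (s - 1)
  lambda * (runif(n)^(-1 / s) - 1)
}

xmin <- 2   # Wikipedia scale parameter x_m
a    <- 3   # Wikipedia shape parameter

# Map (xmin, a) to (m, s), draw, then shift by xmin
m <- xmin / (a - 1)
s <- a
set.seed(1)
y <- rlomax_ms(1000, m, s)   # deviates on [0, Inf), so some are below xmin
x <- xmin + y                # shifted deviates, all at least xmin

# Numerical checks: the density integrates to 1 and has mean m
total_mass <- integrate(dlomax_ms, 0, Inf, m = m, s = s)$value
mean_y     <- integrate(function(y) y * dlomax_ms(y, m, s), 0, Inf)$value
```

So passing xmin as m and a as s without the shift gives deviates that start at 0, which matches what was observed.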
-- 
Sent from my phone. Please excuse my brevity.

On August 24, 2017 12:23:50 PM PDT, Justin Thong  
wrote:
>In https://en.wikipedia.org/wiki/Pareto_distribution, it is clear what
>the
>parameters are for the pareto distribution: *xmin *the scale parameter
>and
>*a* the shape parameter.
>
>I am using rmutil to generate random deviates from a pareto
>distribution.
>It says in the documentation that the probabilty density of the pareto
>distribution
>
>The Pareto distribution has density
>
>f(y) = s (1 + y/(m (s-1)))^(-s-1)/(m (s-1))
>
>where m is the mean parameter of the distribution and s is the
>dispersion
>
>Through my experimentation of using rpareto function from the library
>using
>m as the scale parameter *xmin* value and s as the shape parameter* a*
>, I
>found that the deviates generated are not all larger than *xmin*. This
>leads me to believe that m and s are not the shape and scale parameter
>respectively.
>
>What is m and s? Could it be defined as the mean and variance
>respectively
> as shown on the wikipedia link?
>
>
>Yours sincerely,
>Justin
>
>*I check my email at 9AM and 4PM everyday*
>*If you have an EMERGENCY, contact me at +447938674419(UK) or
>+60125056192(Malaysia)*
>



Re: [R] functions from 'base' package are not accessible

2017-08-24 Thread Hadley Wickham
This was a change in tidyr 0.7.0 that is causing a lot of confusion,
so we are preparing tidyr 0.7.1 which will back this change out.

If you want a work around in the meantime, you can express your
operation a bit more elegantly as:

library(tidyr)

df <- data.frame(v1 = 1:5, somestring = 6:10, v3 = 11:15, v4 = 16:20)
df %>%
  gather(key = var, value = val, somestring:ncol(df)) %>%
  head(2)
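
Another way to sidestep the non-standard evaluation entirely is to resolve the column range with base functions before any tidyr call, so that only a plain character vector is left to pass on. A minimal base-R sketch follows; the names first, cols and long are mine, and the stack() step is a base-R stand-in for gather(), not the tidyr implementation.

```r
df <- data.frame(v1 = 1:5, somestring = 6:10, v3 = 11:15, v4 = 16:20)

# Resolve the column range up front, outside any tidyr call
first <- which(names(df) == "somestring")
cols  <- names(df)[first:ncol(df)]

# Base-R stand-in for gather(): stack the selected columns into
# (values, ind) pairs and repeat the untouched column v1 alongside
stacked <- stack(df[cols])
long <- data.frame(v1  = rep(df$v1, times = length(cols)),
                   var = stacked$ind,
                   val = stacked$values)
head(long, 2)
```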

Hadley

On Thu, Aug 24, 2017 at 6:32 AM, Eugeny Melamud
 wrote:
> Hi all!
>
> The following code (executed in console)...
> somevar <- data.frame(v1 = 1:5, somestring = 6:10, v3 = 11:15, v4 = 
> 16:20);
> somevar %>% gather(key = var, value = val, which(names(somevar) == 
> "somestring"):length(somevar)) %>% head(2);
> throws...
> Error in which(names(somevar) == "somestring") :
>   could not find function "which"
>
> if I change which(names(somevar) == "somestring") with 0 I'll get
>Error in length(somevar) :
>   could not find function "length"
>
> So it looks like base package is not loaded. Still if type 'which' in console 
> I get
>   function (x, arr.ind = FALSE, useNames = TRUE)
>   {
> wh <- .Internal(which(x))
> if (arr.ind && !is.null(d <- dim(x)))
> arrayInd(wh, d, dimnames(x), useNames = useNames)
> else wh
>   }
>   
>   
>
> base (that contains which function) package is installed. R version is 3.4.1 
> and system is Win8
>
> Where should I look to understand how to fix the problem?
>
> Thank you in advance!
> Eugeny
>
>



-- 
http://hadley.nz



Re: [R] Scaling Matrix in qda() function in MASS package

2017-08-24 Thread Ranjan Maitra
I guess the question that is being asked here is what is the scaling matrix 
that is being returned in the qda object. The help file on qda() says:
...
scaling: for each group ‘i’, ‘scaling[,,i]’ is an array which transforms 
observations so that within-groups covariance matrix is spherical.
...

This is a bit ambiguous. I tried a few cases (spectral, QR decomposition, 
especially given that it is an upper triangular matrix) but was unable to match 
the result. 

Unless someone knows, there is no recourse but to muck through source code.

Btw, I think the following will give the necessary source code:

MASS:::qda.default


Hope this helps!

Best wishes,
Ranjan


On Wed, 23 Aug 2017 15:58:30 -0700 Bert Gunter  wrote:

> You need to learn how to access code for nonexported methods.  See ? "::"
> 
> > methods(qda)
> [1] qda.data.frame* qda.default*    qda.formula*    qda.matrix*
> see '?methods' for accessing help and source code
> 
> Shows you that the methods are not exported from the namespace. Hence
> you need to use the triple colon operator to see their code:
> 
> > MASS:::qda
> 
> Once you have the code, I presume this will answer your question.
> 
> Cheers,
> Bert
> 
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> 
> On Wed, Aug 23, 2017 at 2:44 PM, Souradeep Chattapadhyay
>  wrote:
> > Hello,
> >I am Souradeep Chattopadhyay and I am a graduate student at Iowa
> > State University Department of Statistics.
> >
> > Can anyone please explain the mathematical formulation behind the scaling
> > matrix returned by the qda function in MASS package. I want to understand
> > how this scaling matrix is derived from the inputs given to the qda
> > function.
> >
> > Example Code
> >
> > The following example is using the banknote data in the MCLUST package.
> >
> > *Code*
> >
> > require(MASS)
> > require(mclust)
> > data(banknote)
> > quad<-qda(banknote[,-1], grouping=banknote$Status, method="mle")
> > quad$scaling
> >
> >
> > Scaling matrix returned by qda for this data is
> >
> > , , counterfeit
> >
> >                 1         2           3            4          5          6
> > Length   2.853988  1.069414 -0.05279774  0.750531723 -0.2053821  0.6986088
> > Left     0.000000 -4.208108 -3.04707132 -0.026804815 -0.8644062 -1.1088947
> > Right    0.000000  0.000000  4.27383763  0.003205759  0.3313675  1.3865888
> > Bottom   0.000000  0.000000  0.00000000  0.917596063 -0.8707772  0.7274894
> > Top      0.000000  0.000000  0.00000000  0.000000000 -2.2041415  0.6956074
> > Diagonal 0.000000  0.000000  0.00000000  0.000000000  0.0000000 -2.1879157
> >
> > , , genuine
> >
> >                 1         2          3          4          5          6
> > Length   2.592911 -1.169164  0.6105339 -0.3614352 -0.2520496 -0.5281743
> > Left     0.000000  3.027882  2.2392994 -0.2842368 -1.2092325  0.6927868
> > Right    0.000000  0.000000 -3.8684746 -0.3972362 -0.4177546 -0.1062555
> > Bottom   0.000000  0.000000  0.0000000  1.6376150  1.7274240  0.3969998
> > Top      0.000000  0.000000  0.0000000  0.0000000  2.3022115  0.6318543
> > Diagonal 0.000000  0.000000  0.0000000  0.0000000  0.0000000  2.4516680
> >
> >
> >
> > Thanks and Regards
> >
> > Souradeep
> >
> 


-- 
Important Notice: This mailbox is ignored: e-mails are set to be deleted on 
receipt. Please respond to the mailing list if appropriate. For those needing 
to send personal or professional e-mail, please use appropriate addresses.


[R] rmutil parameters for Pareto distribution

2017-08-24 Thread Justin Thong
In https://en.wikipedia.org/wiki/Pareto_distribution, it is clear what the
parameters are for the pareto distribution: *xmin *the scale parameter and
*a* the shape parameter.

I am using rmutil to generate random deviates from a Pareto distribution.
The documentation gives the probability density of the Pareto distribution
as follows:

The Pareto distribution has density

f(y) = s (1 + y/(m (s-1)))^(-s-1)/(m (s-1))

where m is the mean parameter of the distribution and s is the dispersion

Through my experimentation with the rpareto function from the library, using
m as the scale parameter *xmin* and s as the shape parameter *a*, I
found that the deviates generated are not all larger than *xmin*. This
leads me to believe that m and s are not the scale and shape parameters,
respectively.

What are m and s? Could they be defined as the mean and variance, respectively,
as shown on the Wikipedia link?


Yours sincerely,
Justin

*I check my email at 9AM and 4PM everyday*
*If you have an EMERGENCY, contact me at +447938674419(UK) or
+60125056192(Malaysia)*



Re: [R] Flummoxed by gsub().

2017-08-24 Thread David Winsemius

> On Aug 24, 2017, at 10:20 AM, David Winsemius  wrote:
> 
> 
>> On Aug 23, 2017, at 2:29 AM, Rolf Turner  wrote:
>> 
>> 
>> On 23/08/17 18:33, Stefan Evert wrote:
>> 
>>>> On 23 Aug 2017, at 07:45, Rolf Turner  wrote:
>>>> 
>>>> My reading of ?regex led me to believe that
>>>> 
>>>>   gsub("[:alpha:]","",x)
>>>> 
>>>> should give the result that I want.
>>> That's looking for any of the characters a, l, p, h, : .
>> 
>> OK.  I see that now.  I don't think that it's really stated anywhere that to 
>> search for (and possibly change) any one of a string of characters you 
>> enclose that string of characters in brackets [  ].
> 
> That's explained on the ?regex page in the section on character classes. The 
> source of confusion for you is that within regex character classes there is 
> also a set of reserved constructions that all start and end with "[:" and 
> ":]". It's a bit like needed to double or triple escape characters in regex. 
> a leading "|" changes the parser settings (or "expectations" if one wants to 
> anthropomorphize the process.

I meant a leading backslash "\" rather than a vertical bar ("|")

-- 
David.
> 
>> 
>> The first example from ?grep makes this "clear" (for some value of the word 
>> "clear") once you understand what this example is on about.
>> 
>> So it's "obvious" once you've been shown, and totally opaque until then.
> 
> Sometimes we all stumble over syntactic "special" detours. If you wanted to 
> add a warning to the current ?regex tex, you could submit a diff for the base 
> package, perhaps with something like:
> 
> "Certain named classes of characters are predefined. Their interpretation 
> depends on the locale (see locales); the interpretation below is that of the 
> POSIX locale."
> 
> Replaced with:
> 
> "Certain named classes of characters are predefined. Their interpretation 
> depends on the locale (see locales); the interpretation below is that of the 
> POSIX locale. Their names do include the "[:" and ":]" characters."
> 
> 
>> 
>>> What you meant to say was
>>> gsub("[[:alpha:]]","",x)
>>> i.e. the character class [:alpha:] within a character set.
>> 
>> Yup.  Got it.  Thanks very much.
>> 



Re: [R] functions from 'base' package are not accessible

2017-08-24 Thread William Dunlap via R-help
Or perhaps two exclamation points would be better ('unquote' in the
tidyverse lexicon, 3 bangs is 'unquote-splice').

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Aug 24, 2017 at 10:20 AM, William Dunlap  wrote:

> Try putting !!! (three exclamation symbols) in front of which(...)==
>
> The non-standard evaluation in the tidyverse can cause confusion.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Thu, Aug 24, 2017 at 4:32 AM, Eugeny Melamud <
> eugeny.mela...@lanit-tercom.com> wrote:
>
>> Hi all!
>>
>> The following code (executed in console)...
>> somevar <- data.frame(v1 = 1:5, somestring = 6:10, v3 = 11:15, v4 =
>> 16:20);
>> somevar %>% gather(key = var, value = val, which(names(somevar) ==
>> "somestring"):length(somevar)) %>% head(2);
>> throws...
>> Error in which(names(somevar) == "somestring") :
>>   could not find function "which"
>>
>> if I change which(names(somevar) == "somestring") with 0 I'll get
>>Error in length(somevar) :
>>   could not find function "length"
>>
>> So it looks like base package is not loaded. Still if type 'which' in
>> console I get
>>   function (x, arr.ind = FALSE, useNames = TRUE)
>>   {
>> wh <- .Internal(which(x))
>> if (arr.ind && !is.null(d <- dim(x)))
>> arrayInd(wh, d, dimnames(x), useNames = useNames)
>> else wh
>>   }
>>   
>>   
>>
>> base (that contains which function) package is installed. R version is
>> 3.4.1 and system is Win8
>>
>> Where should I look to understand how to fix the problem?
>>
>> Thank you in advance!
>> Eugeny
>>
>>
>>
>
>



Re: [R] functions from 'base' package are not accessible

2017-08-24 Thread Jeff Newmiller
Looks like a bug to me.  I think you need to correspond with the package 
(tidyr?) maintainer, perhaps by putting a bug report on GitHub.

Next time please make your example reproducible by including the necessary 
"library" function calls. 
-- 
Sent from my phone. Please excuse my brevity.

On August 24, 2017 4:32:05 AM PDT, Eugeny Melamud 
 wrote:
>Hi all!
>
>The following code (executed in console)...
>somevar <- data.frame(v1 = 1:5, somestring = 6:10, v3 = 11:15, v4 =
>16:20);
>somevar %>% gather(key = var, value = val, which(names(somevar) ==
>"somestring"):length(somevar)) %>% head(2);
>throws...
>Error in which(names(somevar) == "somestring") :
>  could not find function "which"
>
>if I change which(names(somevar) == "somestring") with 0 I'll get
>   Error in length(somevar) :
>  could not find function "length"
>
>So it looks like base package is not loaded. Still if type 'which' in
>console I get
>  function (x, arr.ind = FALSE, useNames = TRUE)
>  {
>wh <- .Internal(which(x))
>if (arr.ind && !is.null(d <- dim(x)))
>arrayInd(wh, d, dimnames(x), useNames = useNames)
>else wh
>  }
>  
>  
>
>base (that contains which function) package is installed. R version is
>3.4.1 and system is Win8
>
>Where should I look to understand how to fix the problem?
>
>Thank you in advance!
>Eugeny
>
>


Re: [R] functions from 'base' package are not accessible

2017-08-24 Thread William Dunlap via R-help
Try putting !!! (three exclamation symbols) in front of which(...)==

The non-standard evaluation in the tidyverse can cause confusion.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Aug 24, 2017 at 4:32 AM, Eugeny Melamud <
eugeny.mela...@lanit-tercom.com> wrote:

> Hi all!
>
> The following code (executed in console)...
> somevar <- data.frame(v1 = 1:5, somestring = 6:10, v3 = 11:15, v4 =
> 16:20);
> somevar %>% gather(key = var, value = val, which(names(somevar) ==
> "somestring"):length(somevar)) %>% head(2);
> throws...
> Error in which(names(somevar) == "somestring") :
>   could not find function "which"
>
> if I change which(names(somevar) == "somestring") with 0 I'll get
>Error in length(somevar) :
>   could not find function "length"
>
> So it looks like base package is not loaded. Still if type 'which' in
> console I get
>   function (x, arr.ind = FALSE, useNames = TRUE)
>   {
> wh <- .Internal(which(x))
> if (arr.ind && !is.null(d <- dim(x)))
> arrayInd(wh, d, dimnames(x), useNames = useNames)
> else wh
>   }
>   
>   
>
> base (that contains which function) package is installed. R version is
> 3.4.1 and system is Win8
>
> Where should I look to understand how to fix the problem?
>
> Thank you in advance!
> Eugeny
>
>


Re: [R] Flummoxed by gsub().

2017-08-24 Thread David Winsemius

> On Aug 23, 2017, at 2:29 AM, Rolf Turner  wrote:
> 
> 
> On 23/08/17 18:33, Stefan Evert wrote:
> 
>>> On 23 Aug 2017, at 07:45, Rolf Turner  wrote:
>>> 
>>> My reading of ?regex led me to believe that
>>> 
>>>gsub("[:alpha:]","",x)
>>> 
>>> should give the result that I want.
>> That's looking for any of the characters a, l, p, h, : .
> 
> OK.  I see that now.  I don't think that it's really stated anywhere that to 
> search for (and possibly change) any one of a string of characters you 
> enclose that string of characters in brackets [  ].

That's explained on the ?regex page in the section on character classes. The 
source of confusion for you is that within regex character classes there is 
also a set of reserved constructions that all start and end with "[:" and ":]". 
It's a bit like needing to double or triple escape characters in regex. A 
leading "|" changes the parser settings (or "expectations" if one wants to 
anthropomorphize the process).
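
A small side-by-side may make the distinction easier to see. This is only an illustration with a made-up input vector x; the object names set_only and class_in_set are mine.

```r
x <- c("abc123", "R-help: 2017")

# Single brackets: a set containing the literal characters : a l p h
set_only <- gsub("[:alpha:]", "", x)

# Double brackets: the named class [:alpha:] inside a set, i.e. any letter
class_in_set <- gsub("[[:alpha:]]", "", x)

print(set_only)
print(class_in_set)
```

The first call strips only those five characters and leaves the other letters in place; the second strips every alphabetic character.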

> 
> The first example from ?grep makes this "clear" (for some value of the word 
> "clear") once you understand what this example is on about.
> 
> So it's "obvious" once you've been shown, and totally opaque until then.

Sometimes we all stumble over syntactic "special" detours. If you wanted to add 
a warning to the current ?regex text, you could submit a diff for the base 
package, perhaps with something like:

"Certain named classes of characters are predefined. Their interpretation 
depends on the locale (see locales); the interpretation below is that of the 
POSIX locale."

Replaced with:

"Certain named classes of characters are predefined. Their interpretation 
depends on the locale (see locales); the interpretation below is that of the 
POSIX locale. Their names do include the "[:" and ":]" characters."


> 
>> What you meant to say was
>>  gsub("[[:alpha:]]","",x)
>> i.e. the character class [:alpha:] within a character set.
> 
> Yup.  Got it.  Thanks very much.
> 
> cheers,
> 
> Rolf
> 
> -- 
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
> 

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   
-Gehm's Corollary to Clarke's Third Law



Re: [R] Pull data from Tally 9.1 to R studio

2017-08-24 Thread Marc Schwartz
Hi,

Inline below.


> On Aug 24, 2017, at 5:22 AM, John Kane via R-help  
> wrote:
> 
> 
> It might help to read the material at one or both of these links 
> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> 



In this case, had Jagan spent about 30 seconds Googling for the application 
developer's site, he would have found that Tally software has a support page 
here:

  https://tallysolutions.com/support/ 

and that using the online help that they provide, searching for "export data":

  https://help.tallysolutions.com/tallyweb/modules/pss/crm/kb/search/CKBTallyHelpSearchWIC.php#searchPage=1=export%20data

he would find the following article in the initial set of returned hits that 
appears to describe the export file formats that Tally supports:

  https://help.tallysolutions.com/article/te9rel60/Data_Management/Export_of_Data_Intro.htm

One of which is CSV files, which can then be imported into R, using ?read.csv.
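
Once a CSV export exists, the R side is short. A minimal sketch, in which the file contents and the column names (voucher_date, party, amount) are invented stand-ins for whatever Tally actually exports:

```r
# Stand-in for a file exported from Tally; the contents are made up
csv_path <- tempfile(fileext = ".csv")
writeLines(c("voucher_date,party,amount",
             "2017-08-01,Acme Traders,1250.50",
             "2017-08-02,Beta Stores,980.00"),
           csv_path)

# Read the export into a data frame and convert the date column
sales <- read.csv(csv_path, stringsAsFactors = FALSE)
sales$voucher_date <- as.Date(sales$voucher_date)

str(sales)
total <- sum(sales$amount)
```

In practice csv_path would point at the exported file, and the column types should be checked with str() before any analysis.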

He might have also been able to quickly obtain similar guidance from his own IT 
support folks who presumably support the Tally ERP installation or from Tally 
directly. The initial step is less about R specifically and more about how to 
get data out of Tally's database in an industry standard format.

If Tally supports any kind of direct API or other interface (e.g. ODBC or 
similar), Jagan would need to pursue that conversation with his IT folks and/or 
Tally to see whether R could interface with it and, importantly, whether any 
restricted IT access policies would even allow that. Many companies restrict 
such access for security reasons.

Another Google search would suggest that Tally uses a proprietary database 
engine and language, but that some interfaces (e.g. ODBC) may exist, if enabled 
locally.

  http://www.tallysolutions.com/website/html/tallydeveloper/architecture-series-a.php


On the R side, there is the R Data Import/Export Manual here:

  https://cran.r-project.org/manuals.html 


which provides additional general insight into R's import/export capabilities.


The R Posting Guide, as has been mentioned, is a good resource, as is the 
Getting Help With R page, linked on the main R Project page:

  https://www.r-project.org/help.html 


Regards,

Marc Schwartz




> 
> On Thursday, August 24, 2017, 6:19:25 AM EDT, John Kane  
> wrote:
> 
> 
> 
> 
> On Thursday, August 24, 2017, 1:50:13 AM EDT, David Winsemius 
>  wrote:
> 
> 
>> On Aug 22, 2017, at 11:31 PM, jagan krishnan via R-help 
>>  wrote:
>> 
>> Hi all,
>> This is Jagan.i have been provided a task of analyzing sales data of a 
>> company in R programming...Just wanted to know,how can I pull Tally 9.1 
>> software data into R programming dataframe.
>> Waiting eagerly for your inputs.
>> With Regards,Jagannathan Krishnan
>> Sent from Yahoo Mail on Android 
>> 
>>   On Tue, Aug 22, 2017 at 10:47 AM, jagan 
>> krishnan wrote:  Hi all,
>> This is Jagan.i have been provided a task of analyzing sales data of a 
>> company in R programming...Just wanted to know,how can I pull Tally 9.1 
>> software data into R programming dataframe.
>> Waiting eagerly for your inputs.
> 
> Are we supposed to know what "Tally 9.1 software data" might look like?
> Of course. Just download Tally 9.1, stick in some data  and you're away.
> A quick google shows it is accounting software.
> 
> 
>> With Regards,Jagannathan Krishnan
>> 
>> Sent from Yahoo Mail on Android  
>> 
>> [[alternative HTML version deleted]]
> 
> A Posting Guide was prepared for you. I suggest that you should read it.
> 
>> 
> 
> David Winsemius
> Alameda, CA, USA
> 
> 'Any technology distinguishable from magic is insufficiently advanced.'  
> -Gehm's Corollary to Clarke's Third Law




Re: [R] cross validation in random forest using rfcv function

2017-08-24 Thread David Winsemius

> On Aug 23, 2017, at 10:59 AM, Elahe chalabi via R-help  
> wrote:
> 
> Any responds?!

When I look at the original post I see a question about a function named 
`rfcv` but do not see a `library` call to load such a function. I also see a 
reference to a help page or vignette, perhaps, from that unidentified 
package. So it appears to me that you expect the rest of us to go searching for 
that function if we do not use it on a regular basis. You also apparently 
expect us to construct a dataset for testing. I'm not 
inclined to make all that effort, and from the crashing silence of the last 24 
hours on this venue, it appears I am not alone in thinking you presume too 
much. Read the Posting Guide and try to better understand why your behavior 
might not be eliciting the level of interest you were hoping for.
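
For what it is worth, assuming rfcv() here is the one from the randomForest package (the original post did not say), trainx is simply the predictor columns and trainy the response. A sketch using iris as a stand-in for the poster's data; data_all and response are names chosen for this illustration, and the call is guarded in case randomForest is not installed.

```r
data_all <- iris        # stand-in for the poster's 499 x 606 data frame
response <- "Species"   # would be "Group" in the original post

# Response as a factor, predictors as everything else
trainy <- data_all[[response]]
trainx <- data_all[, names(data_all) != response, drop = FALSE]

if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(42)
  cv <- randomForest::rfcv(trainx, trainy, cv.fold = 5)
  # Cross-validated error rate for each number of predictors tried
  print(cv$error.cv)
}
```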

-- David.


> 
> 
> 
> On Wednesday, August 23, 2017 5:50 AM, Elahe chalabi via R-help 
>  wrote:
> 
> 
> 
> Hi all,
> 
> 
> I would like to do cross validation in random forest using rfcv function. As 
> the documentation for this package says:
> 
> 
> rfcv(trainx, trainy, cv.fold=5, scale="log", step=0.5, mtry=function(p) 
> max(1, floor(sqrt(p))), recursive=FALSE, ...)
> 
> 
> however I don't know how to build trainx and trainy for my data set, and I 
> could not understand the way trainx is built in the package documentation 
> example for iris data set.
> 
> Here is my data set and I want to do cross validation to see accuracy in 
> classifying Alzheimer and Control Group:
> 
> 
> str(data)
> 
> 'data.frame':499 obs. of  606 variables:
> 
> $ Gender: int  0 0 0 0 0 1 1 1 1 1 ...
> 
> $ NumOfWords: num  157 111 163 176 100 124 201 100 76 101
> 
> $ NumofLivings  : int  6 6 9 4 3 5 3 3 4 3 ...
> 
> $ NumofStopWords: num  77 45 87 91 46 64 104 37 32 41 ...
> 
> .
> 
> .
> 
> $ Group : Factor w/ 2 levels "Alzheimer","Control","Control"..:
> 
> 
> So basically trainy should be data$Group but how about trainx? Could anyone 
> help me in this?
> 
> 
> 
> Thanks for any help!
> 
> Elahe
> 

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   
-Gehm's Corollary to Clarke's Third Law
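
For readers of the archive, here is a minimal sketch of how trainx and trainy might be built for rfcv(). It assumes that rfcv comes from the randomForest package and that the class label sits in a single factor column; the built-in iris data stands in for the poster's data frame, so "Species" plays the role of "Group". The rfcv() call is skipped if randomForest is not installed.

```r
# Stand-in data: iris instead of the poster's 499 x 606 data frame (assumption).
dat    <- iris
target <- "Species"              # would be "Group" in the poster's data

# trainx: every column except the class label; trainy: the class label itself.
trainx <- dat[, names(dat) != target, drop = FALSE]
trainy <- dat[[target]]

# Run the cross-validation only when the randomForest package is available.
if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)                    # make the fold assignment reproducible
  cv <- randomForest::rfcv(trainx, trainy, cv.fold = 5)
  print(cv$error.cv)             # CV error rate for each number of predictors
}
```

With the poster's data the same two lines would read trainx <- data[, names(data) != "Group"] and trainy <- data$Group, provided Group is a factor.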



Re: [R] functions from 'base' package are not accessible

2017-08-24 Thread Bert Gunter
Inline.

-- Bert



On Thu, Aug 24, 2017 at 4:32 AM, Eugeny Melamud
 wrote:
> Hi all!
>
> The following code (executed in console)...
> somevar <- data.frame(v1 = 1:5, somestring = 6:10, v3 = 11:15, v4 = 
> 16:20);
> somevar %>% gather(key = var, value = val, which(names(somevar) == 
> "somestring"):length(somevar)) %>% head(2);
> throws...
> Error in which(names(somevar) == "somestring") :
>   could not find function "which"
>
> if I change which(names(somevar) == "somestring") with 0 I'll get
>Error in length(somevar) :
>   could not find function "length"
>
> So it looks like base package is not loaded. Still if type 'which' in console 
> I get

Nope. data.frame() is in the base package and ran just fine.

Possibly check your usage of the "%>%" construction (which I don't use
and so can't help with).

Cheers,
Bert


>   function (x, arr.ind = FALSE, useNames = TRUE)
>   {
> wh <- .Internal(which(x))
> if (arr.ind && !is.null(d <- dim(x)))
> arrayInd(wh, d, dimnames(x), useNames = useNames)
> else wh
>   }
>   
>   
>
> base (that contains which function) package is installed. R version is 3.4.1 
> and system is Win8
>
> Where should I look to understand how to fix the problem?
>
> Thank you in advance!
> Eugeny
>
>



[R] functions from 'base' package are not accessible

2017-08-24 Thread Eugeny Melamud
Hi all!

The following code (executed in console)...
somevar <- data.frame(v1 = 1:5, somestring = 6:10, v3 = 11:15, v4 = 16:20);
somevar %>% gather(key = var, value = val, which(names(somevar) == 
"somestring"):length(somevar)) %>% head(2);
throws...
Error in which(names(somevar) == "somestring") :
  could not find function "which"

if I change which(names(somevar) == "somestring") with 0 I'll get
   Error in length(somevar) :
  could not find function "length"

So it looks like the base package is not loaded. Still, if I type 'which' in 
the console I get
  function (x, arr.ind = FALSE, useNames = TRUE)
  {
wh <- .Internal(which(x))
if (arr.ind && !is.null(d <- dim(x)))
arrayInd(wh, d, dimnames(x), useNames = useNames)
else wh
  }
  
  

base (that contains which function) package is installed. R version is 3.4.1 
and system is Win8

Where should I look to understand how to fix the problem?

Thank you in advance!
Eugeny
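
One possible workaround, not confirmed anywhere in this thread, is to compute the column positions in ordinary R code before the pipeline, so that which() and length() are never evaluated inside gather()'s column selection. The sketch below uses base R's stack() in place of tidyr::gather() so that it runs without extra packages; that substitution, and the names cols, stacked and long, are assumptions made for illustration.

```r
somevar <- data.frame(v1 = 1:5, somestring = 6:10, v3 = 11:15, v4 = 16:20)

# Evaluate the positions up front, in the normal top-level environment.
cols <- which(names(somevar) == "somestring"):length(somevar)

# Base-R stand-in for gather(): stack the selected columns into long form.
stacked <- stack(somevar[cols])          # columns: values, ind
long <- data.frame(
  v1  = rep(somevar$v1, times = length(cols)),
  var = as.character(stacked$ind),
  val = stacked$values
)
head(long, 2)
```

If the tidyr call is kept, the precomputed positions could be passed to gather() instead of the which(...):length(...) expression; how a given tidyr version wants an external vector supplied is not checked here.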




Re: [R] Pull data from Tally 9.1 to R studio

2017-08-24 Thread John Kane via R-help

It might help to read the material at one or both of these links:
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Reproducibility · Advanced R.


On Thursday, August 24, 2017, 6:19:25 AM EDT, John Kane  
wrote:




On Thursday, August 24, 2017, 1:50:13 AM EDT, David Winsemius 
 wrote:


> On Aug 22, 2017, at 11:31 PM, jagan krishnan via R-help 
>  wrote:
> 
> Hi all,
> This is Jagan.i have been provided a task of analyzing sales data of a 
> company in R programming...Just wanted to know,how can I pull Tally 9.1 
> software data into R programming dataframe.
> Waiting eagerly for your inputs.
> With Regards,Jagannathan Krishnan
> Sent from Yahoo Mail on Android 
> 
>  On Tue, Aug 22, 2017 at 10:47 AM, jagan krishnan 
>wrote:  Hi all,
> This is Jagan.i have been provided a task of analyzing sales data of a 
> company in R programming...Just wanted to know,how can I pull Tally 9.1 
> software data into R programming dataframe.
> Waiting eagerly for your inputs.

Are we supposed to know what "Tally 9.1 software data" might look like?
Of course. Just download Tally 9.1, stick in some data  and you're away.
A quick google shows it is accounting software.


> With Regards,Jagannathan Krishnan
> 
> Sent from Yahoo Mail on Android  
> 
>     [[alternative HTML version deleted]]

A Posting Guide was prepared for you. I suggest that you should read it.

> 

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'  
-Gehm's Corollary to Clarke's Third Law


Re: [R] Pull data from Tally 9.1 to R studio

2017-08-24 Thread John Kane via R-help



On Thursday, August 24, 2017, 1:50:13 AM EDT, David Winsemius 
 wrote:


> On Aug 22, 2017, at 11:31 PM, jagan krishnan via R-help 
>  wrote:
> 
> Hi all,
> This is Jagan.i have been provided a task of analyzing sales data of a 
> company in R programming...Just wanted to know,how can I pull Tally 9.1 
> software data into R programming dataframe.
> Waiting eagerly for your inputs.
> With Regards,Jagannathan Krishnan
> Sent from Yahoo Mail on Android 
> 
>  On Tue, Aug 22, 2017 at 10:47 AM, jagan krishnan 
>wrote:  Hi all,
> This is Jagan.i have been provided a task of analyzing sales data of a 
> company in R programming...Just wanted to know,how can I pull Tally 9.1 
> software data into R programming dataframe.
> Waiting eagerly for your inputs.

Are we supposed to know what "Tally 9.1 software data" might look like?
Of course. Just download Tally 9.1, stick in some data  and you're away.
A quick google shows it is accounting software.


> With Regards,Jagannathan Krishnan
> 
> Sent from Yahoo Mail on Android  
> 
>     [[alternative HTML version deleted]]

A Posting Guide was prepared for you. I suggest that you should read it.

> 

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'  
-Gehm's Corollary to Clarke's Third Law


Re: [R] strange nlme augpred behaviour

2017-08-24 Thread David Winsemius

> On Aug 23, 2017, at 8:08 AM, PIKAL Petr  wrote:
> 
> Hi
> 
> Well, yes I tried it about two weeks ago but my post did not get through as 
> it still awaits moderator approval.

It got through just fine. It appeared on Aug 15. It just didn't get any replies.

As I read your original question in this thread, it was not clear to me that 
you had provided the data-object named "mar".

-- David.


> I could check which column is offending, but it is only a minor 
> nuisance; I can live with selecting columns before fitting a model. What 
> seems strange to me is that both the full dataset and the selected columns 
> gave me identical fit results, but only one works within augPred.
> 
> Cheers
> Petr
> 
>> -Original Message-
>> From: Bert Gunter [mailto:bgunter.4...@gmail.com]
>> Sent: Wednesday, August 23, 2017 4:50 PM
>> To: PIKAL Petr 
>> Cc: r-help mailing list 
>> Subject: Re: [R] strange nlme augpred behaviour
>> 
>> Better posted on r-sig-mixed-models , no?
>> 
>> Cheers,
>> Bert
>> 
>> 
>> Bert Gunter
>> 
>> "The trouble with having an open mind is that people keep coming along and
>> sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>> 
>> 
>> On Wed, Aug 23, 2017 at 5:17 AM, PIKAL Petr  wrote:
>>> Dear all
>>> 
>>> I encountered strange behaviour of augPred with virtually the same
>>> data
>>> 
>>> First I made groupedData object.
>>> > mar.g<-groupedData(rutilizace~doba|int, data=mar)
>>> 
>>> When I perform nlme on complete dataset I get an error with augPred
>>> > fit<-nlsList(rutilizace~SSasymp(doba, Asym, R0,  lrc), data=mar.g)
>>> Warning message:
>>> c("1 error caught in nls(y ~ cbind(1 - exp(-exp(lrc) * x), exp(-exp(lrc) * x)), data = xy, : singular gradient",
>>> "1 error caught in start = list(lrc = lrc), algorithm = \"plinear\"): singular gradient")
>>> > fit1<-nlme(fit)
>>> > plot(augPred(fit1, level=0:1))
>>> Error in `[[<-.data.frame`(`*tmp*`, nm, value = c(6L, 6L, 6L, 6L, 8L,  :
>>>  replacement has 60 rows, data has 12
>>> 
>>> However, when I subset my data to keep only the affected columns:
>>>
>>> > mar.g<-mar.g[,c(3,4, 21)]
>>>
>>> > fit<-nlsList(rutilizace~SSasymp(doba, Asym, R0,  lrc), data=mar.g)
>>> Warning message:
>>> c("1 error caught in nls(y ~ cbind(1 - exp(-exp(lrc) * x), exp(-exp(lrc) * x)), data = xy, : singular gradient",
>>> "1 error caught in start = list(lrc = lrc), algorithm = \"plinear\"): singular gradient")
>>> > fit2<-nlme(fit)
>>> > plot(augPred(fit2, level=0:1))
>>>
>>> augPred works as a charm.
>>> 
>>> When I compare fit1 and fit2 they are equal
>>> > all.equal(fit1, fit2)
>>> [1] TRUE
>>>
>>> 
>>> Does anybody know where I should try to search for problems?
>>> 
>>> Best regards
>>> Petr
>>> 
>>> > traceback()
>>> 6: stop(sprintf(ngettext(N, "replacement has %d row, data has %d",
>>>   "replacement has %d rows, data has %d"), N, nrows), domain =
>>> NA)
>>> 5: `[[<-.data.frame`(`*tmp*`, nm, value = c(1L, 1L, 1L, 1L, 5L,
>>>   5L, 5L, 5L, 9L, 9L, 9L, 9L, 4L, 4L, 4L, 4L, 8L, 8L, 8L, 8L, 12L,
>>>   12L, 12L, 12L, 3L, 3L, 3L, 3L, 7L, 7L, 7L, 7L, 11L, 11L, 11L,
>>>   11L, 1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L, 9L, 9L, 9L, 9L, 2L, 2L,
>>>   2L, 2L, 6L, 6L, 6L, 6L, 10L, 10L, 10L, 10L))
>>> 4: `[[<-`(`*tmp*`, nm, value = c(1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L,
>>>   9L, 9L, 9L, 9L, 4L, 4L, 4L, 4L, 8L, 8L, 8L, 8L, 12L, 12L, 12L,
>>>   12L, 3L, 3L, 3L, 3L, 7L, 7L, 7L, 7L, 11L, 11L, 11L, 11L, 1L,
>>>   1L, 1L, 1L, 5L, 5L, 5L, 5L, 9L, 9L, 9L, 9L, 2L, 2L, 2L, 2L, 6L,
>>>   6L, 6L, 6L, 10L, 10L, 10L, 10L))
>>> 3: gsummary(data, groups = groups)
>>> 2: augPred.lme(fit1, level = 0:1)
>>> 1: augPred(fit1, level = 0:1)
>>> 
>>> > version
>>>   _
>>> platform   x86_64-w64-mingw32
>>> arch   x86_64
>>> os mingw32
>>> system x86_64, mingw32
>>> status Under development (unstable)
>>> major  3
>>> minor  5.0
>>> year   2017
>>> month  07
>>> day31
>>> svn rev73003
>>> language   R
>>> version.string R Under development (unstable) (2017-07-31 r73003)
>>> nickname   Unsuffered Consequences
 
>>> 
>>> Package nlme version 3.1-131
>>> 
>>> 
>>> 
> .
> 
> 
> This e-mail and any documents attached to it are confidential and are 
> intended only for its addressees.
> If you received this e-mail by mistake, please inform its sender without 
> delay. Delete the contents of this e-mail, its attachments, and its copies from 

Re: [R] Getting all possible combinations

2017-08-24 Thread peter dalgaard

> On 24 Aug 2017, at 11:58 , peter dalgaard  wrote:
> 
> M <- as.matrix(do.call(expand.grid, rep(list(0:1),5)))
> M
> # This is just 0:31 encoded as (little-endian) binary
> apply(M, 1, function(i) sum((2^(0:4))[i]))
> 
> # now, use rows of M for logical indexing into "A"-"E"
> mode(M) <- "logical"


Whoops. That apply() line should either come _after_ conversion to logical, or 
read

apply(M, 1, function(i) sum(2^(0:4) * i))

As in:

> M <- as.matrix(do.call(expand.grid, rep(list(0:1),5)))
> apply(M, 1, function(i) sum(2^(0:4) * i))
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
[26] 25 26 27 28 29 30 31
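
For readers of the archive, a small sketch of why the indexed version misbehaves while M is still numeric: a 0/1 numeric vector used inside [ ] is read as positions, not as a mask. The row c(1, 0, 1, 0, 0) is chosen here purely for illustration.

```r
w <- 2^(0:4)                 # bit weights 1, 2, 4, 8, 16
i <- c(1, 0, 1, 0, 0)        # one numeric row of M: bits 0 and 2 are set

# Numeric indexing: the zeros are dropped and position 1 is picked twice.
sum(w[i])                    # 1 + 1 = 2, not the intended value

# Logical indexing, i.e. after mode(M) <- "logical".
sum(w[as.logical(i)])        # 1 + 4 = 5

# Multiplication gives the intended value for numeric and logical rows alike.
sum(w * i)                   # 5
```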


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Getting all possible combinations

2017-08-24 Thread peter dalgaard

> On 24 Aug 2017, at 11:58 , peter dalgaard  wrote:
> 
> apply(M, 1, function(i) sum((2^(0:4))[i]))

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Getting all possible combinations

2017-08-24 Thread peter dalgaard

> On 24 Aug 2017, at 01:25 , Duncan Murdoch  wrote:
> 
> On 23/08/2017 6:25 PM, Bert Gunter wrote:
>> Doesn't sort by size of subgroup. I interpret the phrase I asterisked as:
> 
> You were fooled by Peter's tricky single negative.
> 

... 

Let's do this more carefully, then:

M <- as.matrix(do.call(expand.grid, rep(list(0:1),5)))
M
# This is just 0:31 encoded as (little-endian) binary
apply(M, 1, function(i) sum((2^(0:4))[i]))

# now, use rows of M for logical indexing into "A"-"E"
mode(M) <- "logical"
(l <- apply(M,1,function(i)LETTERS[1:5][i]))

# Clearly not sorted by group size:
#[[1]]
#character(0)
#
#[[2]]
#[1] "A"
#
#[[3]]
#[1] "B"
#
#[[4]]
#[1] "A" "B"
#
#[[5]]
#[1] "C"

# requires this to sort:

o <- order(sapply(l, length))
l[o]
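
As a quick check for readers, the steps above can be combined into one self-contained run that also confirms the sorted list contains choose(5, k) subsets of each size k. The names codes, sizes and counts are introduced here for illustration, and lengths() is used as a shorthand for sapply(l, length).

```r
# All 2^5 subsets of "A"-"E", one per row of M.
M <- as.matrix(do.call(expand.grid, rep(list(0:1), 5)))
codes <- apply(M, 1, function(i) sum(2^(0:4) * i))   # rows encode 0..31

# Use the rows as logical masks into the letters.
mode(M) <- "logical"
l <- apply(M, 1, function(i) LETTERS[1:5][i])

# Order by subset size and count how many subsets there are of each size.
o      <- order(lengths(l))
sizes  <- lengths(l[o])
counts <- as.vector(table(sizes))
counts                                               # 1 5 10 10 5 1
```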





-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] boot.stepAIC fails with computed formula

2017-08-24 Thread Stephen O'hagan
OK, I will try that.

I notified the maintainer of boot.stepAIC, so he might fix this in due course. 

Thanks,
SGO.

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Heinz Tuechler
Sent: 24 August 2017 00:07
To: r-help@r-project.org
Subject: Re: [R] boot.stepAIC fails with computed formula

It seems that if you build the formula as a character string, and postpone the 
"as.formula" into the lm call, it works.

instead of
frm1 <- as.formula(paste(trg,"~1"))
use
frm1a <- paste(trg,"~1")
and then
strt <- lm(as.formula(frm1a),dat)

regards,

Heinz
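
For readers of the archive, a small sketch of what ends up in the stored call under each way of passing the formula, which is the difference Bert describes in the quoted message further down. It uses the built-in mtcars data with "mpg" as a stand-in for the poster's dat and "y1" (an assumption), and it does not check whether any variant makes boot.stepAIC succeed.

```r
trg <- "mpg"                                   # stand-in for the poster's "y1"
frm <- as.formula(paste(trg, "~ 1"))

fit_literal <- lm(mpg ~ 1, data = mtcars)      # formula typed in the call
fit_object  <- lm(frm, data = mtcars)          # formula passed by variable name
fit_subst   <- eval(substitute(lm(formula = f, data = mtcars), list(f = frm)))

# Substituting an unevaluated expression instead of a formula object
f_lang   <- parse(text = paste(trg, "~ 1"))[[1]]
fit_lang <- eval(substitute(lm(formula = f, data = mtcars), list(f = f_lang)))

class(fit_literal$call$formula)   # "call"
class(fit_object$call$formula)    # "name": the call only stores the symbol frm
class(fit_subst$call$formula)     # "formula"
class(fit_lang$call$formula)      # "call", same class as the literal version
```

All four fits have the same coefficients; only the object recorded in the call differs.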

Stephen O'hagan wrote/hat geschrieben on/am 23.08.2017 12:07:
> Until I get a fix that works, a work-around would be to rename the 'y1' 
> column, use a fixed formula, and rename it back afterwards.
>
> Thanks for your help.
> SGO.
>
> -Original Message-
> From: Bert Gunter [mailto:bgunter.4...@gmail.com]
> Sent: 22 August 2017 20:38
> To: Stephen O'hagan 
> Cc: r-help@r-project.org
> Subject: Re: [R] boot.stepAIC fails with computed formula
>
> OK, here's the problem. Continuing with your example:
>
> strt1 <- lm(y1 ~1, dat)
> strt2 <- lm(frm1,dat)
>
>
>> strt1
>
> Call:
> lm(formula = y1 ~ 1, data = dat)
>
> Coefficients:
> (Intercept)
>   41.73
>
>> strt2
>
> Call:
> lm(formula = frm1, data = dat)
>
> Coefficients:
> (Intercept)
>   41.73
>
>
> Note that the formula components of the two lm objects are different: strt2 
> does not evaluate the formula. So presumably boot.stepAIC does no evaluation 
> and therefore gets confused, giving the errors you saw. So you need to get 
> the evaluated formula into the lm object. This can be done, e.g. via:
>
>> strt2 <- eval(substitute(lm(form,data = dat), list(form = frm1)))
>
> ## yielding
>
>> strt2
>
> Call:
> lm(formula = y1 ~ 1, data = dat)
>
> Coefficients:
> (Intercept)
>   41.73
>
> So this looks like it should fix the problem, but alas no, the boot.stepAIC 
> call still fails with the same error message. Here's why:
>
>> identical(strt$call, strt2$call)
> [1] FALSE
>
> So one might rightfully ask, what the heck is going on here?! Further digging:
>
>> str(strt$call)
>  language lm(formula = y1 ~ 1, data = dat)
>
>> str(strt2$call)
>  language lm(formula = y1 ~ 1, data = dat)
>
> These certainly look identical! -- but of course they're not:
>
>> names(strt$call)
> [1] ""        "formula" "data"
>> names(strt2$call)
> [1] ""        "formula" "data"
>
> So the difference must lie in the formula component, right? ...
>
>> strt$call$formula
> y1 ~ 1
>> strt2$call$formula
> y1 ~ 1
>
> So, thus far, huhh? But..
>
>> class(strt2$call$formula)
> [1] "formula"
>
>> class(strt$call$formula)
> [1] "call"
>
> So I think therein lies the critical difference that is screwing things up. 
> NOTE: If I am wrong about this someone **PLEASE** correct me.
>
> I see no clear workaround for this other than to explicitly avoid
> passing a formula in the lm() call with y~1 or y ~ .   I think the
> real fix is to make the  boot.stepAIC function smarter in how it handles its 
> formula argument, and that is above my paygrade (and degree of interest) . 
> You should probably email the maintainer, who may not monitor this list. But 
> give it a day or so to give someone else a chance to correct me if I'm wrong.
>
>
> HTH.
>
> Cheers,
>
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and 
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan  
> wrote:
>> I'm trying to use boot.stepAIC for feature selection; I need to be able to 
>> specify the name of the dependent variable programmatically, but this 
>> appears to fail:
>>
>> In R-Studio with MS R Open 3.4:
>>
>> library(bootStepAIC)
>>
>> #Fake data
>> n<-200
>>
>> x1 <- runif(n, -3, 3)
>> x2 <- runif(n, -3, 3)
>> x3 <- runif(n, -3, 3)
>> x4 <- runif(n, -3, 3)
>> x5 <- runif(n, -3, 3)
>> x6 <- runif(n, -3, 3)
>> x7 <- runif(n, -3, 3)
>> x8 <- runif(n, -3, 3)
>> y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>>
>> dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
>> #the real data won't have these names...
>>
>> cn <- names(dat)
>> trg <- "y1"
>> xvars <- cn[cn!=trg]
>>
>> frm1<-as.formula(paste(trg,"~1"))
>> frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))
>>
>> strt=lm(y1~1,dat) # boot.stepAIC Works fine
>>
>> #strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##
>>
>> #strt=lm(frm1,dat) ## boot.stepAIC FAILS ##
>>
>> limit<-5
>>
>>
>> stp=stepAIC(strt,direction='forward',steps=limit,
>> scope=list(lower=frm1,upper=frm2))
>>
>> bst <- boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
>> scope=list(lower=frm1,upper=frm2))
>>
>> b1 <- bst$Covariates
>> ball <- data.frame(b1)
>> names(ball)=unlist(trg)
>>
>> Any ideas?
>>
>> Cheers,
>> SOH
>>
>>
>> [[alternative 

[R] Problem in optimization of Gaussian Mixture model

2017-08-24 Thread niharika singhal
Hello,


I have been struggling with an optimization problem in R for 2-3 weeks.

I have the parameters of a Gaussian mixture and I want to find the maximum of
the mixture density.

The parameters are in the form

mean1     mean2     mean3    sigma1    sigma2    sigma3    c1         c2         c3
506.8644  672.8448  829.902  61.02859  9.149168  74.84682  0.1241933  0.6329082  0.2428986

I have used optim and optimx to find the maximum, but the result is a value
close to the highest mean, for example 830 for the parameters above. The code
for my optim call is

val1=optim(par=c(X=1000, seg=1),fn=xnorm,  opt=c(NA,seg=1),
method="BFGS",lower=-Inf,upper=+Inf, control=list(fnscale=-1))

I run the optim function in a loop over different initial values and take
val1$par[1] as the best value.

I pass the seg parameter because I want to run this on n segments later.

xnorm simply sums the weighted dnorm values:

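xnorm=function(param, opt = rep(NA, 2)){
  # Overwrite any parameter that is fixed through a non-NA entry of opt
  i = !sapply(opt, is.na)
  if (any(i)) {
    param[i] <- opt[i]
  }
  xval= param[1]
  seg <- param[2]
  sum_prob=0
  l=3
  # Assumed layout: one row per segment, one column per mixture component.
  # (The post declared plain vectors, which cannot be indexed with [seg,n].)
  meanval=matrix(c(506.8644, 672.8448, 829.902), nrow=1)
  sigmaval=matrix(c(61.02859, 9.149168, 74.84682), nrow=1)
  coeffval=matrix(c(0.1241933, 0.6329082, 0.2428986), nrow=1)
  for(n in 1:l)
  {
    mu=meanval[seg,n]
    sg=sigmaval[seg,n]
    cval=coeffval[seg,n]
    # Weighted density of component n at xval
    val=cval*(dnorm(xval,mu,sg))
    #print(paste0("The dnorm value for x is.: ", val))
    sum_prob=sum_prob+val
  }
  sum_prob
}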


The output is not correct: I checked my data using UnivarMixingDistribution
from the distr package, and according to that the maximum should lie
somewhere between 600 and 800.

Code I used to check:

mc0=c( 0.6329082,0.6329082,0.2428986)

rv <-UnivarMixingDistribution(Norm(506.8644,61.02859),Norm(672.8448,9.149168),Norm(829.902,74.84682), mixCoeff=mc0/sum(mc0))

plot(rv, to.draw.arg="d")



Can someone please help how I can solve this problem?


Thanks & Regards

Niharika Singhal
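
For readers of the archive: a gradient-based search started at X=1000 can stop at the local maximum near the third mean, since the mixture density has more than one peak. A minimal sketch of one alternative, assuming the weights c1, c2, c3 listed in the post, is to scan a grid for the highest point and then refine it with optimize(). The names mix_density, grid, x0 and best are introduced here for illustration.

```r
mu    <- c(506.8644, 672.8448, 829.902)
sigma <- c(61.02859, 9.149168, 74.84682)
w     <- c(0.1241933, 0.6329082, 0.2428986)

# Mixture density, vectorised over x.
mix_density <- function(x) {
  vapply(x, function(xi) sum(w * dnorm(xi, mu, sigma)), numeric(1))
}

# Coarse scan over a range that covers every component.
grid <- seq(min(mu - 3 * sigma), max(mu + 3 * sigma), length.out = 2001)
x0   <- grid[which.max(mix_density(grid))]
step <- grid[2] - grid[1]

# Local refinement around the best grid point.
best <- optimize(mix_density, interval = c(x0 - step, x0 + step),
                 maximum = TRUE)
best$maximum      # location of the highest peak
best$objective    # density at that location
```

The narrow second component carries the largest weight, so the highest peak should sit close to its mean, inside the 600-800 range the poster expected.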



Re: [R] Comparing 2 dale columns

2017-08-24 Thread PIKAL Petr
Hm.

I used the code from Mark's response and it works correctly. So either you 
messed up the data or you are not telling us the whole story.

data
> dput(data)
structure(list(COL1 = structure(c(16222, 16252), class = "Date"),
COL2 = structure(c(16556, 16556), class = "Date"), Date_Flag = c(0,
0)), .Names = c("COL1", "COL2", "Date_Flag"), row.names = c(NA,
-2L), class = "data.frame")

Instead of ifelse you can use

data$Date_Flag_calc <- (data$COL2 < data$COL1)*1
data
        COL1       COL2 Date_Flag Date_Flag_calc
1 2014-06-01 2015-05-01         0              0
2 2014-07-01 2015-05-01         0              0


What is the result of

str(data)
in your case?


Cheers
Petr
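
For readers of the archive, a self-contained sketch of the whole conversion and comparison. It assumes the strings are month/day/two-digit-year, which matches the dput() output above; if the data are really day/month/year, the format string would need to be "%d/%m/%y" instead.

```r
test <- data.frame(COL1 = c("6/1/14", "7/1/14"),
                   COL2 = c("5/1/15", "5/1/15"),
                   stringsAsFactors = FALSE)

# The format argument belongs to as.Date(), not to as.character().
test$COL1 <- as.Date(test$COL1, format = "%m/%d/%y")
test$COL2 <- as.Date(test$COL2, format = "%m/%d/%y")

# 1 when COL2 is earlier than COL1, 0 otherwise.
test$Date_Flag <- as.integer(test$COL2 < test$COL1)
test
```

If the columns were read in as factors, wrapping them in as.character() before as.Date() keeps the conversion explicit.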

From: Patrick Casimir [mailto:patrc...@nova.edu]
Sent: Wednesday, August 23, 2017 6:10 PM
To: PIKAL Petr ; r-help@r-project.org
Subject: Re: Comparing 2 dale columns




Thanks. But when I apply your codes I get all NA instead of TRUE and FALSE


From: PIKAL Petr >
Sent: Wednesday, August 23, 2017 11:20:00 AM
To: Patrick Casimir; r-help@r-project.org
Subject: RE: Comparing 2 dale columns

Hi

your code is wrong.

I get
> test<-read.table("clipboard", header=T)
> str(test)
'data.frame':   2 obs. of  2 variables:
 $ COL1: Factor w/ 2 levels "6/1/14","7/1/14": 1 2
 $ COL2: Factor w/ 1 level "5/1/15": 1 1
> test$COL2<- as.Date(as.character(test$COL2, format="%y/%m/%d"))
> test$COL1<- as.Date(as.character(test$COL1, format="%y/%m/%d"))

 ^^^
incorrect parentheses position, wrong y,m,d

Using correct syntax I get correct result.

> test$COL2<- as.Date(test$COL2, format="%d/%m/%y")
> test$COL1<- as.Date(test$COL1, format="%d/%m/%y")
>
> test$COL2 > test$COL1
[1] TRUE TRUE
> test
        COL1       COL2
1 2014-01-06 2015-01-05
2 2014-01-07 2015-01-05
>

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick
> Casimir
> Sent: Wednesday, August 23, 2017 4:54 PM
> To: r-help@r-project.org
> Subject: [R] Comparing 2 dale columns
>
> Dear R fellows,
>
>
> I created a new column Date_flag to compare the dates of COL1 and COL2
> using the code below. But it showed that 5/1/15 is greater than 6/1/2014 and
> 5/1/2015 greater than
> 7/1/2014 despite the year is greater. How do I fix that? I did try to format 
> as
> %y/%m/%d
>
>  but it does not fix that.
>
>
> data$Date_Flag <- ifelse(data$COL2 > data$COL1, 0,1)
>
>
> COL1   COL2
> 6/1/14 5/1/15
> 7/1/14 5/1/15
>
>
> data$COL2<- as.Date(as.character(data$COL2, format="%y/%m/%d"))
> data$COL1<- as.Date(as.character(data$COL1, format="%y/%m/%d"))
>
>


This e-mail and any documents attached to it may be confidential and are 
intended only for its intended recipients.
If you received this e-mail by mistake, please immediately inform its sender. 
Delete the contents of this e-mail with all attachments and its copies from 
your system.
If you are not the intended recipient of this e-mail, you are not authorized to 
use, disseminate, copy or disclose this e-mail in any manner.
The sender of this e-mail shall not be liable for any possible damage caused by 
modifications of the e-mail or by delay with transfer of the email.

In case that this e-mail forms part of business dealings:
- the sender reserves the right to end negotiations about entering into a 
contract in any time, for any reason, and without stating any 

Re: [R] Vibration signal prediction in R

2017-08-24 Thread Barry Rowlingson
On Thu, Aug 24, 2017 at 7:07 AM, David Winsemius 
wrote:

>
> > On Aug 23, 2017, at 10:06 PM, Dhivya Narayanasamy 
> wrote:
> >
> > I have a vibration signal coming accelerometer. I converted this signal
> from*
> > m/s^2* to *mm/s*. Now I am supposed to predict this vibration signal in R
> > using historical data. (Please see the attached picture of vibration
> > signal).
>
> The dimensional analysis of the second sentence does not make scientific
> sense.
>
>
Not so! If you know the initial v and a(t) you can compute v(t) by
integration.

For example if you drop something then v(0)=0, a(t)=-g, then you can
compute v(t)

The question has been posted to Data Science Stack Exchange if anyone wants
to see the picture...

https://datascience.stackexchange.com/questions/22466/processing-the-vibration-signal-before-prediction

Barry
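
For readers of the archive, a minimal sketch of the integration described above, using the dropped-object example: constant acceleration a(t) = -g sampled at a fixed rate, integrated with the trapezoidal rule from v(0) = 0 and then converted from m/s to mm/s. The sampling interval and the value of g are assumptions for illustration; a real accelerometer trace would usually also need offset or drift handling, which is not shown.

```r
g  <- 9.81                          # m/s^2, assumed
t  <- seq(0, 2, by = 0.01)          # sample times in seconds
a  <- rep(-g, length(t))            # acceleration samples in m/s^2

# Cumulative trapezoidal integration of a(t), starting from v(0) = 0.
dt <- diff(t)
v  <- c(0, cumsum(dt * (head(a, -1) + tail(a, -1)) / 2))   # m/s

v_mm <- 1000 * v                    # mm/s
tail(v_mm, 1)                       # velocity after 2 s
```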



Re: [R] Vibration signal prediction in R

2017-08-24 Thread David Winsemius

> On Aug 23, 2017, at 10:06 PM, Dhivya Narayanasamy  
> wrote:
> 
> I have a vibration signal coming accelerometer. I converted this signal from*
> m/s^2* to *mm/s*. Now I am supposed to predict this vibration signal in R
> using historical data. (Please see the attached picture of vibration
> signal).

The dimensional analysis of the second sentence does not make scientific sense.

The parenthetical sentence might be true but if so you failed. Probably because 
you did not pay sufficient attention to the information describing the 
requirements for posting to this mailing list.
> 
> Can I use this vibration signal just like that to the prediction using
> predictive Models or Do i need to do some processing technique before doing
> prediction?  Are there any blogs or websites for this vibration prediction
> using R ?

On the first page of a Google search for "prediction time domain with R" I see:

http://facebookincubator.github.io/prophet/

> All i could found was blogs using "fft" function on vibration in
> R.

Something like this?
"Vibration Spectral Analysis to Predict Asset Failures by Integrating R in SAS® 
Asset Performance Analytics"

http://support.sas.com/resources/papers/proceedings17/SAS0527-2017.pdf


> but My aim is to do *prediction of vibration signal in time domain*
> using packages in R.

Have you searched on other plausible terms? Say "reliability analysis"?

> 
> I am interested to know how this signal can be used in prediction in R? Any
> help is much appreciated.
> 
> Thank you.
> Dhivya
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see

Please read the following links.

> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   
-Gehm's Corollary to Clarke's Third Law
