Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Eugene
I still need the output to match the requirement in my original post: 
decision rules ("clusters") with a probability attached to each.  The examples are 
sort of similar, but you just provided links to general info about trees.



-------- Original message --------
From: Sarah Goslee
Date: 4/13/16 8:04 PM (GMT-06:00)
To: Michael Artz
Cc: "r-help@r-project.org"
Subject: Re: [R] Decision Tree and Random Forrest

On Wednesday, April 13, 2016, Michael Artz wrote:

> That's great that you are familiar and thanks for responding.  Have you
> ever done what I am referring to? I have already spent time going through
> links and tutorials about decision trees and random forests and have even
> used them both before.
>
Then what specifically is your problem? Both of the tutorials I provided
show worked examples, as does even the help for rpart. If none of those, or
your extensive reading, work for your project you will have to be a lot
more specific about why not.

Sarah




Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Sarah Goslee
On Wednesday, April 13, 2016, Michael Artz wrote:

> That's great that you are familiar and thanks for responding.  Have you
> ever done what I am referring to? I have already spent time going through
> links and tutorials about decision trees and random forests and have even
> used them both before.
>
Then what specifically is your problem? Both of the tutorials I provided
show worked examples, as does even the help for rpart. If none of those, or
your extensive reading, work for your project you will have to be a lot
more specific about why not.

Sarah




-- 
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Artz
That's great that you are familiar and thanks for responding.  Have you ever
done what I am referring to? I have already spent time going through links
and tutorials about decision trees and random forests and have even used
them both before.

Mike
On Apr 13, 2016 5:32 PM, "Sarah Goslee" wrote:

It sounds like you want classification or regression trees. rpart does
exactly what you describe.

Here's an overview:
http://www.statmethods.net/advstats/cart.html

But there are a lot of other ways to do the same thing in R, for instance:
http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/

You can get the same kind of information from random forests, but it's
less straightforward. If you want a clear set of rules as in your golf
example, then you need rpart or similar.

Sarah



Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Sarah Goslee
It sounds like you want classification or regression trees. rpart does
exactly what you describe.

Here's an overview:
http://www.statmethods.net/advstats/cart.html

But there are a lot of other ways to do the same thing in R, for instance:
http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/

You can get the same kind of information from random forests, but it's
less straightforward. If you want a clear set of rules as in your golf
example, then you need rpart or similar.

Sarah
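
As a rough illustration of the rpart suggestion above, the sketch below fits a
classification tree and prints one rule per leaf together with the class
proportions in that leaf. It is only a sketch under assumptions: the golf data
was never posted, so the kyphosis data set bundled with rpart stands in for
it, and the object names (fit, rules, rule_table) are made up for the example.

```r
library(rpart)  # a recommended package that ships with standard R installs

# kyphosis is an example data set bundled with rpart; it stands in for
# the golf data, which was never posted in this thread.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Terminal nodes are the rows of fit$frame whose split variable is "<leaf>".
is_leaf  <- fit$frame$var == "<leaf>"
leaf_ids <- as.numeric(rownames(fit$frame)[is_leaf])

# path.rpart() returns, for each requested node, the chain of splits
# leading from the root to that node (first element is always "root").
paths <- path.rpart(fit, nodes = leaf_ids, print.it = FALSE)
rules <- vapply(paths, function(p) paste(p[-1], collapse = " AND "), character(1))

# For classification trees, yval2 holds: fitted class, per-class counts,
# per-class proportions, node probability. Pick out the proportions.
classes <- attr(fit, "ylevels")
k       <- length(classes)
probs   <- fit$frame$yval2[is_leaf, 1 + k + seq_len(k), drop = FALSE]
colnames(probs) <- paste0("P(", classes, ")")

rule_table <- data.frame(rule = rules, n = fit$frame$n[is_leaf], round(probs, 2),
                         check.names = FALSE, row.names = NULL)
print(rule_table)
```

Each row reads as "IF <rule> THEN the class proportions in that leaf are ...",
which is close to the golf-style rule asked for. The proportions are computed
on the training data, so they are optimistic estimates. The rpart.plot package
also has an rpart.rules() function that prints something similar, if
installing an extra package is acceptable.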

On Wed, Apr 13, 2016 at 6:02 PM, Michael Artz wrote:
> Ah yes, I will have to use the predict function.  But the predict function
> will not really get me there.  Take the example of a
> model predicting whether or not I will play golf (this is the dependent
> variable), with three independent variables: Humidity (High, Medium,
> Low), Pending_Chores (Taxes, None, Laundry, Car Maintenance) and Wind (High,
> Low).  I would like rules that apply to any record matching them, such as
> (IF humidity = high AND pending_chores = None AND Wind = High THEN there is
> a 77% probability that play_golf is YES).  I was thinking that random
> forest would weight the rules somehow over the collection of trees and give
> a probability.  But if that doesn't make sense, then can you just tell me
> how to get the decision rules with one tree and I will work from that.
>
> Mike


Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Artz
Ah yes, I will have to use the predict function.  But the predict function
will not really get me there.  Take the example of a
model predicting whether or not I will play golf (this is the dependent
variable), with three independent variables: Humidity (High, Medium,
Low), Pending_Chores (Taxes, None, Laundry, Car Maintenance) and Wind (High,
Low).  I would like rules that apply to any record matching them, such as
(IF humidity = high AND pending_chores = None AND Wind = High THEN there is
a 77% probability that play_golf is YES).  I was thinking that random
forest would weight the rules somehow over the collection of trees and give
a probability.  But if that doesn't make sense, then can you just tell me
how to get the decision rules with one tree and I will work from that.

Mike

On Wed, Apr 13, 2016 at 4:30 PM, Bert Gunter wrote:

> I think you are missing the point of random forests. But if you just
> want to predict using the forest, there is a predict() method that you
> can use. Other than that, I certainly don't understand what you mean.
> Maybe someone else might.
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: [R-es] R igraph

2016-04-13 Thread Javier Marcuzzi
Dear all,

A few days ago, at Luisfo Chiroque's suggestion, I used this option:

udatos.simple <- simplify(udatos, edge.attr.comb = list(weight="sum"))

It occurred to me to look at "weight" to see what the sum came to, and to make 
that easy I put it in a data.frame as follows: 
head(get.data.frame(udatos.simple))
from to Descripcion B weight  structure(c("1", "1", 
"1", "1", "1", "1"), class = "AsIs")
1 Ficha 1022 Mes 10  NULL   NULL NULL   

Warning message:
In format.data.frame(x, digits = digits, na.encode = FALSE) :
  corrupt data frame: columns will be truncated or padded with NAs

  1
And I ran into this problem (the part shown in red).

This surprises me because the rest of the R processing works without 
problems; I even produced the plot. But since in the plot I see some nodes 
that are large compared with others, I thought of looking at the data.frame 
with the weights, sorting them from largest to smallest, and analyzing what 
lies behind the differences I see in the plot.

Any suggestions?

Another thing that catches my attention is the following: 

From some examples ...

# Collapse multiple links of the same type between the same two nodes
# by summing their weights, using aggregate() by "from", "to", & "type":
links <- aggregate(links[,3], links[,-3], sum)
links <- links[order(links$from, links$to),]
colnames(links)[4] <- "weight"
rownames(links) <- NULL

Versus
# g4 has two edges going from Jim to Jack, and a loop from John to himself.
# We can simplify our graph to remove loops & multiple edges between the same
# nodes.
# Use 'edge.attr.comb' to indicate how edge attributes are to be combined -
# possible options include "sum", "mean", "prod" (product), min, max,
# first/last (selects the first/last edge's attribute). Option "ignore" says
# the attribute should be disregarded and dropped.

g4s <- simplify( g4, remove.multiple = T, remove.loops = F, 
 edge.attr.comb=list(weight="sum", type="ignore") )

Some people suggest aggregate, because they say that simplify could make 
decisions of its own, so to speak, but in my tests the aggregate option 
"confuses" me with the results I get. As I understand it, when an element is 
repeated, it sums the third column, and I then discard that value (the third 
column), leaving only the computed sum. In my head that is the same as 
simplify: apply the sum function to the desired column, as a summary.

Have I understood this correctly?

Javier Rubén Marcuzzi
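
To make the aggregate-versus-simplify comparison concrete, here is a minimal
sketch on a made-up three-row edge list (the data and the names links, agg,
gs and simp are invented for the example, not taken from the poster's graph).
It assumes the igraph package from CRAN is installed and that the graph is
undirected.

```r
library(igraph)  # CRAN package; not part of base R

# Toy edge list (made up): a-b appears twice, b-c once, every row weight 1.
links <- data.frame(from   = c("a", "a", "b"),
                    to     = c("b", "b", "c"),
                    weight = 1)

# Route 1: sum the weights of duplicate rows before building any graph.
agg <- aggregate(weight ~ from + to, data = links, FUN = sum)

# Route 2: build the multigraph, then let simplify() merge parallel edges.
# The unnamed "ignore" entry drops every edge attribute not listed by name.
g    <- graph_from_data_frame(links, directed = FALSE)
gs   <- simplify(g, edge.attr.comb = list(weight = "sum", "ignore"))
simp <- igraph::as_data_frame(gs, what = "edges")

print(agg)
print(simp)
```

On this toy input both routes give a weight of 2 for a-b and 1 for b-c, which
matches the reading in the message above. One difference worth keeping in
mind: aggregate() treats the rows a,b and b,a as different pairs, whereas
simplify() on an undirected graph merges them into one edge.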

From: Javier Marcuzzi
Sent: Friday, April 1, 2016 12:56
To: Luisfo Chiroque
CC: r-help-es@r-project.org
Subject: Re: [R-es] R igraph

Dear Luisfo Chiroque,

Many thanks. I think I understood it, although right now I cannot test it in 
order to say "done". 

For what it's worth, I don't know whether fastgreedy is the function I need, 
but since my goal is to produce a plot that groups related elements, I will go 
through a lot of trial and error until I find a form that is visually 
understandable to non-statisticians, with statistics that support those 
relationships and groupings.

Many thanks

Javier Rubén Marcuzzi

On April 1, 2016, 10:03, Luisfo Chiroque wrote:
Dear Javier,

The problem with simplify is that it does not know how to merge edges unless 
you tell it explicitly.
I don't know whether by default it keeps the first or the last edge.
In any case, since this seems to be critical for your goal, there is a fix.
You just want to compute fastgreedy.community while taking into account 
whether there is more than one edge between two nodes. This function takes 
the weights into account if a 'weight' attribute exists.
1) Add a weight attribute to your graph, on all edges, with weight 1
E(udatos)$weight <- 1
2) Simplify the graph. By default, simplify sums the weight attributes, if 
they exist.
udatos.simple <- simplify(udatos, edge.attr.comb = list(weight="sum"))
But you can supply whatever function you want:
udatos.simple <- simplify(udatos, edge.attr.comb = list(weight=function(w) {1 / 
sum(w)} ))
Which one depends on whether you want multiple edges between two nodes to 
have a positive or a negative effect.
Likewise, you could add specific functions so that simplify knows how to 
combine the attributes of repeated edges:
udatos.simple <- simplify(udatos, edge.attr.comb = list(weight="sum", 
"Descripcion A"=function(descr) {…}, "DescripcionB"=function(descr) {...}))
3) Run fastgreedy.community
fastgreedy.community(udatos.simple)
If you need a more complex weighting, you can always pass it explicitly to 
the function:
fastgreedy.community(udatos.simple, weights = weights.vector)
where weights.vector is a vector of values of length 
ecount(udatos.simple), one value per edge.

I hope this helps and solves your problem.
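
The three steps above can be tried end to end on a small made-up graph. This
is only a sketch: the edge list el is invented (two triangles joined by one
edge, with the a-b edge doubled), udatos here is that toy graph rather than
the poster's data, and cluster_fast_greedy() is used because it is the current
igraph name for fastgreedy.community().

```r
library(igraph)  # CRAN package; not part of base R

# Made-up multigraph: triangles a-b-c and d-e-f joined by c-d,
# with the a-b edge listed twice.
el <- data.frame(from = c("a", "a", "a", "b", "c", "d", "d", "e"),
                 to   = c("b", "b", "c", "c", "d", "e", "f", "f"))
udatos <- graph_from_data_frame(el, directed = FALSE)

# Step 1: give every edge weight 1.
E(udatos)$weight <- 1

# Step 2: merge parallel edges, summing their weights; drop other attributes.
udatos.simple <- simplify(udatos,
                          edge.attr.comb = list(weight = "sum", "ignore"))

# Step 3: fast greedy modularity clustering; the 'weight' edge attribute
# is picked up automatically when no weights argument is given.
comm <- cluster_fast_greedy(udatos.simple)

print(E(udatos.simple)$weight)
print(membership(comm))
```

After step 2 the doubled a-b edge survives as a single edge of weight 2, so
the duplicate still counts toward the clustering instead of being silently
discarded.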

Un 

Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Bert Gunter
I think you are missing the point of random forests. But if you just
want to predict using the forest, there is a predict() method that you
can use. Other than that, I certainly don't understand what you mean.
Maybe someone else might.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Apr 13, 2016 at 2:11 PM, Michael Artz wrote:
> OK, is there a way to do it with a decision tree?  I just need to make the
> decision rules. Perhaps I can pick one of the trees used with random
> forest.  I am already somewhat familiar with random forest with respect
> to bagging and feature sampling and getting the mode from the leaf nodes, and
> with it being an ensemble technique of many trees.  I am just working from the
> perspective that I need decision rules, and I am working backward from that,
> and I need to do it in R.


Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Artz
Also, that being said, just because random forests are not the same thing as
decision trees does not mean that you can't get decision rules from a random
forest.
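
For what it is worth, here is a hedged sketch of what the randomForest package
exposes along these lines. It assumes the randomForest package from CRAN is
installed and uses the built-in iris data purely as a stand-in. Note that the
splits of a single member tree are not "the rules of the forest"; the forest's
answer for a record is the vote over all of its trees.

```r
library(randomForest)  # CRAN package; not part of base R

set.seed(1)
rf <- randomForest(Species ~ ., data = iris, ntree = 50)

# The splits of one member tree (here the first), one row per node.
# labelVar = TRUE shows variable names and class labels instead of codes.
tree1 <- getTree(rf, k = 1, labelVar = TRUE)
print(head(tree1))

# Class "probabilities" from the whole forest: for each record, the
# fraction of trees voting for each class.
votes <- predict(rf, newdata = iris[c(1, 51, 101), ], type = "prob")
print(votes)
```

So a per-record probability is available through predict(..., type = "prob"),
and individual trees can be inspected with getTree(), but turning a forest
into one short, readable rule list is not something either call does.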

On Wed, Apr 13, 2016 at 4:11 PM, Michael Artz wrote:

> OK, is there a way to do it with a decision tree?  I just need to make the
> decision rules. Perhaps I can pick one of the trees used with random
> forest.  I am already somewhat familiar with random forest with
> respect to bagging and feature sampling and getting the mode from the
> leaf nodes, and with it being an ensemble technique of many trees.  I am just
> working from the perspective that I need decision rules, and I am working
> backward from that, and I need to do it in R.


Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Artz
OK, is there a way to do it with a decision tree?  I just need to make the
decision rules. Perhaps I can pick one of the trees used with random
forest.  I am already somewhat familiar with random forest with
respect to bagging and feature sampling and getting the mode from the
leaf nodes, and with it being an ensemble technique of many trees.  I am just
working from the perspective that I need decision rules, and I am working
backward from that, and I need to do it in R.

On Wed, Apr 13, 2016 at 4:08 PM, Bert Gunter wrote:

> Nope.
>
> Random forests are not decision trees -- they are ensembles (forests)
> of trees. You need to go back and read up on them so you understand
> how they work. The Hastie/Tibshirani/Friedman "The Elements of
> Statistical Learning" has a nice explanation, but I'm sure there are
> lots of good web resources, too.
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Bert Gunter
Nope.

Random forests are not decision trees -- they are ensembles (forests)
of trees. You need to go back and read up on them so you understand
how they work. The Hastie/Tibshirani/Friedman "The Elements of
Statistical Learning" has a nice explanation, but I'm sure there are
lots of good web resources, too.
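
For the single-tree case, the leaf rules and their class probabilities can be pulled out of an rpart fit. The following is only a minimal sketch, assuming the kyphosis data that ships with rpart rather than the poster's own data; the object names (fit, rules, leaf_of_obs) are made up for illustration. A forest has no single rule set of this kind, because it averages over many such trees.

```r
library(rpart)

# Illustration data: kyphosis ships with rpart (not the poster's data).
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Leaf nodes are the rows of fit$frame whose split variable is "<leaf>".
is_leaf    <- fit$frame$var == "<leaf>"
leaf_nodes <- as.numeric(rownames(fit$frame)[is_leaf])

# path.rpart() returns, for each requested node, the splits leading to it.
rules <- path.rpart(fit, nodes = leaf_nodes, print.it = FALSE)

# Class probabilities per training observation, and the leaf each one falls in.
probs       <- predict(fit, type = "prob")
leaf_of_obs <- rownames(fit$frame)[fit$where]

for (node in names(rules)) {
  rule <- paste(rules[[node]][-1], collapse = " & ")   # drop the "root" entry
  p    <- probs[leaf_of_obs == node, , drop = FALSE][1, ]
  cat(sprintf("node %s: IF %s THEN %s\n", node, rule,
              paste(sprintf("P(%s)=%.2f", names(p), p), collapse = ", ")))
}
```

Each printed line is one leaf: the conjunction of splits on the way down, followed by the fitted class probabilities in that leaf.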

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Apr 13, 2016 at 1:40 PM, Michael Artz  wrote:
> Hi, I'm trying to get the top decision rules from a decision tree.
> Eventually I would like to do this with R and random forests.  There has to
> be a way to output the decision rules of each leaf node in an easily
> readable way. I am looking at the randomForest and rpart packages and I
> don't see anything yet.
> Mike
>



Re: [R] reduced set of alternatives in package mlogit

2016-04-13 Thread John Kane
To back up Bert's point, please have a look at
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
 and/or http://adv-r.had.co.nz/Reproducibility.html

John Kane
Kingston ON Canada


> -Original Message-
> From: jose.ferr...@logiteng.com
> Sent: Wed, 13 Apr 2016 17:18:35 +
> To: cdesj...@umn.edu
> Subject: Re: [R] reduced set of alternatives in package mlogit
> 
> 
> 
> code? example data?  We can only guess based on your vague post.
> 
> "PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code."
> 
> Moreover, this sounds like a statistical question, not a question about R
> programming, and so might be more appropriate for a statistical list like
> stats.stackexchange.com  .
> 
> Cheers,
> Bert
> 
> 
> Bert Gunter
> 
> 
> 
> Sorry if I was not clear enough, but there is hardly any code to show.
> The problem is that a parameter or function is lacking (or, most
> likely, I can't find it), so in some sense the problem itself is that
> there is no code to show.
> 
> In what follows choice situations , alternatives, wide, and variables
> have the same meaning that they have on the mlogit documentation. All
> variables are alternative specific.
> 
> 1)I want to estimate a multinomial Logit  using the mlogit package
> 
> 2)I have a dataset, made of choice situations
> 
> 3)There is a set of alternatives
> 
> 4)in some choice situations, not all alternatives were available, but
> only a subset of them. So there are no variables for the unavailable
> alternatives and the chosen alternative evidently  belongs to the set of
> available ones.
> 
> 5)I use mlogit.data to prepare the dataset from a "wide" dataframe .
> There is no option to have only a subset of alternatives and the
> resulting object will have them all , that is, there will be a line for
> every alternative and every choice situation, even if in reality some of
> them were not available. The variables of these alternatives did not
> exist, so must be filled with 0s or any other made up value
> 
> 6) If one estimates a model from this data, it will be wrong
> 
> 7) It is possible to get an "almost right" model by using a dummy
> variable marking which alternatives are unavailable, for as it is only
> used in alternatives that are never chosen, its coefficient will get
> negative with big absolute value, in practice giving almost 0%
> probability for them
> 
> 8)this is a workaround because it obligates the model to estimate a
> number that should be -infinity and this is known in advance, so it's
> ugly and difficult to know what the numeric consequences are as the
> coefficient can never converge. In fact, I don't use it the way I
> described for these reasons, preferring a more complex but almost
> equivalent formulation. The important point is that I want a clean
> solution, not a workaround
> 
> 9) I am simply asking whether the mlogit package has such functionality
> 
> 
> 
> 
> 
> Hi,
> I meant that in some choice situations some alternatives are
> missing, but the available alternatives are known to everybody (both to the
> one who made the choice and to whoever collected the data).
> For future reference, I would like to post here that I found the answer.
> Apparently it is not possible if one uses mlogit.data with shape =
> "wide", but it is if one uses it with shape = "long".
> So basically one can create an alternative-specific variable with
> availability (let's call it is_avaliable) and use mlogit.data normally,
> that is:
> all_avaliable <- mlogit.data(df, shape = "wide", …)
> then one can subset it
> real_avaliability <- all_avaliable[all_avaliable$is_avaliable, ]
> and send it through mlogit.data again with shape = "long"
> mlogit.data(real_avaliability, shape = "long", alt.var = "alt",
> chid.var = "chid", …)
> please observe that alt and chid will have been created by the first call
> to mlogit.data
> 

[R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Artz
Hi, I'm trying to get the top decision rules from a decision tree.
Eventually I would like to do this with R and random forests.  There has to
be a way to output the decision rules of each leaf node in an easily
readable way. I am looking at the randomForest and rpart packages and I
don't see anything yet.
Mike



[R] on the output of constrOptim()

2016-04-13 Thread Christophe Dutang
Dear list,

The following example of constrOptim() where the initial point is the solution 
shows that the component counts is not a two-element vector as documented in 
the man page. 

constrOptim(c(1,1), fr, grr, ui = diag(2), ci = c(0,0))

Does anyone have the same behavior?

A possible solution is to put line 69 in constrOptim.R before the first 
possible break line 67.
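
A self-contained way to check this is sketched below. It assumes that fr and grr are the Rosenbrock function and its gradient from the examples in ?optim, which the call above appears to rely on; what res$counts looks like will depend on the R version, so no output is shown.

```r
# Rosenbrock function and its gradient, as in the examples of ?optim
# (assumed to be the fr/grr used in the post).
fr <- function(x) {
  x1 <- x[1]; x2 <- x[2]
  100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
grr <- function(x) {
  x1 <- x[1]; x2 <- x[2]
  c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
     200 * (x2 - x1 * x1))
}

# Start exactly at the optimum (1, 1), inside the feasible region x >= 0.
res <- constrOptim(c(1, 1), fr, grr, ui = diag(2), ci = c(0, 0))

# The man page documents 'counts' as a two-element vector; inspect it.
str(res$counts)
length(res$counts)
```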

Regards, Christophe


> sessionInfo()
R version 3.2.4 (2016-03-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] nlstools_1.0-2     fitdistrplus_1.0-7 survival_2.38-3    MASS_7.3-45

loaded via a namespace (and not attached):
[1] tools_3.2.4   splines_3.2.4
---
Christophe Dutang
LMM, UdM, Le Mans, France
web: http://dutangc.free.fr 



Re: [R] reduced set of alternatives in package mlogit

2016-04-13 Thread Jose Marcos Ferraro


code? example data?  We can only guess based on your vague post.

"PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code."

Moreover, this sounds like a statistical question, not a question about R 
programming, and so might be more appropriate for a statistical list like 
stats.stackexchange.com  .

Cheers,
Bert


Bert Gunter



Sorry if I was not clear enough, but there is hardly any code to show.
The problem is that a parameter or function is lacking (or, most likely, I 
can't find it), so in some sense the problem itself is that there is no code 
to show.

In what follows choice situations , alternatives, wide, and variables have the 
same meaning that they have on the mlogit documentation. All variables are 
alternative specific.

1)I want to estimate a multinomial Logit  using the mlogit package

2)I have a dataset, made of choice situations

3)There is a set of alternatives

4)in some choice situations, not all alternatives were available, but only a 
subset of them. So there are no variables for the unavailable alternatives and 
the chosen alternative evidently  belongs to the set of available ones.

5)I use mlogit.data to prepare the dataset from a "wide" dataframe . There is 
no option to have only a subset of alternatives and the resulting object will 
have them all , that is, there will be a line for every alternative and every 
choice situation, even if in reality some of them were not available. The 
variables of these alternatives did not exist, so must be filled with 0s or any 
other made up value

6) If one estimates a model from this data, it will be wrong

7) It is possible to get an "almost right" model by using a dummy variable 
marking which alternatives are unavailable, for as it is only used in 
alternatives that are never chosen, its coefficient will get negative with big 
absolute value, in practice giving almost 0% probability for them

8)this is a workaround because it obligates the model to estimate a number that 
should be -infinity and this is known in advance, so it's ugly and difficult to 
know what the numeric consequences are as the coefficient can never converge. 
In fact, I don't use it the way I described for these reasons, preferring a 
more complex but almost equivalent formulation. The important point is that I 
want a clean solution, not a workaround

9) I am simply asking whether the mlogit package has such functionality





Hi,
I meant that in some choice situations some alternatives are missing, but 
the available alternatives are known to everybody (both to the one who made the 
choice and to whoever collected the data).
For future reference, I would like to post here that I found the answer.
Apparently it is not possible if one uses mlogit.data with shape = "wide", but 
it is if one uses it with shape = "long".
So basically one can create an alternative-specific variable with availability 
(let's call it is_avaliable) and use mlogit.data normally, that is:
all_avaliable <- mlogit.data(df, shape = "wide", …)
then one can subset it
real_avaliability <- all_avaliable[all_avaliable$is_avaliable, ]
and send it through mlogit.data again with shape = "long"
mlogit.data(real_avaliability, shape = "long", alt.var = "alt", chid.var = 
"chid", …)
please observe that alt and chid will have been created by the first call to 
mlogit.data
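
The subsetting idea can be sketched on toy data as follows. This is an illustration under assumptions, not the poster's actual code: the data frame, its column names (alt, chid, cost, is_available, choice) and the values are invented, and the data start out in long format, so the first "wide" call is skipped. The mlogit step only runs if the package is installed.

```r
# Hypothetical toy data: 3 choice situations x 3 alternatives, long format.
full <- expand.grid(alt = c("bus", "car", "train"), chid = 1:3,
                    stringsAsFactors = FALSE)
full$cost         <- c(2, 5, 3,   2, 6, NA,   NA, 4, 3)  # NA = not offered
full$is_available <- !is.na(full$cost)
full$choice       <- c(TRUE, FALSE, FALSE,
                       FALSE, TRUE, FALSE,
                       FALSE, FALSE, TRUE)

# Keep only the rows for alternatives that were actually available.
available <- full[full$is_available, ]

# Sanity check: every choice situation still has exactly one chosen row.
chosen_per_chid <- tapply(available$choice, available$chid, sum)

if (requireNamespace("mlogit", quietly = TRUE)) {
  # shape = "long" accepts choice situations with differing alternative sets.
  d <- mlogit::mlogit.data(available, choice = "choice", shape = "long",
                           alt.var = "alt", chid.var = "chid")
  # A model would then be fitted on d, e.g. with mlogit::mlogit().
}
```

The point of the design is that unavailable alternatives never enter the likelihood, so no dummy coefficient has to drift towards minus infinity.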


Re: [R-es] R y Excel - paquete openxlsx

2016-04-13 Thread Francisco Rodríguez
Many thanks, Isidro; this is of interest to me.
Regards

From: ihida...@jccm.es
To: r-help-es@r-project.org
Date: Wed, 13 Apr 2016 16:58:36 +0200
Subject: [R-es] R y Excel - paquete openxlsx

Good afternoon.

Occasionally someone has asked here about connecting R and Excel, and I
have recommended the "XLConnect" package.

Well, this mail is to recommend the "openxlsx" package instead.

"XLConnect" is based on Java and, when I have had to work with
considerable volumes of data, or with numerous Excel files, I have not
managed to load the data. From what I have read, the fault lies not with R
but with Java's memory management. It is possible to raise Java's memory
limits, but that has not been effective in my case.

"openxlsx" is based on C++ (it depends on the Rcpp package), has no memory
problems (as far as I have checked), is faster, and the code needed to read
Excel files, and above all to write to them, is simpler.

The only "catch" is that you have to install RTools, but since it is used
for other things, in the end you kill two birds with one stone.

In case it is of interest to anyone…

Regards

Isidro Hidalgo Arellano
Observatorio del Mercado de Trabajo
Consejería de Economía, Empresas y Empleo
http://www.castillalamancha.es/
 

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es 
  

Re: [R] formula argument evaluation

2016-04-13 Thread William Dunlap via R-help
%=>% would have precedence ('order of operations') problems also.

   A + B %=>% C

is equivalent to

  A + ( B %=>% C)

and I don't think that is what you want.
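
The grouping can be checked directly on the parsed expression. In this small sketch the `%=>%` definition is a hypothetical stand-in that just builds a string; it is not the foo() function from the thread.

```r
# Parse without evaluating: the top-level call is `+`, not %=>%.
e   <- quote(A + B %=>% C)
top <- as.character(e[[1]])
rhs <- e[[3]]
print(top)
print(rhs)

# Hypothetical operator that returns the relation as a string.
`%=>%` <- function(lhs, rhs) {
  paste(deparse(substitute(lhs)), "=>", deparse(substitute(rhs)))
}

# Explicit parentheses are needed to get the intended grouping.
s <- (A + B) %=>% C
print(s)
```

Because all %op% operators bind more tightly than +, the unparenthesized form would hand only B to the left side of the operator.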

as.list(quote(A + B %=>% C)) shows the first branch in the parse tree.  The
following function, str.language, shows the entire parse tree, as in

  > str.language(quote(A + B %=>% C))
  `quote(A + B %=>% C)` call(3): A + B %=>% C
    `` name(1): +
    `` name(1): A
    `` call(3): B %=>% C
      `` name(1): %=>%
      `` name(1): B
      `` name(1): C

str.language <-
function (object, ..., level = 0, name = myDeparse(substitute(object)))
{
    # Shorten long or multi-element strings for display.
    abbr <- function(string, maxlen = 25) {
        if (length(string) > 1 || nchar(string) > maxlen)
            paste(substring(string[1], 1, maxlen), "...", sep = "")
        else string
    }
    # Deparse language objects; summarize environments by name and contents.
    myDeparse <- function(object) {
        if (!is.environment(object)) {
            deparse(object)
        }
        else {
            ename <- environmentName(object)
            if (ename == "")
                ename <- ""
            paste(sep = "", "<", ename, "> ", paste(collapse = " ",
                objects(object)))
        }
    }
    # Indent by nesting level, then print name, class, length and value.
    cat(rep("  ", level), sep = "")
    if (is.null(name))
        name <- ""
    cat(sprintf("`%s` %s(%d): %s\n", abbr(name), class(object),
        length(object), abbr(myDeparse(object))))
    a <- attributes(object)
    # Recurse into the components of calls and other recursive objects.
    if (is.recursive(object) && !is.environment(object)) {
        object <- as.list(object)
        names <- names(object)
        for (i in seq_along(object)) {
            str.language(object[[i]], ..., level = level + 1,
                name = names[i])
        }
    }
    a$names <- NULL
    if (length(a) > 0) {
        str.language(a, level = level + 1, name = paste("Attributes of",
            abbr(name)))
    }
}



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Apr 12, 2016 at 11:59 PM, Adrian Dușa  wrote:

> I suppose it would work, although "=>" is rather a descriptive symbol and
> less a function.
> But choosing between quoting:
> "A + B => C"
> and a regular function:
> A + B %=>% C
> probably quoting is the most straightforward, as the result of the foo()
> function has to be a string anyways (which is parsed by other functions).
>
> On Tue, Apr 12, 2016 at 6:20 PM, Richard M. Heiberger 
> wrote:
>
> > Would making it regular function %=>%, using "%" instead of quotes,
> > work for you?
> >
> > On Tue, Apr 12, 2016 at 11:09 AM, Adrian Dușa 
> > wrote:
> > > On Tue, Apr 12, 2016 at 2:08 PM, Duncan Murdoch <
> > murdoch.dun...@gmail.com>
> > > wrote:
> > >> [...]
> > >>
> > >> It never gets to evaluating it.  It is not a legal R statement, so the
> > > parser signals an error.
> > >> If you want to pass arbitrary strings to a function, you need to put
> > them
> > > in quotes.
> > >
> > > I see. I thought it was parsed inside the function, but if it's parsed
> > > before then quoting is the only option.
> > >
> > >
> > > To Keith: no, I mean it like this "A + B => C" which is translated as:
> > > "the union of A and B is sufficient for C" in set theoretic language.
> > >
> > > The "=>" operator means sufficiency, while "<=" means necessity.
> Quoting
> > > the expression is good enough, I was just curious if the quotes could
> be
> > > made redundant, somehow.
> > >
> > > Thank you both,
> > > Adrian
> > >
> > > --
> > > Adrian Dusa
> > > University of Bucharest
> > > Romanian Social Data Archive
> > > Soseaua Panduri nr.90
> > > 050663 Bucharest sector 5
> > > Romania
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
> Adrian Dusa
> University of Bucharest
> Romanian Social Data Archive
> Soseaua Panduri nr.90
> 050663 Bucharest sector 5
> Romania
>

[R-es] R y Excel - paquete openxlsx

2016-04-13 Thread Isidro Hidalgo Arellano
Good afternoon.

Occasionally someone has asked here about connecting R and Excel, and I
have recommended the "XLConnect" package.

Well, this mail is to recommend the "openxlsx" package instead.

"XLConnect" is based on Java and, when I have had to work with
considerable volumes of data, or with numerous Excel files, I have not
managed to load the data. From what I have read, the fault lies not with R
but with Java's memory management. It is possible to raise Java's memory
limits, but that has not been effective in my case.

"openxlsx" is based on C++ (it depends on the Rcpp package), has no memory
problems (as far as I have checked), is faster, and the code needed to read
Excel files, and above all to write to them, is simpler.
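
As a rough illustration of how little code a round trip takes, here is a minimal sketch. The data frame and the temporary file are invented for the example, and the block only does anything if openxlsx is installed.

```r
# Invented example data.
df <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)

back <- NULL
if (requireNamespace("openxlsx", quietly = TRUE)) {
  path <- tempfile(fileext = ".xlsx")
  openxlsx::write.xlsx(df, path)               # write the data frame to a sheet
  back <- openxlsx::read.xlsx(path, sheet = 1) # read the first sheet back
  print(back)
}
```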

The only "catch" is that you have to install RTools, but since it is used
for other things, in the end you kill two birds with one stone.

In case it is of interest to anyone…

Regards

Isidro Hidalgo Arellano
Observatorio del Mercado de Trabajo
Consejería de Economía, Empresas y Empleo
http://www.castillalamancha.es/

 

 



Re: [R] R 3.2.4-revised is released

2016-04-13 Thread Peter Dalgaard
CRAN turned out to have structural issues with version numbers that are not of 
the x.y.z variety (some scripts break). I'm trying to find time to build a 3.2.5 
just to fix this up. Of course, all standard procedures are broken as 3.3.0 is 
now in progress, so several things now need to be done manually, which is 
"tedious and error-prone", as the saying goes.

-pd

On 13 Apr 2016, at 09:49 , Patrick Connolly  wrote:

> My CRAN mirror still says this:
> 
>  The latest release (Thursday 2016-03-10, Very Secure Dishes)
>  R-3.2.4.tar.gz, read what's new in the latest version.
> 
> Should that not be updated?  Anyone who has not seen that post won't
> know to look further.
> 
> 
> On Wed, 16-Mar-2016 at 08:39PM +, Peter Dalgaard wrote:
> 
> |> The 3.2.4 release had two annoyances which we would rather not have
> |> in an "ultra-stable" release, designed to hang around for the
> |> duration of the 3.3 series. One was a relatively minor Makefile
> |> issue affecting system using R's bundled lzma library. The other,
> |> rather more serious, affected printing and formatting of POSIXlt
> |> objects, which would unpredictably get the Daylight Savings Time
> |> wrong.
> 
> 
> |> Accordingly a revised version has been created.
> |> 
> |> You can get the source code from
> |> 
> |> http://cran.r-project.org/src/base/R-3/R-3.2.4-revised.tar.gz
> |> 
> |> or wait for it to be mirrored at a CRAN site nearer to you.
> |> 
> |> Maintainers of binary versions are requested to rebuild their binaries 
> using the revised sources.
> |> 
> |> 
> |> For the R Core Team,
> |> 
> |> Peter Dalgaard
> |> 
> |> New md5 sums are
> |> 
> |> MD5 (NEWS) = b0b43ac87a5b5858098da065966551af
> |> MD5 (R-3/R-3.2.4-revised.tar.gz) = 552b0c8088bab08ca4188797b919a58f
> |> 
> |> The relevant NEWS file entry is
> |> 
> |>   BUG FIXES:
> |> 
> |> • format.POSIXlt() behaved wrongly, e.g.,
> |>   format(as.POSIXlt(paste0(1940:2000,"-01-01"), tz="CET"),
> |>   usetz=TRUE) ended in two "CEST" time formats.
> |> 
> |> -- 
> |> Peter Dalgaard, Professor,
> |> Center for Statistics, Copenhagen Business School
> |> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> |> Phone: (+45)38153501
> |> Office: A 4.23
> |> Email: pd@cbs.dk  Priv: pda...@gmail.com
> |> 
> |> ___
> |> r-annou...@r-project.org mailing list
> |> https://stat.ethz.ch/mailman/listinfo/r-announce
> |> __
> |> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> |> https://stat.ethz.ch/mailman/listinfo/r-help
> |> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> |> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
>   ___Patrick Connolly   
> {~._.~}   Great minds discuss ideas
> _( Y )_Average minds discuss events 
> (:_~*~_:)  Small minds discuss people  
> (_)-(_) . Eleanor Roosevelt
> 
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] No color in plotting

2016-04-13 Thread PIKAL Petr
Hi

Without a reproducible example you will hardly get any answer.

If this works

library(ggplot2)
p <- ggplot(mtcars, aes(wt, mpg))
p + geom_point()

the problem is in your data.

If it does not, the problem is elsewhere, possibly a broken R installation.
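
One further check that needs no add-on packages is to draw to a file instead of the screen. This is only a sketch, on the assumption that a working file device points towards the on-screen device or its configuration; the file name and the plot are arbitrary.

```r
# Draw a simple base-graphics plot into a temporary PDF file.
out <- tempfile(fileext = ".pdf")
pdf(out)                                   # file device, needs no screen
plot(1:10, col = "red", pch = 19, main = "device check")
invisible(dev.off())                       # close the device to flush the file

# A non-empty file means base graphics itself is working.
ok <- file.exists(out) && file.size(out) > 0
print(ok)

# Devices still open in this session (NULL if none).
print(dev.list())
```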

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Michael
> Artz
> Sent: Wednesday, April 13, 2016 8:33 AM
> To: r-help@r-project.org
> Subject: [R] No color in plotting
>
> Hi, I am having a problem with plot() and ggplot().  When I call one of these
> functions, the plotting area starts to look as though it is working, but
> nothing ever becomes visible, unless it is a dendrogram.  With the bar chart,
> the plotting area just had an x and y axis and nothing else. I tried a bar
> chart with ggplot and I tried to plot a tree result from rpart().  I couldn't
> see anything plotted.  Is there some way I should be troubleshooting this?
> I'm thinking it's an R config I did or didn't do.  I really have no idea though.
>



Re: [R] R 3.2.4-revised is released

2016-04-13 Thread Patrick Connolly
My CRAN mirror still says this:

  The latest release (Thursday 2016-03-10, Very Secure Dishes)
  R-3.2.4.tar.gz, read what's new in the latest version.

Should that not be updated?  Anyone who has not seen that post won't
know to look further.


On Wed, 16-Mar-2016 at 08:39PM +, Peter Dalgaard wrote:

|> The 3.2.4 release had two annoyances which we would rather not have
|> in an "ultra-stable" release, designed to hang around for the
|> duration of the 3.3 series. One was a relatively minor Makefile
|> issue affecting system using R's bundled lzma library. The other,
|> rather more serious, affected printing and formatting of POSIXlt
|> objects, which would unpredictably get the Daylight Savings Time
|> wrong.


|> Accordingly a revised version has been created.
|> 
|> You can get the source code from
|> 
|> http://cran.r-project.org/src/base/R-3/R-3.2.4-revised.tar.gz
|> 
|> or wait for it to be mirrored at a CRAN site nearer to you.
|> 
|> Maintainers of binary versions are requested to rebuild their binaries using 
the revised sources.
|> 
|> 
|> For the R Core Team,
|> 
|> Peter Dalgaard
|> 
|> New md5 sums are
|> 
|> MD5 (NEWS) = b0b43ac87a5b5858098da065966551af
|> MD5 (R-3/R-3.2.4-revised.tar.gz) = 552b0c8088bab08ca4188797b919a58f
|> 
|> The relevant NEWS file entry is
|> 
|>   BUG FIXES:
|> 
|> • format.POSIXlt() behaved wrongly, e.g.,
|>   format(as.POSIXlt(paste0(1940:2000,"-01-01"), tz="CET"),
|>   usetz=TRUE) ended in two "CEST" time formats.
|> 
|> -- 
|> Peter Dalgaard, Professor,
|> Center for Statistics, Copenhagen Business School
|> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
|> Phone: (+45)38153501
|> Office: A 4.23
|> Email: pd@cbs.dk  Priv: pda...@gmail.com
|> 
|> ___
|> r-annou...@r-project.org mailing list
|> https://stat.ethz.ch/mailman/listinfo/r-announce
|> __
|> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
|> https://stat.ethz.ch/mailman/listinfo/r-help
|> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
|> and provide commented, minimal, self-contained, reproducible code.

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.


Re: [R] formula argument evaluation

2016-04-13 Thread Adrian Dușa
I suppose it would work, although "=>" is rather a descriptive symbol and
less a function.
But choosing between quoting:
"A + B => C"
and a regular function:
A + B %=>% C
probably quoting is the most straightforward, as the result of the foo()
function has to be a string anyways (which is parsed by other functions).

On Tue, Apr 12, 2016 at 6:20 PM, Richard M. Heiberger 
wrote:

> Would making it regular function %=>%, using "%" instead of quotes,
> work for you?
>
> On Tue, Apr 12, 2016 at 11:09 AM, Adrian Dușa 
> wrote:
> > On Tue, Apr 12, 2016 at 2:08 PM, Duncan Murdoch <
> murdoch.dun...@gmail.com>
> > wrote:
> >> [...]
> >>
> >> It never gets to evaluating it.  It is not a legal R statement, so the
> > parser signals an error.
> >> If you want to pass arbitrary strings to a function, you need to put
> them
> > in quotes.
> >
> > I see. I thought it was parsed inside the function, but if it's parsed
> > before then quoting is the only option.
> >
> >
> > To Keith: no, I mean it like this "A + B => C" which is translated as:
> > "the union of A and B is sufficient for C" in set theoretic language.
> >
> > The "=>" operator means sufficiency, while "<=" means necessity. Quoting
> > the expression is good enough, I was just curious if the quotes could be
> > made redundant, somehow.
> >
> > Thank you both,
> > Adrian
> >
> > --
> > Adrian Dusa
> > University of Bucharest
> > Romanian Social Data Archive
> > Soseaua Panduri nr.90
> > 050663 Bucharest sector 5
> > Romania
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>



-- 
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr.90
050663 Bucharest sector 5
Romania


[R] could not find function in memory inside foreach loop

2016-04-13 Thread cheng huimin
I'm trying to use the foreach function to do multicore computing in R.

A <- function() {
  foreach(i = 1:10) %dopar% {
    B()
  }
}

Then I call function A in the console. The problem is that B calls a
function, predictMatrix, that is defined in another script file which I
source. However, I get the following error:

Error in FUN(train_adjmt, iter = missedmat[i, 1], iter2 = missedmat[i,  :
  task 1 failed - "object 'predictMatrix' not found"

Then I tried adding predictMatrix to the .export argument of foreach:
.export=c("predictMatrix"). However, I then get the following warning:

Warning message:
In e$fun(obj, substitute(ex), parent.frame(), e$data) :
  already exporting variable(s): predictMatrix

Why can't R find this function, even though it is defined? How can I solve
this problem? Furthermore, I am using R version 3.2.3 on Windows.

Thanks,
Alice
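
One possible way around this, sketched under several assumptions: the doParallel backend with an explicit PSOCK cluster is used (the post does not say which backend is registered), and predictMatrix and B below are hypothetical stand-ins for the poster's functions, which would normally come from the sourced script. Instead of .export, the functions are copied to every worker with clusterExport() before the loop runs.

```r
library(parallel)
library(foreach)
library(doParallel)

# Hypothetical stand-ins for the functions defined in the sourced script.
predictMatrix <- function(i) i^2
B <- function(i) predictMatrix(i)

cl <- makeCluster(2)        # PSOCK cluster, the type available on Windows
registerDoParallel(cl)

# Copy both functions into each worker's global environment.
# environment() is the global environment when run at top level.
clusterExport(cl, c("B", "predictMatrix"), envir = environment())

A <- function() {
  foreach(i = 1:10, .combine = c) %dopar% B(i)
}

res <- A()
stopCluster(cl)
```

On Windows the workers are fresh R processes, so anything a task calls indirectly has to be shipped to them or sourced on them; clusterEvalQ(cl, source(...)) with the poster's script would be another option.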


[R] No color in plotting

2016-04-13 Thread Michael Artz
Hi, I am having a problem with plot() and ggplot().  When I call one of
these functions, the plotting area starts to look as though it is working,
but nothing ever becomes visible, unless the plot is a dendrogram.  With
the bar chart, the plotting area just had an x and y axis and nothing else.
I tried a bar chart with ggplot and I tried to plot a tree result from
rpart().  I couldn't see anything plotted.  Is there some way I should be
troubleshooting this?  I'm thinking it's an R config I did or didn't do.  I
really have no idea though.
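
A first troubleshooting step could look like the sketch below. It is only a guess at where to start, not a diagnosis: it closes any open devices, resets the colour palette, and draws a small made-up bar chart to a PDF file, so the result does not depend on the screen device. If the bars show up in the file, the screen device or a saved setting (for example in a startup profile) would be the next thing to check.

```r
# Close every open graphics device in case one is in an odd state.
graphics.off()

# Draw to a file device, which is independent of the on-screen window.
out <- tempfile(fileext = ".pdf")
pdf(out)

# Record the foreground/background colours in effect on a fresh device;
# if they were the same, drawn elements would be invisible.
fg <- par("fg")
bg <- par("bg")

# Restore the default colour palette in case it was changed.
palette("default")

# Made-up data, with an explicit fill colour.
barplot(c(a = 3, b = 5, c = 2), col = "steelblue", main = "device check")
dev.off()

# Open this file in a PDF viewer to see whether the bars are there.
print(out)
```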



Re: [R] RWeka Error

2016-04-13 Thread Rini John via R-help
Hi,

When I use any function of the RWeka package in RStudio I get an error:
"Error in .jnew (name): java.lang.ClassFormatError". Can anyone guide me
on this?

Operating system used: Linux 64 bit (CentOS)
Command used:
> data("crude")
> tdm <- TermDocumentMatrix(crude, control = list(tokenize = NGramTokenizer))
Packages loaded: tm and RWeka

Regards,
Rini John