Re: [R] X axis labels are not fully shown when using pareto.chart function

2020-09-15 Thread Jim Lemon
Hi Paul,
It is a contest between the quantity of information and legibility.
You have 140 labels to place on the x-axis in your plot and I barely
managed it by almost doubling the width of the plot, halving the size
of the font and truncating the labels to a maximum of 20 characters.
Unless you want to follow something like Google Maps, where the
information springs up in a little balloon, you are close to the
limits of a static display. That said, you are correct in saying that
staxlab will cram a lot of information at the edges of a plot.

Jim
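
The adjustments described above (a wider device, a smaller font, truncated
and rotated labels) can also be sketched in base graphics. This is only a
minimal sketch, not the qcc/plotrix code from the thread: the data are
simulated stand-ins for dataset2, and the names labs, pts, short_labels and
out_file are made up for the illustration.

```r
# Simulated stand-in for dataset2 (hypothetical data, not from the thread)
set.seed(1)
labs <- sprintf("%03d School with a fairly long name", 1:140)
pts  <- sort(sample(2:55, 140, replace = TRUE), decreasing = TRUE)

# Truncate every label to at most 20 characters
short_labels <- substr(labs, 1, 20)

# Write to a wide, short PDF so the example also runs without a display
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 13, height = 5)

# Enlarge the bottom margin to leave room for vertical labels
op <- par(mar = c(8, 4, 2, 1))

# barplot() returns the bar midpoints, which serve as label positions
mids <- barplot(pts, ylab = "Points")

# las = 2 draws the labels perpendicular to the axis; cex.axis = 0.5 halves the font
axis(1, at = mids, labels = short_labels, las = 2, cex.axis = 0.5, tick = FALSE)

par(op)
invisible(dev.off())
```

Note that axis() leaves out labels that would overlap their neighbours, so
with this many bars the width and cex.axis values still need some trial and
error.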

On Wed, Sep 16, 2020 at 11:36 AM Paul Bernal  wrote:
>
> Dear Jim,
>
> Thank you so much for your valuable and kind reply. I will try what you
> suggest and will let you and the group members know how that goes.
>
> Can the resolution you suggest be applied whenever I encounter an issue like
> the one I just described (to make x-axis labels clearly visible)? I imagine
> that if I work with some other x-axis labels I'd just have to play around
> with the width, height, srt and cex parameters?
>
> Best regards,
>
> Paul

Re: [R] X axis labels are not fully shown when using pareto.chart function

2020-09-15 Thread Paul Bernal
Dear Jim,

Thank you so much for your valuable and kind reply. I will try what you
suggest and will let you and the group members know how that goes.

Can the resolution you suggest be applied whenever I encounter an issue
like the one I just described (to make x-axis labels clearly visible)? I
imagine that if I work with some other x-axis labels I'd just have to play
around with the width, height, srt and cex parameters?

Best regards,

Paul


On Tue, Sep 15, 2020 at 7:06 PM, Jim Lemon <drjimle...@gmail.com> wrote:

> Hi Paul,
> This looks very familiar to me, but I'll send my previous suggestion.
>
> library(qcc)
> x11(width=13,height=5)
> pareto.chart(dataset2$Points,xaxt="n")
> library(plotrix)
> staxlab(1,at=seq(0.035,0.922,length.out=140),
>  labels=substr(dataset2$School,1,20),srt=90,cex=0.5)
>
> Jim

Re: [R] How to represent the effect of one covariate on regression results?

2020-09-15 Thread Ana Marija
Hi David,

Thanks for the useful insight. I did of course write to the plink user
group, but got no answer there. I guess they are more concerned with how
to run commands in plink as opposed to interpreting results.

What I can tell you about my cohort is that about 80% of cases had Type 2
diabetes while about 8% had Type 1. (My TD covariate refers to the type
of diabetes.) A description of the data is attached.

Cheers,
Ana

On Tue, Sep 15, 2020 at 7:59 PM David Winsemius  wrote:
>
>
> On 9/15/20 8:57 AM, Ana Marija wrote:
> > Hi Abby and David,
> >
> > Thanks for the useful tips! I will check those.
> >
> > I completed the regression analysis in plink (as R would be very slow
> > for my sample size) but as I mentioned I need to determine the
> > influence of a specific covariate in my results and Plink is of no
> > help there.
> >
> > I did Pearson correlation analysis for P values which I got in
> > regression with and without my covariate of interest and I got this:
> >
> >> cor.test(tt$P_TD, tt$P_noTD, method = "pearson", conf.level = 0.95)
> >  Pearson's product-moment correlation
> >
> > data:  tt$P_TD and tt$P_noTD
> > t = 20.17, df = 283, p-value < 2.2e-16
> > alternative hypothesis: true correlation is not equal to 0
> > 95 percent confidence interval:
> >   0.7156134 0.8117108
> > sample estimates:
> >       cor
> > 0.7679493
> >
> > I can see the p-values are highly correlated between those two runs. Can
> > I conclude then that my covariate doesn't have a huge effect, or what
> > kind of conclusion can I draw from that?
>
>
> I do not think it follows from the correlation of p-values that your
> covariate "does not have a huge effect". P-values are not really data,
> although they are random values. A simulation study of this would
> require a much better description of the original dataset. Again, that
> is something that the users of Plink are more likely to be able to
> intuit than are we. I still do not see why this question is not being
> addressed to the users of the software from which you are deriving your
> "data".
>
>
> --
>
> David.

Re: [R] How to represent the effect of one covariate on regression results?

2020-09-15 Thread David Winsemius



On 9/15/20 8:57 AM, Ana Marija wrote:

Hi Abby and David,

Thanks for the useful tips! I will check those.

I completed the regression analysis in plink (as R would be very slow
for my sample size) but as I mentioned I need to determine the
influence of a specific covariate in my results and Plink is of no
help there.

I did Pearson correlation analysis for P values which I got in
regression with and without my covariate of interest and I got this:


cor.test(tt$P_TD, tt$P_noTD, method = "pearson", conf.level = 0.95)

 Pearson's product-moment correlation

data:  tt$P_TD and tt$P_noTD
t = 20.17, df = 283, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
  0.7156134 0.8117108
sample estimates:
      cor
0.7679493

I can see the p-values are highly correlated between those two runs. Can
I conclude then that my covariate doesn't have a huge effect, or what
kind of conclusion can I draw from that?



I do not think it follows from the correlation of p-values that your 
covariate "does not have a huge effect". P-values are not really data, 
although they are random values. A simulation study of this would 
require a much better description of the original dataset. Again, that 
is something that the users of Plink are more likely to be able to 
intuit than are we. I still do not see why this question is not being 
addressed to the users of the software from which you are deriving your 
"data".



--

David.



Thanks for all your help
Ana



On Tue, Sep 15, 2020 at 1:26 AM David Winsemius  wrote:

There is a user-group for PLINK, easily found by looking at the page you
cited. This is not the correct place to submit such questions.


https://groups.google.com/g/plink2-users?pli=1


--

David.

On 9/14/20 6:29 AM, Ana Marija wrote:

Hello,

I was running association analysis using --glm genotypic from:
https://www.cog-genomics.org/plink/2.0/assoc with these covariates:
sex,age,PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10,TD,array,HBA1C. The
result looks like this:

  #CHROM  POS        ID          REF  ALT  A1  TEST           OBS_CT  BETA         SE          Z_OR_F_STAT  P            ERRCODE
  10      135434303  rs11101905  G    A    A   ADD            11863   -0.110733    0.0986981   -1.12193     0.261891     .
  10      135434303  rs11101905  G    A    A   DOMDEV         11863   0.079797     0.111004    0.718868     0.47         .
  10      135434303  rs11101905  G    A    A   sex=Female     11863   -0.120404    0.0536069   -2.24605     0.0247006    .
  10      135434303  rs11101905  G    A    A   age            11863   0.00524501   0.00391528  1.33963      0.180367     .
  10      135434303  rs11101905  G    A    A   PC1            11863   -0.0191779   0.0166868   -1.14928     0.25044      .
  10      135434303  rs11101905  G    A    A   PC2            11863   -0.0269939   0.0173086   -1.55957     0.118863     .
  10      135434303  rs11101905  G    A    A   PC3            11863   0.0115207    0.0168076   0.685448     0.493061     .
  10      135434303  rs11101905  G    A    A   PC4            11863   9.57832e-05  0.0124607   0.0076868    0.993867     .
  10      135434303  rs11101905  G    A    A   PC5            11863   -0.00191047  0.00543937  -0.35123     0.725416     .
  10      135434303  rs11101905  G    A    A   PC6            11863   -0.0103309   0.0159879   -0.646172    0.518168     .
  10      135434303  rs11101905  G    A    A   PC7            11863   0.00790997   0.0144025   0.549207     0.582863     .
  10      135434303  rs11101905  G    A    A   PC8            11863   -0.00205639  0.0142709   -0.144096    0.885424     .
  10      135434303  rs11101905  G    A    A   PC9            11863   -0.00873771  0.0057239   -1.52653     0.126878     .
  10      135434303  rs11101905  G    A    A   PC10           11863   0.0116197    0.0123826   0.938388     0.348045     .
  10      135434303  rs11101905  G    A    A   TD             11863   -0.670026    0.0962216   -6.96337     3.32228e-12  .
  10      135434303  rs11101905  G    A    A   array=Biobank  11863   0.160666     0.073631    2.18205      0.0291062    .
  10      135434303  rs11101905  G    A    A   HBA1C          11863   0.0265933    0.00168758  15.7583      6.0236e-56   .
  10      135434303  rs11101905  G    A    A   GENO_2DF       11863   NA           NA          0.726514     0.483613     .

These results are shown for just one ID (rs11101905); there are about 2
million of those in the resulting file.

My question is how do I present/plot the effect of covariate "TD" (in
the example it has "P" equal to 3.32228e-12) for all IDs in the
resulting file, so that I show how much effect covariate "TD" has on
the analysis? Should I run another regression without covariate "TD"
and then do a scatter plot of P values with and without the "TD" covariate,
or is there a better way to do this from the data I already have?

Thanks
Ana


[R] Cross-posted was Re: Unnesting JSON using R

2020-09-15 Thread David Winsemius

Cross-posting is deprecated on r-help. Please don't do it again.


And it was already answered on StackOverflow.


--

David.

On 9/15/20 1:13 PM, Fred Kwebiha wrote:

Source=https://jsonformatter.org/e038ec

The above is nested json.

I want the output to be as below
dataElements.name,dataElements.id,categoryOptionCombos.name,categoryOptionCombos.id

Any help in r?
*Best Regards,*

*FRED KWEBIHA*
*+256-782-746-154*


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] X axis labels are not fully shown when using pareto.chart function

2020-09-15 Thread Jim Lemon
Hi Paul,
This looks very familiar to me, but I'll send my previous suggestion.

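library(qcc)
# open a wide, short graphics window (13 x 5 inches)
x11(width=13,height=5)
# draw the Pareto chart, asking for the default x-axis to be suppressed
pareto.chart(dataset2$Points,xaxt="n")
library(plotrix)
# add the 140 labels, cut to 20 characters, rotated 90 degrees at half size
staxlab(1,at=seq(0.035,0.922,length.out=140),
 labels=substr(dataset2$School,1,20),srt=90,cex=0.5)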

Jim

On Wed, Sep 16, 2020 at 2:53 AM Paul Bernal  wrote:
>
> Dear friends,
>
> Hope you are doing well. I am currently using R version 3.6.2. I installed
> and loaded package qcc by Mr. Luca Scrucca.
>
> Hopefully someone can tell me if there is a workaround for the issue I am
> experiencing.
>
> I generated the pareto chart using qcc's pareto.chart function, but when
> the graph gets generated, the x-axis labels aren't fully shown (some labels
> get truncated), and so the text can't be viewed properly.
>
> Is there any way to adjust the x-axis labels (font and orientation), so
> that they can be easily shown?
>
> This is the structure of my data:
>
> str(dataset2)
> 'data.frame':   140 obs. of  2 variables:
>  $ School: Factor w/ 140 levels "24 de Diciembre",..: 39 29 66 16 67 116 35 106 65 17 ...
>  $ Points: num  55 43 24 21 20 20 18 17 16 16 ...
>
>
> Below is the dput() of my dataset.
>
> dput(dataset2)
> structure(list(School = structure(c(39L, 29L, 66L, 16L, 67L,
> 116L, 35L, 106L, 65L, 17L, 12L, 55L, 136L, 8L, 24L, 140L, 123L,
> 114L, 22L, 15L, 98L, 4L, 107L, 110L, 20L, 76L, 19L, 25L, 93L,
> 14L, 46L, 7L, 104L, 121L, 23L, 88L, 74L, 41L, 103L, 59L, 96L,
> 95L, 30L, 109L, 117L, 132L, 47L, 21L, 137L, 79L, 115L, 101L,
> 125L, 2L, 129L, 71L, 73L, 58L, 127L, 131L, 78L, 18L, 50L, 100L,
> 80L, 37L, 38L, 108L, 40L, 85L, 86L, 45L, 138L, 126L, 34L, 135L,
> 5L, 1L, 31L, 82L, 87L, 63L, 105L, 68L, 28L, 72L, 111L, 49L, 112L,
> 32L, 70L, 10L, 3L, 118L, 44L, 133L, 57L, 48L, 64L, 97L, 43L,
> 99L, 56L, 9L, 119L, 61L, 77L, 81L, 51L, 11L, 52L, 42L, 60L, 53L,
> 134L, 122L, 124L, 128L, 94L, 130L, 92L, 33L, 6L, 26L, 113L, 27L,
> 69L, 36L, 75L, 102L, 83L, 84L, 120L, 13L, 54L, 62L, 89L, 90L,
> 91L, 139L), .Label = c("24 de Diciembre", "Achiote", "Aguadulce",
> "Alcalde Díaz", "Alto Boquete", "Amador", "Amelia Denis de Icaza",
> "Ancón", "Antón", "Arnulfo Arias", "Arosemena", "Arraiján", "Bajo Boquete",
> "Barrio Balboa", "Barrio Colón", "Barrio Norte", "Barrio Sur",
> "Bejuco", "Belisario Frías", "Belisario Porras", "Bella Vista",
> "Betania", "Buena Vista", "Burunga", "Calidonia", "Cañaveral",
> "Canto del Llano", "Capira", "Cativá", "Cermeño", "Cerro Silvestre",
> "Chame", "Chepo", "Chicá", "Chilibre", "Chitré", "Ciricito",
> "Comarca Guna de Madugandí", "Cristóbal", "Cristóbal Este", "Curundú",
> "David", "Don Bosco", "El Arado", "El Caño", "El Chorrillo",
> "El Coco", "El Espino", "El Guabo", "El Harino", "El Higo", "El Llano",
> "El Roble", "El Valle", "Ernesto Córdoba Campos", "Escobal",
> "Feuillet", "Garrote o Puerto Lindo", "Guadalupe", "Herrera",
> "Hurtado", "Isla de Cañas", "Isla Grande", "Iturralde", "José Domingo Espinar",
> "Juan Demóstenes Arosemena", "Juan Díaz", "La Concepción", "La Ensenada",
> "La Laguna", "La Mesa", "La Raya de Calobre", "La Represa", "Las Cumbres",
> "Las Lajas", "Las Mañanitas", "Las Ollas Arriba", "Lídice", "Limón",
> "Los Díaz", "Los Llanitos", "María Chiquita", "Mateo Iturralde",
> "Miguel de la Borda", "Nombre de Dios", "Nueva Providencia",
> "Nuevo Chagres", "Nuevo Emperador", "Obaldía", "Ocú", "Olá",
> "Omar Torrijos", "Pacora", "Pajonal", "Palmas Bellas", "Parque Lefevre",
> "Pedasí", "Pedregal", "Penonomé", "Piña", "Playa Leona", "Pocrí",
> "Portobelo", "Pueblo Nuevo", "Puerto Armuelles", "Puerto Caimito",
> "Puerto Pilón", "Punta Chame", "Rio Abajo", "Río Abajo", "Río Grande",
> "Río Hato", "Río Indio", "Rufina Alfaro", "Sabanagrande", "Sabanitas",
> "Sajalices", "Salamanca", "San Carlos", "San Felipe", "San Francisco",
> "San José", "San Juan", "San Juan Bautista", "San Martín", "San Martín de Porres",
> "Santa Ana", "Santa Clara", "Santa Fe", "Santa Isabel", "Santa Rita",
> "Santa Rosa", "Santiago", "Santiago Este", "Tinajas", "Tocumen",
> "Veracruz", "Victoriano Lorenzo", "Villa Rosario", "Vista Alegre"
> ), class = "factor"), Points = c(55, 43, 24, 21, 20, 20, 18,
> 17, 16, 16, 15, 13, 13, 12, 12, 11, 11, 11, 11, 11, 10, 9, 9,
> 9, 9, 9, 8, 8, 8, 8, 8, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6,
> 6, 6, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4,
> 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA, -140L
> ), class = "data.frame")
>
> Best regards,
>
> Paul
>

Re: [R] [R-pkgs] salesforcer v0.2.2: An Implementation of Salesforce APIs Using Tidy Principles

2020-09-15 Thread H
On September 13, 2020 6:42:18 AM EDT, "Steven M. Mortimer" wrote:
>The {salesforcer} package allows users to query and analyze Salesforce data
>and administer their Org's records and object metadata (fields, triggers,
>layouts). It has been three years in the making to map multiple Salesforce
>Platform APIs for use in R. The package implements the REST, SOAP, Bulk 1.0,
>Bulk 2.0, Metadata, and Reports and Dashboards APIs. If you or your colleagues
>maintain or analyze Salesforce data, then I would greatly appreciate your
>use of and feedback on this package. Thank you.
>
>Sincerely,
>Steven M. Mortimer
>
>CRAN: https://cran.r-project.org/package=salesforcer
>GitHub: https://github.com/StevenMMortimer/salesforcer
>
>___
>R-packages mailing list
>r-packa...@r-project.org
>https://stat.ethz.ch/mailman/listinfo/r-packages
>

That sounds great. Is there any similar package that works with SuiteCRM or
SugarCRM?



[R] Unnesting JSON using R

2020-09-15 Thread Fred Kwebiha
Source=https://jsonformatter.org/e038ec

The above is nested json.

I want the output to be as below
dataElements.name,dataElements.id,categoryOptionCombos.name,categoryOptionCombos.id

Any help in R?
*Best Regards,*

*FRED KWEBIHA*
*+256-782-746-154*
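
Since the JSON behind the link is not reproduced in the thread, the sketch
below assumes a made-up structure in which each data element carries a
nested list of category option combos. It relies on fromJSON() from the
jsonlite package; the nesting (categoryCombo, categoryOptionCombos) and all
values are assumptions and would need adjusting to the real file.

```r
library(jsonlite)

# Hypothetical JSON; the real structure is not shown in the thread
txt <- '{
  "dataElements": [
    {"name": "DE one", "id": "de1",
     "categoryCombo": {"categoryOptionCombos": [
       {"name": "COC a", "id": "coc_a"},
       {"name": "COC b", "id": "coc_b"}]}},
    {"name": "DE two", "id": "de2",
     "categoryCombo": {"categoryOptionCombos": [
       {"name": "COC c", "id": "coc_c"}]}}
  ]
}'

# Keep the nested lists as lists instead of letting jsonlite simplify them
parsed <- fromJSON(txt, simplifyVector = FALSE)

# Build one small data frame per data element, one row per option combo
rows <- lapply(parsed$dataElements, function(de) {
  cocs <- de$categoryCombo$categoryOptionCombos
  data.frame(
    dataElements.name = de$name,
    dataElements.id = de$id,
    categoryOptionCombos.name = vapply(cocs, function(x) x$name, character(1)),
    categoryOptionCombos.id = vapply(cocs, function(x) x$id, character(1)),
    stringsAsFactors = FALSE
  )
})

# Stack the pieces into a single flat table
flat <- do.call(rbind, rows)
print(flat)
```

The element-level fields are recycled across the rows of each nested list,
which gives one flat row per category option combo.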



Re: [R] How to represent the effect of one covariate on regression results?

2020-09-15 Thread Abby Spurdle
> My question is how do I present/plot the effect of covariate "TD" in
> the example it has "P" equal to 3.32228e-12 for all IDs in the
> resulting file so that I show how much effect covariate "TD" has on
> the analysis. Should I run another regression without covariate "TD"

I'll take a second shot in the dark:

There is R^2, and a number of generalizations of it
(the most common of which is probably adjusted R^2).
And there are various other goodness-of-fit tests.

https://en.wikipedia.org/wiki/Goodness_of_fit
https://en.wikipedia.org/wiki/Coefficient_of_determination

You could fit two models (one with a particular variable included, and
one without), and compare how the statistic changes.
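
The two-model comparison suggested above can be sketched with ordinary
least squares. This is only an illustration on simulated data: the names td
and hba1c echo covariates mentioned in the thread, but the values, the
effect sizes and the use of lm() (rather than plink's --glm) are all
assumptions made for the example.

```r
set.seed(42)
n <- 500

# Simulated covariates and outcome (hypothetical, not the poster's data)
td    <- rbinom(n, 1, 0.8)
hba1c <- rnorm(n, mean = 7, sd = 1)
y     <- 1 - 0.7 * td + 0.03 * hba1c + rnorm(n)

# Fit the model without and with the covariate of interest
fit_without <- lm(y ~ hba1c)
fit_with    <- lm(y ~ hba1c + td)

# Compare adjusted R^2 between the two fits
adj_r2 <- c(without_td = summary(fit_without)$adj.r.squared,
            with_td    = summary(fit_with)$adj.r.squared)
print(adj_r2)

# F test for the nested models: does adding td improve the fit?
cmp <- anova(fit_without, fit_with)
print(cmp)
```

The change in adjusted R^2 and the F test from anova() both summarise what
the extra covariate contributes to a single model; with one model per
genetic variant the comparison would presumably have to be repeated per
variant.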

However, I'm probably going to get told off, for going off-topic.
So, unless any further questions are specific to R programming, I
don't think I'm going to contribute further.

Also, I'd recommend you read some notes on statistical modelling, or
consult an expert, or both.
And I suspect there are additional considerations when modelling genetic data.



Re: [R] question including crossover trials in meta-analysis

2020-09-15 Thread Belgers, V. (Vera)
Dear Mr. Schwartz and Ms. Gunter,
Thank you both for your replies. I did search Google but did not find these
sources, so thank you.
With kind regards,
Vera Belgers

From: Bert Gunter
Sent: Monday, 14 September 2020 21:14
To: Belgers, V. (Vera)
CC: r-help@R-project.org
Subject: Re: [R] question including crossover trials in meta-analysis

Did you first try a web search? -- you should always do this before posting 
here.

"meta-analysis in R" brought up this:

https://CRAN.R-project.org/view=MetaAnalysis

Have you looked at this task view yet?


Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Sep 14, 2020 at 11:50 AM Belgers, V. (Vera) <v.belg...@amsterdamumc.nl> wrote:
Dear sir/madam,
Thank you in advance for taking the time to read my question. I am currently
trying to conduct a meta-analysis combining parallel and crossover trials.
According to the Cochrane Handbook, I can include crossover trials by using
paired t-test statistics. So far, I have managed to conduct a meta-analysis
and forest plot of the parallel trials using the dmetar package, but I did
not succeed in including the crossover trials. I do have the raw data of
most of these crossover trials.
Does anybody know how to add crossover trials to the meta-analysis?
With kind regards,
Vera Belgers
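
For a crossover trial with raw data, a paired analysis amounts to working
with the within-participant differences. The sketch below, on made-up
numbers, derives a mean difference and its standard error that way and
pools it with a parallel-group estimate by inverse-variance weighting. It
ignores carry-over and period effects, uses a fixed-effect model for
brevity, and every value and variable name in it is hypothetical.

```r
# Hypothetical crossover trial: each participant measured under both conditions
treat   <- c(5.1, 6.3, 4.8, 7.0, 5.5, 6.1, 5.9, 6.4)
control <- c(4.6, 5.9, 4.9, 6.2, 5.0, 5.8, 5.1, 6.0)

# Paired analysis: mean and standard error of within-participant differences
d        <- treat - control
md_cross <- mean(d)
se_cross <- sd(d) / sqrt(length(d))

# Hypothetical parallel-group trial, summary statistics per arm
m1 <- 6.0; sd1 <- 1.2; n1 <- 30
m2 <- 5.4; sd2 <- 1.1; n2 <- 28
md_par <- m1 - m2
se_par <- sqrt(sd1^2 / n1 + sd2^2 / n2)

# Inverse-variance (fixed-effect) pooling of the two mean differences
est <- c(crossover = md_cross, parallel = md_par)
se  <- c(crossover = se_cross, parallel = se_par)
w   <- 1 / se^2
pooled    <- sum(w * est) / sum(w)
pooled_se <- sqrt(1 / sum(w))

print(round(c(pooled = pooled, se = pooled_se), 3))
```

Estimates and standard errors of this kind are the usual input to a generic
inverse-variance routine, for example the TE and seTE arguments of
metagen() in the meta package, from which a forest plot can then be drawn.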



[R] X axis labels are not fully shown when using pareto.chart function

2020-09-15 Thread Paul Bernal
Dear friends,

Hope you are doing well. I am currently using R version 3.6.2. I installed
and loaded package qcc by Mr. Luca Scrucca.

Hopefully someone can tell me if there is a workaround for the issue I am
experiencing.

I generated the pareto chart using qcc's pareto.chart function, but when
the graph gets generated, the x-axis labels aren't fully shown (some labels
get truncated), and so the text can't be viewed properly.

Is there any way to adjust the x-axis labels (font and orientation), so
that they can be easily shown?

This is the structure of my data:

str(dataset2)
'data.frame':   140 obs. of  2 variables:
 $ School: Factor w/ 140 levels "24 de Diciembre",..: 39 29 66 16 67 116 35 106 65 17 ...
 $ Points: num  55 43 24 21 20 20 18 17 16 16 ...


Below is the dput() of my dataset.

dput(dataset2)
structure(list(School = structure(c(39L, 29L, 66L, 16L, 67L,
116L, 35L, 106L, 65L, 17L, 12L, 55L, 136L, 8L, 24L, 140L, 123L,
114L, 22L, 15L, 98L, 4L, 107L, 110L, 20L, 76L, 19L, 25L, 93L,
14L, 46L, 7L, 104L, 121L, 23L, 88L, 74L, 41L, 103L, 59L, 96L,
95L, 30L, 109L, 117L, 132L, 47L, 21L, 137L, 79L, 115L, 101L,
125L, 2L, 129L, 71L, 73L, 58L, 127L, 131L, 78L, 18L, 50L, 100L,
80L, 37L, 38L, 108L, 40L, 85L, 86L, 45L, 138L, 126L, 34L, 135L,
5L, 1L, 31L, 82L, 87L, 63L, 105L, 68L, 28L, 72L, 111L, 49L, 112L,
32L, 70L, 10L, 3L, 118L, 44L, 133L, 57L, 48L, 64L, 97L, 43L,
99L, 56L, 9L, 119L, 61L, 77L, 81L, 51L, 11L, 52L, 42L, 60L, 53L,
134L, 122L, 124L, 128L, 94L, 130L, 92L, 33L, 6L, 26L, 113L, 27L,
69L, 36L, 75L, 102L, 83L, 84L, 120L, 13L, 54L, 62L, 89L, 90L,
91L, 139L), .Label = c("24 de Diciembre", "Achiote", "Aguadulce",
"Alcalde Díaz", "Alto Boquete", "Amador", "Amelia Denis de Icaza",
"Ancón", "Antón", "Arnulfo Arias", "Arosemena", "Arraiján", "Bajo Boquete",
"Barrio Balboa", "Barrio Colón", "Barrio Norte", "Barrio Sur",
"Bejuco", "Belisario Frías", "Belisario Porras", "Bella Vista",
"Betania", "Buena Vista", "Burunga", "Calidonia", "Cañaveral",
"Canto del Llano", "Capira", "Cativá", "Cermeño", "Cerro Silvestre",
"Chame", "Chepo", "Chicá", "Chilibre", "Chitré", "Ciricito",
"Comarca Guna de Madugandí", "Cristóbal", "Cristóbal Este", "Curundú",
"David", "Don Bosco", "El Arado", "El Caño", "El Chorrillo",
"El Coco", "El Espino", "El Guabo", "El Harino", "El Higo", "El Llano",
"El Roble", "El Valle", "Ernesto Córdoba Campos", "Escobal",
"Feuillet", "Garrote o Puerto Lindo", "Guadalupe", "Herrera",
"Hurtado", "Isla de Cañas", "Isla Grande", "Iturralde", "José Domingo Espinar",
"Juan Demóstenes Arosemena", "Juan Díaz", "La Concepción", "La Ensenada",
"La Laguna", "La Mesa", "La Raya de Calobre", "La Represa", "Las Cumbres",
"Las Lajas", "Las Mañanitas", "Las Ollas Arriba", "Lídice", "Limón",
"Los Díaz", "Los Llanitos", "María Chiquita", "Mateo Iturralde",
"Miguel de la Borda", "Nombre de Dios", "Nueva Providencia",
"Nuevo Chagres", "Nuevo Emperador", "Obaldía", "Ocú", "Olá",
"Omar Torrijos", "Pacora", "Pajonal", "Palmas Bellas", "Parque Lefevre",
"Pedasí", "Pedregal", "Penonomé", "Piña", "Playa Leona", "Pocrí",
"Portobelo", "Pueblo Nuevo", "Puerto Armuelles", "Puerto Caimito",
"Puerto Pilón", "Punta Chame", "Rio Abajo", "Río Abajo", "Río Grande",
"Río Hato", "Río Indio", "Rufina Alfaro", "Sabanagrande", "Sabanitas",
"Sajalices", "Salamanca", "San Carlos", "San Felipe", "San Francisco",
"San José", "San Juan", "San Juan Bautista", "San Martín", "San Martín de Porres",
"Santa Ana", "Santa Clara", "Santa Fe", "Santa Isabel", "Santa Rita",
"Santa Rosa", "Santiago", "Santiago Este", "Tinajas", "Tocumen",
"Veracruz", "Victoriano Lorenzo", "Villa Rosario", "Vista Alegre"
), class = "factor"), Points = c(55, 43, 24, 21, 20, 20, 18,
17, 16, 16, 15, 13, 13, 12, 12, 11, 11, 11, 11, 11, 10, 9, 9,
9, 9, 9, 8, 8, 8, 8, 8, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6,
6, 6, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA, -140L
), class = "data.frame")

Best regards,

Paul
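
One possible way to approach the label problem, sketched with base graphics
only rather than qcc's pareto.chart: draw the bars on a wide device, truncate
the labels, rotate them with las = 2 and shrink them with cex.names. The toy
data frame, the 20-character cut-off and the margin sizes are assumptions for
illustration; they are not taken from the real dataset2.

```r
# Toy data standing in for dataset2 (made-up values)
set.seed(1)
toy <- data.frame(
  School = paste("District name number", 1:30),
  Points = sort(sample(2:55, 30, replace = TRUE), decreasing = TRUE)
)

# Truncate long labels so they fit under the bars
labs <- substr(as.character(toy$School), 1, 20)

# A wide, short device leaves more horizontal room per label
pdf(tempfile(fileext = ".pdf"), width = 13, height = 5)
op <- par(mar = c(10, 4, 2, 1))          # larger bottom margin for rotated text
mids <- barplot(toy$Points, names.arg = labs,
                las = 2,                 # labels perpendicular to the axis
                cex.names = 0.5,         # half-size label font
                ylab = "Points")
par(op)
dev.off()
```

With many more categories the same three knobs (device width, label length,
cex.names) would probably need further tuning.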



Re: [R] How to represent the effect of one covariate on regression results?

2020-09-15 Thread Ana Marija
Hi Abby and David,

Thanks for the useful tips! I will check those.

I completed the regression analysis in plink (as R would be very slow
for my sample size) but as I mentioned I need to determine the
influence of a specific covariate in my results and Plink is of no
help there.

I did Pearson correlation analysis for P values which I got in
regression with and without my covariate of interest and I got this:

> cor.test(tt$P_TD, tt$P_noTD, method = "pearson", conf.level = 0.95)

Pearson's product-moment correlation

data:  tt$P_TD and tt$P_noTD
t = 20.17, df = 283, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.7156134 0.8117108
sample estimates:
  cor
0.7679493

I can see the P values are highly correlated between the two runs. Can
I conclude that my covariate doesn't have a large effect, or what
kind of conclusion can I draw from that?
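
A hedged sketch of one way to look at this: plot the two sets of P values
against each other on the -log10 scale and compute a rank correlation. Raw P
values are bounded and usually skewed, so a Pearson correlation on the raw
scale can hide large changes among the smallest P values. The numbers below
are simulated; only the column names P_TD and P_noTD and the 285 rows implied
by df = 283 come from the output above.

```r
set.seed(42)
n <- 285                                   # df = n - 2 = 283 in the output above
tt <- data.frame(P_noTD = runif(n))        # simulated, not the real P values
# perturb on the log scale so the two columns are related but not identical
tt$P_TD <- pmin(1, tt$P_noTD * exp(rnorm(n, sd = 0.5)))

x <- -log10(tt$P_noTD)
y <- -log10(tt$P_TD)
rho <- cor(x, y, method = "spearman")      # rank correlation, unaffected by the log

pdf(tempfile(fileext = ".pdf"))
plot(x, y, xlab = "-log10(P) without TD", ylab = "-log10(P) with TD")
abline(0, 1, lty = 2)                      # points far from this line changed most
dev.off()
```

Points far from the dashed line are the IDs whose evidence changed most when
TD was added, which a single correlation coefficient does not show.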

Thanks for all your help
Ana




[R] [R-pkgs] salesforcer v0.2.2: An Implementation of Salesforce APIs Using Tidy Principles

2020-09-15 Thread Steven M. Mortimer
The {salesforcer} package allows users to query and analyze Salesforce data
and administer their Org's records and object metadata (fields, triggers,
layouts). It has been three years in the making to map multiple Salesforce
Platform APIs for use in R. The package implements the REST, SOAP, Bulk 1.0, Bulk
2.0, Metadata, and Reports and Dashboards APIs. If you or your colleagues
maintain or analyze Salesforce data, then I would greatly appreciate your
use and feedback of this package. Thank you.

Sincerely,
Steven M. Mortimer

CRAN: https://cran.r-project.org/package=salesforcer
GitHub: https://github.com/StevenMMortimer/salesforcer

[[alternative HTML version deleted]]

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R-es] Interpreting the output of a GLM

2020-09-15 Thread Rubén Fernández Casal
Hi all,

I see several replies have already come in while I was looking up an answer
I gave to a student, which may be useful to you too:

The case of factor levels is what we covered in the course under regression
with categorical variables. The model includes coefficients that measure the
effect relative to a reference level (by default, the first one). See, for
example:
https://rubenfcasal.github.io/intror/modelos-lineales.html#regresion-con-variables-categoricas

for the linear regression case.
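
A minimal sketch of the reference-level coding described above, using a
made-up two-level factor (the names condicion and condicion2 are only for
illustration):

```r
# Two-level factor; levels sort alphabetically, so "a" is the reference
condicion <- factor(c("a", "b", "a", "b"))
X <- model.matrix(~ condicion)
colnames(X)          # the only non-intercept column is the "b" vs "a" contrast

# Changing the reference level changes which contrast is estimated
condicion2 <- relevel(condicion, ref = "b")
X2 <- model.matrix(~ condicion2)
colnames(X2)         # now the contrast is "a" vs "b"
```

The level that has no column of its own is absorbed into the intercept.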



As for model selection, in principle you could do the same as in the linear
case: exhaustive search or stepwise methods. The tools we saw for the linear
case can be extended to generalized linear models. See, for example:
https://rubenfcasal.github.io/intror/modelos-lineales-generalizados.html#seleccion-de-variables-explicativas-1
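
As a rough illustration of a stepwise method applied to a GLM, here is a
sketch with simulated data and stats::step(), which selects by AIC. The
variables x1, x2 and y are invented for the example; x2 has no true effect.

```r
set.seed(1)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
# binary response that depends on x1 only
d$y <- rbinom(200, 1, plogis(-0.5 + 1.2 * d$x1))

full <- glm(y ~ x1 + x2, family = binomial, data = d)
sel  <- step(full, trace = 0)    # backward/forward search by AIC, no printing
formula(sel)                     # formula of the selected model
```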



My recommendation would be to look for references to study this type of
model in more detail:

2006 - Extending The Linear Model With R. Generalized Linear, Mixed Effects
And Nonparametric Regression Models - Faraway

2019 - An R Companion to Applied Regression - John Fox, Sanford Weisberg

(try looking for them here: http://93.174.95.27/)


In case this is also useful...


Regards, Rubén.

P.S. Any suggestions for improving the material are welcome...


[R] FW: Loop for two columns and 154 rows

2020-09-15 Thread PIKAL Petr
Sorry, forgot to copy to r help.

Petr
> -Original Message-
> From: PIKAL Petr
> Sent: Tuesday, September 15, 2020 11:53 AM
> To: 'Hesham A. AL-bukhaiti' 
> Subject: RE: [R] Loop for two columns and 154 rows
> 
> Hi
> 
> Your mail is unreadable, post in plain text not HTML.
> 
> If I deciphered it correctly, you want all rows which have G1 in column 1
> and G2 in column 2, or G2 in column 1 and G1 in column 2, to produce 1;
> all other rows produce 0.
> 
> So if your data frame is named truth
> 
> truth$column3 <- ((truth[,1] =="G1" & truth[,2] =="G2") |
>                   (truth[,2] =="G1" & truth[,1] =="G2")) * 1
> 
> Cheers
> Petr
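
A small worked example of the expression above on made-up rows (the column
names V1 and V2 are hypothetical). Multiplying the logical vector by 1 turns
TRUE/FALSE into 1/0:

```r
truth <- data.frame(V1 = c("G1", "G2", "G1", "G3"),
                    V2 = c("G2", "G1", "G4", "G1"),
                    stringsAsFactors = FALSE)

# 1 where the row is (G1, G2) or (G2, G1), 0 otherwise
truth$column3 <- ((truth[, 1] == "G1" & truth[, 2] == "G2") |
                  (truth[, 1] == "G2" & truth[, 2] == "G1")) * 1
truth$column3   # 1 1 0 0
```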


Re: [R-es] Interpreting the output of a GLM

2020-09-15 Thread Francisco Rodriguez Sanchez
Hi Juan,

First of all, I am not sure whether your response variable 'prop'
represents the proportion of germinated seeds. A proportion of 1 out of 2
is not the same as 50 out of 100, even though the germination percentage
is the same (50%). See
https://stats.stackexchange.com/questions/241983/using-proportions-directly-instead-of-cbind-in-glm-binomial-regression-is-th.



My preferred option for taking sample size into account is to supply the
number of germinated and the number of non-germinated seeds as the
response variable, that is

glm(cbind(germin, nogermin) ~ condicion etc, family = binomial)
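
To see why 1 out of 2 and 50 out of 100 differ, here is a minimal sketch with
made-up counts: both fits estimate the same proportion, but the standard
error of the estimate is much smaller with the larger sample. The column
names g and ng are hypothetical.

```r
small <- data.frame(g = 1,  ng = 1)    # 1 germinated out of 2
large <- data.frame(g = 50, ng = 50)   # 50 germinated out of 100

fit_small <- glm(cbind(g, ng) ~ 1, family = binomial, data = small)
fit_large <- glm(cbind(g, ng) ~ 1, family = binomial, data = large)

# same estimated probability in both fits (back-transformed from the logit)
p_small <- plogis(coef(fit_small))
p_large <- plogis(coef(fit_large))

# but very different uncertainty on the logit scale
se_small <- sqrt(diag(vcov(fit_small)))
se_large <- sqrt(diag(vcov(fit_large)))
```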

As for interpreting the parameters, in your model the intercept represents
the probability of germination (on the logit scale) when the condition is a
and HFe is 0. To interpret a model like this with interactions, I think the
best approach is to visualize it, for example using the effects or visreg
packages. I have some examples here
(https://github.com/Pakillo/LM-GLM-GLMM-intro/blob/trees/glm_binomial.pdf),
but there is much more information online.
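
As a hedged illustration of that interpretation, the coefficients reported in
the summary quoted at the end of this message can be back-transformed by hand
with plogis(). Under condition a the Condicionb terms are multiplied by zero
and drop out; under condition b they are added.

```r
# Coefficients copied from the quoted summary
b <- c("(Intercept)" = -1.17749, Condicionb = 1.97820,
       HFe = -0.20626, "Condicionb:HFe" = 1.22376)

# Linear predictor (logit scale) at HFe = 0, i.e. at the mean of HF
eta_a <- b[["(Intercept)"]]
eta_b <- b[["(Intercept)"]] + b[["Condicionb"]]

p_a <- plogis(eta_a)   # germination probability, condition a: about 0.24
p_b <- plogis(eta_b)   # germination probability, condition b: about 0.69

# Slope of HFe on the logit scale within each condition
slope_a <- b[["HFe"]]
slope_b <- b[["HFe"]] + b[["Condicionb:HFe"]]
```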

To obtain the estimated germination values under different moisture and
stratification conditions, I think the easiest way is to use the predict
function. You pass it a data frame with the moisture and stratification
values and you get the germination probability, with its uncertainty. If
you used HFe as the predictor, you must keep the same scale. If you want
the model to be applied to other data in the future, it is better to use HF
as is, or to use fixed reference points (e.g. HF = 1000 hours), as Carlos
suggests.
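
A sketch of that workflow on simulated data (all counts below are invented,
and the names hf_mean, hf_sd and nuevos are hypothetical). The point is that
the mean and standard deviation used to build HFe are stored, so that a raw
HF value such as 1000 hours can be put on the same scale before predicting.

```r
set.seed(123)
datos <- expand.grid(Condicion = factor(c("a", "b")),
                     HF = seq(0, 2160, by = 240))
hf_mean <- mean(datos$HF)                 # keep these two numbers:
hf_sd   <- sd(datos$HF)                   # they define the HFe scale
datos$HFe <- (datos$HF - hf_mean) / hf_sd

# simulate 50 seeds per row from an arbitrary "true" model
es_b <- datos$Condicion == "b"
eta  <- -1.2 + 2 * es_b - 0.2 * datos$HFe + 1.2 * es_b * datos$HFe
datos$germin   <- rbinom(nrow(datos), 50, plogis(eta))
datos$nogermin <- 50 - datos$germin

fit <- glm(cbind(germin, nogermin) ~ Condicion * HFe,
           family = binomial, data = datos)

# new data given in raw hours, converted with the stored mean and sd
nuevos <- data.frame(Condicion = factor(c("a", "b"),
                                        levels = levels(datos$Condicion)),
                     HF = 1000)
nuevos$HFe <- (nuevos$HF - hf_mean) / hf_sd

# predict on the logit scale, then back-transform estimate and interval
pr <- predict(fit, newdata = nuevos, type = "link", se.fit = TRUE)
nuevos$prob  <- plogis(pr$fit)
nuevos$lower <- plogis(pr$fit - 1.96 * pr$se.fit)
nuevos$upper <- plogis(pr$fit + 1.96 * pr$se.fit)
```

Building the interval on the logit scale and transforming afterwards keeps
the limits between 0 and 1.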

Hope this helps. Good luck

Paco


On 14/9/20 21:44, Juan Seco Lopez wrote:
> Dear community, I have some questions that I think are very basic, but this
> is my first foray into GLMs.
> I am fitting a binomial model to germination data. The model is very
> simple: I have a factor "Condicion" with two levels, "a" and "b" (soil
> moisture level). I also have an explanatory variable "HF" (chilling hours =
> stratification) that ranges from 0 to 2160 (in the model this variable is
> HFe, standardized and centered).
> Here are my questions:
>   - How do I interpret the output? If, for example, I am under
> "condicion a", do I leave out the terms +1.97820*Condicionb and
> +1.22376*Condicionb*HFe?
>   - When I have to substitute HFe values into the formula, do I have to
> use the centered and standardized values, or can I use the "raw" HF
> values? I ask because I would lose reproducibility of the model if I use
> the centered data.
>
> I am including the summary of my model; thank you in advance for your help.
> Regards
>
>
> Call:
> glm(formula = prop ~ Condicion + HFe + Condicion * HFe, family = binomial,
>  data = datos)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -4.0365  -1.2027   0.0994   0.9577   3.4023
>
> Coefficients:
>                Estimate Std. Error z value Pr(>|z|)
> (Intercept)    -1.17749    0.04484 -26.262  < 2e-16 ***
> Condicionb      1.97820    0.06434  30.745  < 2e-16 ***
> HFe            -0.20626    0.04503  -4.581 4.64e-06 ***
> Condicionb:HFe  1.22376    0.06667  18.355  < 2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
>  Null deviance: 1923.94  on 139  degrees of freedom
> Residual deviance:  348.59  on 136  degrees of freedom
> AIC: 875.09
>
> Number of Fisher Scoring iterations: 4
>
>   [[alternative HTML version deleted]]
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es

-- 
Dr Francisco Rodríguez-Sánchez
https://frodriguezsanchez.net




[R] Loop for two columns and 154 rows

2020-09-15 Thread Hesham A. AL-bukhaiti via R-help
Dear R users, I have this code in R:

# this for loop does not work correctly (I tried)
out<-read.csv("outbr.csv")
truth<-out[,seq(1,2)]
truth<-cbind(as.character(truth[,1]),as.character(truth[,2])
             ,as.data.frame(rep(0,,dim(out)[1])));
for (j in 1:2) {
  for (i in 1:20) {
    truth[(truth[,1]== truth[j,i] & truth[,2]== truth[j,i+1]) |
          (truth[,1]== truth[j+1,i] & truth[,2]== truth[j+1,i+1]),3]<-1
  }
}
#truth<-out[,seq(1,2)]
#truth<-cbind(as.character(truth[,1]),as.character(truth[,2])
#             ,as.data.frame(rep(0,,dim(out)[1])));
#truth[(truth[,1]=="G2" & truth[,2]=="G1") | (truth[,1]=="G1" & truth[,2]=="G2"),3]<-1
##3

I have a file with two columns. The data in this file are text labels only
(G1, G2, G3, ... to G154). An element can be repeated, so we have 23562 rows
in two columns (for 154 elements), like:

Column1   column2   column3
G1        G4        0
G4        G6        0
G100      G7        1
G7        G100      1
.         .         .

I want to create the third column (1 or 0) based on this condition:
if truth[,1]=="G1" & truth[,2]=="G2" | truth[,1]=="G2" & truth[,2]=="G1",
write 1 in the third column, otherwise write 0. G1 and G2 are just an
example (what I actually want is to test whether each pair of elements has a
reciprocal relationship, G1 to G2 and G2 to G1, or not).

Best regards,
Hesham
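
The reciprocal-pair test described above can be sketched without a loop,
assuming the intent is "flag a row when the reversed pair also appears
somewhere in the table". The four rows are the ones shown in the example;
the column names and the "|" separator are choices made for this sketch.

```r
truth <- data.frame(Column1 = c("G1", "G4", "G100", "G7"),
                    Column2 = c("G4", "G6", "G7",   "G100"),
                    stringsAsFactors = FALSE)

# Build one key per row in each direction, e.g. "G100|G7" and "G7|G100"
forward  <- paste(truth$Column1, truth$Column2, sep = "|")
backward <- paste(truth$Column2, truth$Column1, sep = "|")

# A row is reciprocal when its forward key matches some row's reversed key
truth$column3 <- as.integer(forward %in% backward)
truth$column3   # 0 0 1 1
```

Note that a row pairing an element with itself (for example G5 with G5)
would also be flagged by this test, which may or may not be wanted.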






Re: [R] question including crossover trials in meta-analysis

2020-09-15 Thread Michael Dewey

Dear Vera

In addition to what you already have you might like to know about the 
mailing list specifically dedicated to meta-analysis in R.


https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis//

You might like to search the archives first, as this sort of issue does 
come up there. You are also likely to get more immediate help if you 
frame your question in terms of either meta or metafor, the two packages 
which dmetar uses. The authors of meta and metafor all frequent the list.


Michael

On 14/09/2020 20:41, Marc Schwartz via R-help wrote:

Hi,

Bert has pointed you to some R specific packages for meta-analyses via the Task 
View.

It sounds like you may need to first address some underlying conceptual issues, 
which strictly speaking, is off-topic for this list.

That being said, a quick Google search came up with some possible resources, 
beyond the Cochrane reference:

   https://academic.oup.com/ije/article/31/1/140/655940

   https://onlinelibrary.wiley.com/doi/abs/10.1002/jrsm.1236

   https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0133023


Perhaps with additional conceptual background, that might assist in enabling 
you to make use of the R packages that provide relevant functionality.

Another option, if you want to stay with the dmetar package, would be to 
contact the package maintainer for some guidance relative to how to use the 
functionality in the package given your specific use case.

Regards,

Marc Schwartz



On Sep 14, 2020, at 3:14 PM, Bert Gunter  wrote:

Did you first try a web search? -- you should always do this before posting
here.

"meta-analysis in R" brought up this:

https://CRAN.R-project.org/view=MetaAnalysis

Have you looked at this task view yet?


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Sep 14, 2020 at 11:50 AM Belgers, V. (Vera) <
v.belg...@amsterdamumc.nl> wrote:


Dear sir/madam,
Thank you in advance for taking the time to read my question. I am
currently trying to conduct a meta-analysis combining parallel and
crossover trials. According to the Cochrane Handbook, I can include
crossover trials by using t-paired statistics. So far, I have managed to
conduct a meta-analysis and forest plot of the parallel trials using the
dmetar package, but I did not succeed in including the crossover trials. I
do have the raw data of most of these crossover trials.
Does anybody know how to add crossover trials to the meta-analysis?
With kind regards,
Vera Belgers





--
Michael
http://www.dewey.myzen.co.uk/home.html



Re: [R] How to represent the effect of one covariate on regression results?

2020-09-15 Thread David Winsemius
There is a user-group for PLINK, easily found by looking at the page you 
cited. This is not the correct place to submit such questions.



https://groups.google.com/g/plink2-users?pli=1


--

David.

On 9/14/20 6:29 AM, Ana Marija wrote:

Hello,

I was running association analysis using --glm genotypic from:
https://www.cog-genomics.org/plink/2.0/assoc with these covariates:
sex,age,PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10,TD,array,HBA1C. The
result looks like this:

#CHROM  POS        ID          REF  ALT  A1  TEST           OBS_CT  BETA         SE          Z_OR_F_STAT  P            ERRCODE
10      135434303  rs11101905  G    A    A   ADD            11863   -0.110733    0.0986981   -1.12193     0.261891     .
10      135434303  rs11101905  G    A    A   DOMDEV         11863   0.079797     0.111004    0.718868     0.47         .
10      135434303  rs11101905  G    A    A   sex=Female     11863   -0.120404    0.0536069   -2.24605     0.0247006    .
10      135434303  rs11101905  G    A    A   age            11863   0.00524501   0.00391528  1.33963      0.180367     .
10      135434303  rs11101905  G    A    A   PC1            11863   -0.0191779   0.0166868   -1.14928     0.25044      .
10      135434303  rs11101905  G    A    A   PC2            11863   -0.0269939   0.0173086   -1.55957     0.118863     .
10      135434303  rs11101905  G    A    A   PC3            11863   0.0115207    0.0168076   0.685448     0.493061     .
10      135434303  rs11101905  G    A    A   PC4            11863   9.57832e-05  0.0124607   0.0076868    0.993867     .
10      135434303  rs11101905  G    A    A   PC5            11863   -0.00191047  0.00543937  -0.35123     0.725416     .
10      135434303  rs11101905  G    A    A   PC6            11863   -0.0103309   0.0159879   -0.646172    0.518168     .
10      135434303  rs11101905  G    A    A   PC7            11863   0.00790997   0.0144025   0.549207     0.582863     .
10      135434303  rs11101905  G    A    A   PC8            11863   -0.00205639  0.0142709   -0.144096    0.885424     .
10      135434303  rs11101905  G    A    A   PC9            11863   -0.00873771  0.0057239   -1.52653     0.126878     .
10      135434303  rs11101905  G    A    A   PC10           11863   0.0116197    0.0123826   0.938388     0.348045     .
10      135434303  rs11101905  G    A    A   TD             11863   -0.670026    0.0962216   -6.96337     3.32228e-12  .
10      135434303  rs11101905  G    A    A   array=Biobank  11863   0.160666     0.073631    2.18205      0.0291062    .
10      135434303  rs11101905  G    A    A   HBA1C          11863   0.0265933    0.00168758  15.7583      6.0236e-56   .
10      135434303  rs11101905  G    A    A   GENO_2DF       11863   NA           NA          0.726514     0.483613     .

These results are shown for just one ID (rs11101905); there are about 2
million of those in the resulting file.

My question is: how do I present/plot the effect of covariate "TD" (in
the example it has "P" equal to 3.32228e-12) for all IDs in the
resulting file, so that I show how much effect covariate "TD" has on
the analysis? Should I run another regression without covariate "TD"
and then do a scatter plot of P values with and without the "TD" covariate,
or is there a better way to do this from the data I already have?
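
One hedged starting point with the output already at hand is to pull out the
rows where TEST is "TD" and summarize their BETA and SE. The sketch below
pastes three rows from the table above in as text; with the real file the
same read.table() call would take a file path instead (no file name is given
in the post, so none is assumed here). The 1.96 multiplier is the usual
normal approximation for a 95% interval.

```r
txt <- "#CHROM POS ID REF ALT A1 TEST OBS_CT BETA SE Z_OR_F_STAT P ERRCODE
10 135434303 rs11101905 G A A ADD 11863 -0.110733 0.0986981 -1.12193 0.261891 .
10 135434303 rs11101905 G A A TD 11863 -0.670026 0.0962216 -6.96337 3.32228e-12 .
10 135434303 rs11101905 G A A HBA1C 11863 0.0265933 0.00168758 15.7583 6.0236e-56 ."

# comment.char = "" keeps the header line that starts with '#'
res <- read.table(text = txt, header = TRUE, comment.char = "",
                  check.names = FALSE, stringsAsFactors = FALSE)

# keep only the rows that report the TD covariate, one per ID
td <- res[res$TEST == "TD", c("ID", "BETA", "SE", "P")]

# approximate 95% interval for the TD coefficient in each ID's model
td$lower <- td$BETA - 1.96 * td$SE
td$upper <- td$BETA + 1.96 * td$SE
```

A histogram or interval plot of td$BETA across all IDs would then show how
stable the TD coefficient is from one model to the next.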

Thanks
Ana
