Re: [R] Seeking Assistance: Plotting Sea Current Vectors in R

2023-07-25 Thread PIKAL Petr
Not sure if it is correct, but the ChatGPT answer is:

Here are some popular R packages for sea current vectors:

oce: The "oce" package provides a wide range of functions for oceanographic 
data analysis, including handling sea current data. It allows you to work with 
current data in various formats and provides functions for visualization and 
basic analysis.

oceanmap: The "oceanmap" package is specifically designed for the analysis and 
visualization of oceanographic data. It includes functions for mapping 
oceanographic variables, including sea current vectors, using ggplot2-based 
plotting techniques.

rOceans: This package is part of the "rOpenSci" project and focuses on 
accessing various oceanographic datasets, including sea current data. It 
provides functions to download and work with sea current data from online 
sources.

marmap: While "marmap" is primarily designed for bathymetric data (depth 
measurements of the ocean), it can be useful for visualizing sea current 
vectors in the context of oceanographic maps.

ncdf4: The "ncdf4" package allows you to work with netCDF files, which are 
commonly used to store oceanographic data, including sea current data. It 
enables you to read, write, and manipulate netCDF files in R.

oceanoGrafia: The "oceanoGrafia" package provides tools for oceanographic data 
analysis, including sea current vectors. It supports various data formats and 
offers functions for data processing and visualization.

ocean: The "ocean" package is designed for oceanographic data analysis and 
visualization, including sea current data. It offers functions for data 
manipulation, transformation, and plotting.
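For the actual plotting question quoted below, a minimal sketch along these lines 
may be enough. It is untested: it assumes the df described in the question, uses 
ggplot2 with rnaturalearth for the country borders, and treats direction as the 
heading the current flows toward, in degrees clockwise from north.

# sketch only -- assumes df with longitude, latitude, sea_currents_mag,
# sea_currents_direction; rnaturalearth supplies country polygons
library(ggplot2)
library(rnaturalearth)
library(grid)   # for unit() used inside arrow()

scl <- 0.2   # arbitrary scaling so arrows are visible on a degree-based map
df$u <- scl * df$sea_currents_mag * sin(df$sea_currents_direction * pi / 180)
df$v <- scl * df$sea_currents_mag * cos(df$sea_currents_direction * pi / 180)

world <- ne_countries(scale = "medium", returnclass = "sf")

ggplot() +
  geom_sf(data = world, fill = "grey90", colour = "grey50") +
  geom_segment(data = df,
               aes(x = longitude, y = latitude,
                   xend = longitude + u, yend = latitude + v),
               arrow = arrow(length = unit(1.5, "mm"))) +
  coord_sf(xlim = c(23, 36), ylim = c(31, 37), expand = FALSE) +
  labs(x = "Longitude", y = "Latitude")

The u/v conversion assumes the oceanographic convention; adjust the scaling factor 
so the arrow lengths fit the map.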

Cheers Petr


-Original Message-
From: R-help  On Behalf Of konstantinos 
christodoulou
Sent: Tuesday, July 25, 2023 1:29 PM
To: r-help mailing list 
Subject: [R] Seeking Assistance: Plotting Sea Current Vectors in R

Dear R community,

I hope this email finds you well. I am writing to seek your assistance with a 
data visualization problem I am facing while working with R.

Problem Description:

I have a dataframe named "df" containing the following columns:
"longitude", "latitude", "sea_currents_mag", and "sea_currents_direction".
The dataframe includes sea current estimations, with information about 
magnitude (m/s) and direction (degrees) at various longitude and latitude 
coordinates. The study domain covers the Eastern Mediterranean Sea (23°E to
36°E) and extends from 31°N to 37°N. It is important to note that the longitude 
and latitude coordinates are not evenly spaced across the domain.

Objective: I am seeking guidance on how to create a plot that visualizes the 
sea current vectors (arrows) at each coordinate. Additionally, if possible, I 
would like the borders of the surrounding countries to be included in the plot 
to provide geographic context.

Specific Requests:

   1. Help with plotting sea current vectors (arrows) based on
   "sea_currents_mag" and "sea_currents_direction" at each corresponding
   longitude and latitude coordinate.
   2. Assistance with including the borders of the surrounding countries in
   the plot to provide geographic context.

I would highly appreciate any advice, code examples, or packages that could 
assist me in achieving this visualization goal.

Thank you very much for taking the time to read my email, and I look forward to 
any assistance you can provide.

Best regards,
Kostas

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting factors in graph panel

2023-07-07 Thread PIKAL Petr
Hallo Anupam

With

ggplot change axis label size into Google

the first answer I got was

axis.text theme

r - Change size of axes title and labels in ggplot2 - Stack Overflow 
<https://stackoverflow.com/questions/14942681/change-size-of-axes-title-and-labels-in-ggplot2>
 

 

so

 

ggplot(TrialData4, aes(x=Income, y=Percent, group=Measure)) + geom_point() +
  geom_line() + facet_wrap(~Measure) + theme(axis.text=element_text(size=5))

 

Should do the trick.
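If only the x-axis labels need shrinking (or rotating), a variant of the same idea 
is, for example:

ggplot(TrialData4, aes(x = Income, y = Percent, group = Measure)) +
  geom_point() +
  geom_line() +
  facet_wrap(~Measure) +
  # shrink and tilt only the x-axis labels
  theme(axis.text.x = element_text(size = 5, angle = 45, hjust = 1))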

 

S pozdravem | Best Regards

RNDr. Petr PIKAL
Vedoucí Výzkumu a vývoje | Research Manager

PRECHEZA a.s.
nábř. Dr. Edvarda Beneše 1170/24 | 750 02 Přerov | Czech Republic
Tel: +420 581 252 256 | GSM: +420 724 008 364
petr.pi...@precheza.cz | www.precheza.cz


 

From: Anupam Tyagi  
Sent: Friday, July 7, 2023 12:48 PM
To: PIKAL Petr 
Cc: r-help@r-project.org
Subject: Re: [R] Plotting factors in graph panel

 

Thanks! You are correct, the graphs look very similar, except ggplot is scaling 
the text font to make it more readable. Is there a way to scale down the x-axis 
labels, so they are readable?

 

On Fri, 7 Jul 2023 at 12:02, PIKAL Petr <petr.pi...@precheza.cz> wrote:

Hallo Anupam

I do not see much difference between ggplot and lattice; they seem to me to provide 
almost identical results when the theme part is removed from ggplot.

library(ggplot2)
library(lattice)

ggplot(TrialData4, aes(x=Income, y=Percent, group=Measure)) + geom_point() +
  geom_line() + facet_wrap(~Measure)

xyplot(Percent ~ Income | Measure, TrialData4,
   type = "o", pch = 16, as.table = TRUE, grid = TRUE)

So it is probably only a matter of preference which one you choose.

Cheers
Petr


> -Original Message-
> From: R-help <r-help-boun...@r-project.org> On Behalf Of Deepayan Sarkar
> Sent: Thursday, July 6, 2023 3:06 PM
> To: Anupam Tyagi <anupty...@gmail.com>
> Cc: r-help@r-project.org
> Subject: Re: [R] Plotting factors in graph panel
> 
> On Thu, 6 Jul 2023 at 15:21, Anupam Tyagi <anupty...@gmail.com> wrote:
> >
> > Btw, I think "lattice" graphics will provide a better solution than
> > "ggplot", because it puts appropriate (space saving) markers on the
> > axes and does axes labels well. However, I cannot figure out how to do
> > it in "lattice".
> 
> You will need to convert Income to a factor first. Alternatively, use
> dotplot() instead of xyplot(), but that will sort the levels wrongly, so 
> better to
> make the factor first anyway.
> 
> TrialData4 <- within(TrialData4,
> {
> Income <- factor(Income, levels = c("$10", "$25", "$40", "$75", "> $75"))
> })
> 
> xyplot(Percent ~ Income | Measure, TrialData4,
>type = "o", pch = 16, as.table = TRUE, grid = TRUE)
> 
> or
> 
> dotplot(Percent ~ Income | Measure, TrialData4,
> type = "o", as.table = TRUE)
> 
> This is not really any different from the ggplot() version though.
> Maybe you just don't like the effect of the '+ theme_classic()' part.
> 
> Best,
> -Deepayan
> 
> 
> > On Thu, 6 Jul 2023 at 15:11, Anupam Tyagi <anupty...@gmail.com> wrote:
> >
> > > Hi John:
> > >
> > > Thanks! Below is the data using your suggestion. I used "ggplot" to
> > > make a graph. I am not too happy with it. I am looking for something
> > > simpler and cleaner. Plot is attached.
> > >
> > > I also tried "lattice" package, but nothing got plotted with "xyplot"
> > > command, because it is looking for a numeric variable on x-axis.
> > >
> > > ggplot(TrialData4, aes(x=Income, y=Percent, group=Measure)) +
>

Re: [R] Plotting factors in graph panel

2023-07-07 Thread PIKAL Petr
Hallo Anupam

I do not see much difference between ggplot and lattice; they seem to me to provide 
almost identical results when the theme part is removed from ggplot.

library(ggplot2)
library(lattice)

ggplot(TrialData4, aes(x=Income, y=Percent, group=Measure)) + geom_point() +
  geom_line() + facet_wrap(~Measure)

xyplot(Percent ~ Income | Measure, TrialData4,
   type = "o", pch = 16, as.table = TRUE, grid = TRUE)

So it is probably only a matter of preference which one you choose.

Cheers
Petr


> -Original Message-
> From: R-help  On Behalf Of Deepayan Sarkar
> Sent: Thursday, July 6, 2023 3:06 PM
> To: Anupam Tyagi 
> Cc: r-help@r-project.org
> Subject: Re: [R] Plotting factors in graph panel
> 
> On Thu, 6 Jul 2023 at 15:21, Anupam Tyagi  wrote:
> >
> > Btw, I think "lattice" graphics will provide a better solution than
> > "ggplot", because it puts appropriate (space saving) markers on the
> > axes and does axes labels well. However, I cannot figure out how to do
> > it in "lattice".
> 
> You will need to convert Income to a factor first. Alternatively, use
> dotplot() instead of xyplot(), but that will sort the levels wrongly, so 
> better to
> make the factor first anyway.
> 
> TrialData4 <- within(TrialData4,
> {
> Income <- factor(Income, levels = c("$10", "$25", "$40", "$75", "> $75"))
> })
> 
> xyplot(Percent ~ Income | Measure, TrialData4,
>type = "o", pch = 16, as.table = TRUE, grid = TRUE)
> 
> or
> 
> dotplot(Percent ~ Income | Measure, TrialData4,
> type = "o", as.table = TRUE)
> 
> This is not really any different from the ggplot() version though.
> Maybe you just don't like the effect of the '+ theme_classic()' part.
> 
> Best,
> -Deepayan
> 
> 
> > On Thu, 6 Jul 2023 at 15:11, Anupam Tyagi  wrote:
> >
> > > Hi John:
> > >
> > > Thanks! Below is the data using your suggestion. I used "ggplot" to
> > > make a graph. I am not too happy with it. I am looking for something
> > > simpler and cleaner. Plot is attached.
> > >
> > > I also tried "lattice" package, but nothing got plotted with "xyplot"
> > > command, because it is looking for a numeric variable on x-axis.
> > >
> > > ggplot(TrialData4, aes(x=Income, y=Percent, group=Measure)) +
> > > geom_point()
> > > +
> > >   geom_line() + facet_wrap(~Measure) + theme_classic()
> > >
> > > > dput(TrialData4)structure(list(Income = c("$10", "$25", "$40",
> > > > "$75", "> $75",
> > > "$10", "$25", "$40", "$75", "> $75", "$10", "$25", "$40", "$75", ">
> > > $75", "$10", "$25", "$40", "$75", "> $75", "$10", "$25", "$40",
> > > "$75", "> $75", "$10", "$25", "$40", "$75", "> $75", "$10", "$25",
> > > "$40", "$75", "> $75", "$10", "$25", "$40", "$75", "> $75", "$10",
> > > "$25", "$40", "$75", "> $75", "$10", "$25", "$40", "$75", "> $75",
> > > "$10", "$25", "$40", "$75", "> $75", "$10", "$25", "$40", "$75", ">
> > > $75", "$10", "$25", "$40", "$75", "> $75", "$10", "$25", "$40",
> > > "$75", "> $75", "$10", "$25", "$40", "$75", "> $75", "$10", "$25",
> > > "$40", "$75", "> $75", "$10", "$25", "$40", "$75", "> $75", "$10",
> > > "$25", "$40", "$75", "> $75", "$10", "$25", "$40", "$75", "> $75",
> > > "$10", "$25", "$40", "$75", "> $75", "$10", "$25", "$40", "$75", ">
> > > $75", "$10", "$25", "$40", "$75", "> $75", "$10", "$25", "$40",
> > > "$75", "> $75", "$10", "$25", "$40", "$75", "> $75", "$10", "$25",
> > > "$40", "$75", "> $75", "$10", "$25", "$40", "$75", "> $75", "$10",
> > > "$25", "$40", "$75", "> $75", "$10", "$25", "$40", "$75", "> $75"
> > > ), Percent = c(3.052, 2.292, 2.244, 1.706, 1.297, 29.76, 28.79,
> > > 29.51, 28.9, 31.67, 31.18, 32.64, 34.31, 35.65, 37.59, 36, 36.27,
> > > 33.94, 33.74, 29.44, 46.54, 54.01, 59.1, 62.17, 67.67, 24.75, 24.4,
> > > 25, 24.61, 24.02, 25.4, 18.7, 29, 11.48, 7.103, 3.052, 2.292, 2.244,
> > > 1.706, 1.297, 29.76, 28.79, 29.51, 28.9, 31.67, 31.18, 32.64, 34.31,
> > > 35.65, 37.59, 36, 36.27, 33.94, 33.74, 29.44, 46.54, 54.01, 59.1,
> > > 62.17, 67.67, 24.75, 24.4, 25, 24.61, 24.02, 25.4, 18.7, 29, 11.48,
> > > 7.103, 3.052, 2.292, 2.244, 1.706, 1.297, 29.76, 28.79, 29.51, 28.9,
> > > 31.67, 31.18, 32.64, 34.31, 35.65, 37.59, 36, 36.27, 33.94, 33.74,
> > > 29.44, 46.54, 54.01, 59.1, 62.17, 67.67, 24.75, 24.4, 25, 24.61,
> > > 24.02, 25.4, 18.7, 29, 11.48, 7.103, 3.052, 2.292, 2.244, 1.706,
> > > 1.297, 29.76, 28.79, 29.51, 28.9, 31.67, 31.18, 32.64, 34.31, 35.65,
> > > 37.59, 36, 36.27, 33.94, 33.74, 29.44, 46.54, 54.01, 59.1, 62.17,
> > > 67.67, 24.75, 24.4, 25, 24.61, 24.02, 25.4, 18.7, 29, 11.48, 7.103),
> > > Measure = c("MF None", "MF None", "MF None", "MF None", "MF None",
> > > "MF Equity", "MF Equity", "MF Equity", "MF Equity", "MF Equity", "MF
> > > Debt", "MF Debt", "MF Debt", "MF Debt", "MF Debt", "MF Hybrid", "MF
> > > Hybrid", "MF Hybrid", "MF Hybrid", "MF Hybrid", "Bank None", "Bank
> > > None", "Bank None", "Bank None", "Bank None", "Bank Current", "Bank
> > > Current", "Bank Current", "Bank Current", "Bank Current", "Bank
> > > Savings", "Bank Savings", "Bank 

Re: [R] Plotting factors in graph panel

2023-07-03 Thread PIKAL Petr
Hi

I believe that facet_grid is quite close to what you expect.  

p <- ggplot(mpg, aes(displ, cty)) + geom_point()+geom_line()
p + facet_grid(vars(drv), vars(cyl))

You can inspect how mpg data is organized by head(mpg)

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Anupam Tyagi
> Sent: Monday, July 3, 2023 11:54 AM
> To: Jim Lemon 
> Cc: r-help mailing list 
> Subject: Re: [R] Plotting factors in graph panel
> 
> Attached is another example plot, that is better than the earlier one.
> 
> On Mon, 3 Jul 2023 at 15:21, Anupam Tyagi  wrote:
> 
> > I thought maybe I can share with you how the data looks in Excel, and
> > an example plot I found on the web that looks similar to what I want to 
> > plot.
> > These are attached to this email as *.png files. I am trying to see
> > (plot) how each row of data (percentages) varies with income, making
> > many small graphs in the same plot. For each row of data there will be
> > one graph. I can manually delete the "No Answer" rows in Excel, if
> > that is the best solution. I want the output to look like the
> > sparklines in column "I" of attached Excel screenshot, with labelling
> > of each graphs. This is similar to the attached "Example_plot". I
> > thought this could be done with Lattice, or base-R, or ggplot easily,
> > but this turning out to be more difficult than I had thought.
> >
> > On Mon, 3 Jul 2023 at 14:38, Anupam Tyagi  wrote:
> >
> >> Thanks Jim, thanks everyone. I was caught up with work and moving
> >> home, so a delay in response. I tried running the code you provided
> >> and it is not running well in my R-Studio setup. It is giving errors
> >> and not producing plots. I don't yet understand all the code well
> >> yet, so I need to work on it and then get back to you all. Sorry for
> >> not posting data from a R dataframe. My data is still in Excel. I
> >> organized data in Excel almost exactly (look wise) as the output from
> >> Stata log file (text) for a "tabulate" command for a survey dataset.
> >> I don't yet understand a good way to organize this data in R, so I
> >> cannot send it to you now. Let me do some work on this, understand
> >> the R code you have given, and get back to you in a few days. I have
> >> not been using R lately, but I think the graph I am trying to make
> >> will be done better and easier in R than in Stata. Thank you all for all 
> >> your
> help. Let me do some work and get back to you.
> >>
> >>
> >> On Fri, 30 Jun 2023 at 04:41, Jim Lemon  wrote:
> >>
> >>> Okay. Here is a modification that does four single line plots.
> >>>
> >>> at_df<-read.table(text=
> >>>  "Income MF MF_None MF_Equity MF_Debt MF_Hybrid Bank_None
> >>> Bank_Current Bank_Savings Bank_NA
> >>>  $10 1 3.05 29.76 31.18 36.0 46.54 24.75 25.4 3.307
> >>>  $25 2 2.29 28.79 32.64 36.27 54.01 24.4 18.7 2.891
> >>>  $40 3 2.24 29.51 34.31 33.94 59.1 25.0 29 13.4
> >>>  $75 4 1.71 28.90 35.65 33.74 62.17 24.61 11.48 1.746
> >>>  >$75 5 1.30 31.67 37.59 29.44 67.67 24.02 7.103 1.208
> >>>  No_Answer 9 2.83 36.77 33.15 27.25 60.87 21.09 13.46 4.577",
> >>>  header=TRUE,stringsAsFactors=FALSE)
> >>> at_df<-at_df[at_df$Income!="No_Answer",which(names(at_df)!="Bank_NA")]
> >>> png("Income_pcts.png",height=700)
> >>> par(mfrow=c(4,1))
> >>> plot(at_df[,"Bank_Current"],
> >>>  type="l",lwd=3,main="Bank_Current",
> >>>  xlab="Income",ylab="%",xaxt="n")
> >>> axis(1,at=1:5,labels=at_df$Income)
> >>> plot(at_df[,"Bank_Savings"],
> >>>  type="l",lwd=3,main="Bank_Savings",
> >>>  xlab="Income",ylab="%",xaxt="n")
> >>> axis(1,at=1:5,labels=at_df$Income)
> >>> plot(at_df[,"MF_Equity"],
> >>>  type="l",lwd=3,main="MF_Equity",
> >>>  xlab="Income",ylab="%",xaxt="n")
> >>> axis(1,at=1:5,labels=at_df$Income)
> >>> plot(at_df[,"MF_Debt"],
> >>>  type="l",lwd=3,main="MF_Debt",
> >>>  xlab="Income",ylab="%",xaxt="n")
> >>> axis(1,at=1:5,labels=at_df$Income)
> >>> dev.off()
> >>>
> >>> Jim
> >>>
> >>> On Thu, Jun 29, 2023 at 1:49 PM Anupam Tyagi 
> >>> wrote:
> >>> >
> >>> > Thanks, Pikal and Jim. Yes, it has been a long time Jim. I hope
> >>> > you
> >>> have
> >>> > been well.
> >>> >
> >>> > Pikal, thanks. Your solution may be close to what I want. I did
> >>> > not
> >>> know
> >>> > that I was posting in HTML. I just copied the data from Excel and
> >>> posted in
> >>> > the email in Gmail. The data is still in Excel, because I have not
> >>> > yet figured out what is a good way to organize it in R. I am
> >>> > posting it
> >>> again
> >>> > below as text. These are rows in Excel: 1,2,3,5,9 after MF are
> >>> > income categories and No Answer category (9). Down the second
> >>> > column are categories of MF and Bank AC. Rest of the columns are
> percentages.
> >>> >
> >>> > Jim, thanks for the graph. I am looking to plot only one line
> >>> (category)
> >>> > each in many small plots on the same page. I don't want to compare
> >>> > different categories on the same graph as you do, but see how each
> >>> category
> >>> > 

Re: [R] Help/documentation on Rgui

2023-07-03 Thread PIKAL Petr
Hi

I am not sure about opening Rgui from a terminal, but for customising the Rgui
appearance you can modify Rconsole and Rprofile or Rprofile.site, which you
should find in the etc folder of your R installation.

https://stat.ethz.ch/R-manual/R-devel/library/utils/html/Rconsole.html
https://rdrr.io/r/utils/Rconsole.html
and "Initialization at Start of an R Session" in R help
?Rprofile
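
A small sketch to locate those files from a running R session (the colour and font 
keys inside Rconsole are documented on the pages above):

# site-wide configuration files shipped in the etc folder of the installation
file.path(R.home("etc"), "Rconsole")
file.path(R.home("etc"), "Rprofile.site")
# a per-user copy, if you create one, can live in your home directory
path.expand("~/Rconsole")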

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Iago Giné
> Vázquez
> Sent: Monday, July 3, 2023 8:36 AM
> To: r-help@r-project.org
> Subject: [R] Help/documentation on Rgui
> 
> Hi all,
> 
> Where can I find a detailed document(ation) on the use of Rgui.exe. The
most
> detailed I found is https://cran.r-project.org/doc/manuals/r-release/R-
> ints.html#GUI-consoles, where there is almost nothing.
> 
> Actually I want to know how to open Rgui.exe (let's say, from a terminal
> [mainly in Windows], even better, through the ViM plugin NVim-R) with a
set
> of specific preferences, like a dark background or specific text colour
and size,
> which I see I can modify once it is open.
> 
> Thank you for your help.
> 
> Iago
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting factors in graph panel

2023-06-29 Thread PIKAL Petr
Hi Anupam

Using Jim's data

library(reshape2)
at_long <- melt(at_df)
at_long$innum <- as.numeric(as.factor(at_long$Income))
ggplot(at_long, aes(x=innum, y=value)) + geom_path() + facet_wrap(~variable, 
ncol=1)

is probably close to what you want. 

You need to fiddle with labels and facet variable names to fit your needs.
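For example, one way to put the original Income labels back on the x axis (a 
sketch; it assumes the five levels sort in this order after as.factor):

ggplot(at_long, aes(x = innum, y = value)) +
  geom_path() +
  facet_wrap(~variable, ncol = 1) +
  scale_x_continuous(breaks = 1:5,
                     labels = c("$10", "$25", "$40", "$75", "> $75"))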
Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Anupam Tyagi
> Sent: Thursday, June 29, 2023 5:49 AM
> To: r-help@r-project.org
> Subject: Re: [R] Plotting factors in graph panel
> 
> Thanks, Pikal and Jim. Yes, it has been a long time Jim. I hope you have been
> well.
> 
> Pikal, thanks. Your solution may be close to what I want. I did not know that 
> I
> was posting in HTML. I just copied the data from Excel and posted in the email
> in Gmail. The data is still in Excel, because I have not yet figured out what 
> is a
> good way to organize it in R. I am posting it again below as text. These are
> rows in Excel: 1,2,3,5,9 after MF are income categories and No Answer
> category (9). Down the second column are categories of MF and Bank AC.
> Rest of the columns are percentages.
> 
> Jim, thanks for the graph. I am looking to plot only one line (category) each 
> in
> many small plots on the same page. I don't want to compare different
> categories on the same graph as you do, but see how each category varies by
> income, one category in each graph. Like Excel does with Sparklines (Top
> menu: Insert, Sparklines, Lines). I have many categories for many variables. I
> am only showing two MF and Bank AC.
> 
> Income        $10    $25    $40    $75   > $75  No Answer
> MF             1      2      3      4      5     9
> None      1   3.05   2.29   2.24   1.71   1.30   2.83
> Equity    2  29.76  28.79  29.51  28.90  31.67  36.77
> Debt      3  31.18  32.64  34.31  35.65  37.59  33.15
> Hybrid    4  36.00  36.27  33.94  33.74  29.44  27.25
> Bank AC
> None      1  46.54  54.01  59.1   62.17  67.67  60.87
> Current   2  24.75  24.4   25     24.61  24.02  21.09
> Savings   3  25.4   18.7   29     11.48   7.103 13.46
> No Answer 9   3.307  2.891 13.4    1.746  1.208  4.577
> 
> 
> On Wed, 28 Jun 2023 at 17:30, Jim Lemon  wrote:
> 
> > Hi Anupam,
> > Haven't heard from you in a long time. Perhaps you want something like
> > this:
> >
> > at_df<-read.table(text=
> >  "Income MF MF_None MF_Equity MF_Debt MF_Hybrid Bank_None
> Bank_Current
> > Bank_Savings Bank_NA
> >  $10 1 3.05 29.76 31.18 36.0 46.54 24.75 25.4 3.307
> >  $25 2 2.29 28.79 32.64 36.27 54.01 24.4 18.7 2.891
> >  $40 3 2.24 29.51 34.31 33.94 59.1 25.0 29 13.4
> >  $75 4 1.71 28.90 35.65 33.74 62.17 24.61 11.48 1.746
> >  >$75 5 1.30 31.67 37.59 29.44 67.67 24.02 7.103 1.208
> >  No_Answer 9 2.83 36.77 33.15 27.25 60.87 21.09 13.46 4.577",
> >  header=TRUE,stringsAsFactors=FALSE)
> > at_df<-at_df[at_df$Income!="No_Answer",which(names(at_df)!="Bank_NA")]
> > png("MF_Bank.png",height=600)
> > par(mfrow=c(2,1))
> > matplot(at_df[,c("MF_None","MF_Equity","MF_Debt","MF_Hybrid")],
> >  type="l",col=1:4,lty=1:4,lwd=3,
> >  main="Percentages by Income and MF type",
> > xlab="Income",ylab="Percentage of group",xaxt="n")
> > axis(1,at=1:5,labels=at_df$Income)
> > legend(3,24,c("MF_None","MF_Equity","MF_Debt","MF_Hybrid"),
> >  lty=1:4,lwd=3,col=1:4)
> > matplot(at_df[,c("Bank_None","Bank_Current","Bank_Savings")],
> >  type="l",col=1:3,lty=1:4,lwd=3,
> >  main="Percentages by Income and Bank type",
> > xlab="Income",ylab="Percentage of group",xaxt="n")
> > axis(1,at=1:5,labels=at_df$Income)
> > legend(3,54,c("Bank_None","Bank_Current","Bank_Savings"),
> >  lty=1:4,lwd=3,col=1:3)
> > dev.off()
> >
> > Jim
> >
> > On Wed, Jun 28, 2023 at 6:33 PM Anupam Tyagi 
> wrote:
> > >
> > > Hello,
> > >
> > > I want to plot the following kind of data (percentage of respondents
> > from a
> > > survey) that varies by Income into many small *line* graphs in a
> > > panel of graphs. I want to omit "No Answer" categories. I want to
> > > see how each one of the categories (percentages), "None", " Equity",
> > > etc. varies by
> > Income.
> > > How can I do this? How to organize the data well and how to plot? I
> > thought
> > > Lattice may be a good package to plot this, but I don't know for
> > > sure. I prefer to do this in Base-R if possible, but I am open to
> > > ggplot. Any
> > ideas
> > > will be helpful.
> > >
> > > Income        $10    $25    $40    $75   > $75  No Answer
> > > MF             1      2      3      4      5     9
> > > None      1   3.05   2.29   2.24   1.71   1.30   2.83
> > > Equity    2  29.76  28.79  29.51  28.90  31.67  36.77
> > > Debt      3  31.18  32.64  34.31  35.65  37.59  33.15
> > > Hybrid    4  36.00  36.27  33.94  33.74  29.44  27.25
> > > Bank AC
> > > None      1  46.54  54.01  59.1   62.17  67.67  60.87
> > > Current   2  24.75  24.4   25     24.61  24.02  21.09
> > > Savings   3  25.4   18.7   29     11.48   7.103 13.46
> > > No Answer 9   3.307  2.891 13.4    1.746  1.208  4.577
> > >
> > > Thanks.
> > > --
> > > Anupam.
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> 

Re: [R] horizontal grouped stacked plots and removing space between bars

2023-06-29 Thread PIKAL Petr
Hi Anamaria

Thanks for the data. 

See in line.

> -Original Message-
> From: R-help  On Behalf Of Ana Marija
> Sent: Wednesday, June 28, 2023 6:00 PM
> To: r-help 
> Subject: [R] horizontal grouped stacked plots and removing space between
> bars
> 
> I have code like this:
> 
> data <- read.csv("test1.csv", stringsAsFactors=FALSE, header=TRUE)
> # Graph
> myplot=ggplot(data, aes(fill=condition, y=value, x=condition)) +
> geom_bar(position="dodge", stat="identity", width=0.5) +
> scale_fill_manual(values=c("#7b3294", "#c2a5cf", "#a6dba0",
"#008837"))+

You have only 4 colours but you need 5, so I removed that line.

> ylab("Performance (ns/day)") +
> facet_wrap(~specie,nrow=3, labeller = label_wrap_gen(width = 85),
> strip.position="bottom") +
>theme_bw() +
>theme(panel.grid = element_blank(),
> panel.spacing = unit(0, "mm"),
> legend.title=element_blank(),
> axis.title.x = element_blank(),
> panel.grid.major = element_blank(),
> panel.grid.minor = element_blank(),
> panel.background = element_blank(),
> axis.ticks.x=element_blank(),
> axis.title.y = element_text(angle = 90, hjust = 0.5))
> 
> 
> 
> myplot + theme(panel.grid.major = element_blank(), panel.grid.minor =
> element_blank(), legend.title=element_blank(),
> panel.background = element_blank(), axis.title.x = element_blank(),
> axis.text.x=element_blank(), axis.ticks.x=element_blank(),
> axis.title.y = element_text(angle = 90, hjust = 0.5))

And here I got two plots, one above the other, so I presume this is what you
wanted. 

Spacing between bars
https://www.statology.org/ggplot2-space-between-bars/
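
A sketch of both requests together (same data and columns as above; nrow = 1 puts 
the facets side by side and width = 1 makes the bars touch):

ggplot(data, aes(fill = condition, y = value, x = condition)) +
  geom_bar(position = "dodge", stat = "identity", width = 1) +  # no gap between bars
  ylab("Performance (ns/day)") +
  facet_wrap(~specie, nrow = 1, labeller = label_wrap_gen(width = 85),
             strip.position = "bottom") +
  theme_bw() +
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())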

Cheers
Petr

> 
> And my data is this:
> 
> > dput(data)
> structure(list(specie = c("gmx mdrun -gpu_id 1 -ntomp 16 -s
> benchMEM.tpr -nsteps 1", "gmx mdrun -gpu_id 1 -ntomp 16 -s
> benchMEM.tpr -nsteps 1", "gmx mdrun -gpu_id 1 -ntomp 16 -s
> benchMEM.tpr -nsteps 1", "gmx mdrun -gpu_id 1 -ntomp 16 -s
> benchMEM.tpr -nsteps 1", "gmx mdrun -gpu_id 1 -ntomp 16 -s
> MD_15NM_WATER.tpr -nsteps 1", "gmx mdrun -gpu_id 1 -ntomp 16 -s
> MD_15NM_WATER.tpr -nsteps 1", "gmx mdrun -gpu_id 1 -ntomp 16 -s
> MD_15NM_WATER.tpr -nsteps 1", "gmx mdrun -gpu_id 1 -ntomp 16 -s
> MD_15NM_WATER.tpr -nsteps 1"), condition = c("Tesla P100-SYCL",
> "Tesla V100-SYCL", "Tesla P100-CUDA", "Tesla V100-CUDA", "Tesla
> P100-SYCL", "Tesla V100-SYCL", "Tesla P100-CUDA", "Tesla V100-CUDA"),
> value = c(75.8, 77.771, 63.297, 78.046, 34.666, 50.052, 32.07,
> 59.815)), class = "data.frame", row.names = c(NA, -8L))
> 
> How do I:
> 
>    1. have these two plots next to each other, not on top of each other
>    2. How do I remove spaces between those bars?
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting factors in graph panel

2023-06-28 Thread PIKAL Petr
Hi

You probably can use any package including base R for such plots.

1. Posting in HTML scrambles your data so they are barely readable.
2. Use dput(head(yourdata, 20)) and copy the output into your mail to show what
your data look like. Although it may seem unreadable, R will consume it freely.
3. If I understand correctly you should have an Income column, Percentage
column and Category column. If this is the case

ggplot(yourdata, aes(x=Income, y=Percentage)) + geom_line() +
facet_grid(~Category) 

should give you what you want. But without data it is hard to say.

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Anupam Tyagi
> Sent: Wednesday, June 28, 2023 10:34 AM
> To: R-help@r-project.org
> Subject: [R] Plotting factors in graph panel
> 
> Hello,
> 
> I want to plot the following kind of data (percentage of respondents from
a
> survey) that varies by Income into many small *line* graphs in a panel of
> graphs. I want to omit "No Answer" categories. I want to see how each one
of
> the categories (percentages), "None", " Equity", etc. varies by Income.
> How can I do this? How to organize the data well and how to plot? I
thought
> Lattice may be a good package to plot this, but I don't know for sure. I
prefer
> to do this in Base-R if possible, but I am open to ggplot. Any ideas will
be
> helpful.
> 
> Income        $10    $25    $40    $75   > $75  No Answer
> MF             1      2      3      4      5     9
> None      1   3.05   2.29   2.24   1.71   1.30   2.83
> Equity    2  29.76  28.79  29.51  28.90  31.67  36.77
> Debt      3  31.18  32.64  34.31  35.65  37.59  33.15
> Hybrid    4  36.00  36.27  33.94  33.74  29.44  27.25
> Bank AC
> None      1  46.54  54.01  59.1   62.17  67.67  60.87
> Current   2  24.75  24.4   25     24.61  24.02  21.09
> Savings   3  25.4   18.7   29     11.48   7.103 13.46
> No Answer 9   3.307  2.891 13.4    1.746  1.208  4.577
> 
> Thanks.
> --
> Anupam.
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Should help of estimate in t.test be corrected?

2023-04-03 Thread PIKAL Petr
Hallo, 
you are probably right that 

"the estimated mean or difference in means depending on whether it was a 
one-sample test or a two-sample test"

should be rephrased to

"the estimated mean or difference in means depending on whether it was a 
one-sample test, two-sample test or two sample paired test"

Cheers
Petr

> -Original Message-
> From: Samuel Granjeaud IR/Inserm 
> Sent: Monday, April 3, 2023 1:25 PM
> To: PIKAL Petr ; r-help@r-project.org
> Subject: Re: [R] Should help of estimate in t.test be corrected?
> 
> Hi
> 
> Thanks for your feedback. I didn't think about that.
> 
> Still, the mean difference is computed for paired, not because there are two
> samples. IMHO, the help should be updated.
> 
> Best,
> Samuel
> 
> Le 2023-04-03 à 12:10, PIKAL Petr a écrit :
> > Hi
> >
> > You need to use paired option
> >
> >> t.test(x=0:4, y=sample(5:9), paired=TRUE)$estimate
> > mean difference
> >   -5
> >
> > Cheers
> > Petr
> >
> >> -Original Message-
> >> From: R-help  On Behalf Of Samuel
> >> Granjeaud IR/Inserm
> >> Sent: Sunday, April 2, 2023 11:39 PM
> >> To: r-help@r-project.org
> >> Subject: [R] Should help of estimate in t.test be corrected?
> >>
> >> Hi,
> >>
> >> Not important, but IMHO the estimate component of the t.test holds an
> >> "estimate: the estimated mean or difference in means depending on whether
> >> it was a one-sample test or a two-sample test."
> >> was a one-sample test or a two-sample test."
> >>
> >>   > t.test(0:4)$estimate
> >> mean of x
> >>   2
> >>   > t.test(0:4, 5:9)$estimate
> >> mean of x mean of y
> >>   2 7
> >>
> >> Best,
> >> Samuel
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Should help of estimate in t.test be corrected?

2023-04-03 Thread PIKAL Petr
Hi

You need to use paired option

> t.test(x=0:4, y=sample(5:9), paired=TRUE)$estimate
mean difference
 -5

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Samuel Granjeaud
> IR/Inserm
> Sent: Sunday, April 2, 2023 11:39 PM
> To: r-help@r-project.org
> Subject: [R] Should help of estimate in t.test be corrected?
>
> Hi,
>
> Not important, but IMHO the estimate component of the t.test holds an
> estimate of mean of each group, never a difference. The doc says
> "estimate: the estimated mean or difference in means depending on whether it
> was a one-sample test or a two-sample test."
>
>  > t.test(0:4)$estimate
> mean of x
>  2
>  > t.test(0:4, 5:9)$estimate
> mean of x mean of y
>  2 7
>
> Best,
> Samuel
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rprofile.site and automatic installation of missing packages

2023-03-22 Thread PIKAL Petr
Hallo Duncan

Thanks for your hints; yes, the code in Rprofile.site is not executed. I am not 
sure if I understood the startup mechanism correctly and I am a bit puzzled.

A fresh installation (root C:\Program Files, write protected) has the files 
Rprofile.site and .Rconsole installed in its etc directory.
After starting R I can see that home directory is
> Sys.getenv("R_USER")
[1] "srvudst01.precheza.cz\\userdata\\PikalP\\Dokumenty"

.Rconsole or .Rprofile in this directory are executed but Rprofile.site located 
in this directory is not.

After fresh R installation to set all users to have the same startup I have two 
options:
Change Rprofile.site in ...R\etc directory
Put the same .Rprofile file into R_Home directory for each user

Am I right or are there any other options how to set R on startup for all users 
differently from factory fresh setting?

Sorry for my questions; it is something I have never done before, but I now need 
to resolve it in a way which fits our IT environment.

Best regards.
Petr

-Original Message-
From: Duncan Murdoch 
Sent: Tuesday, March 21, 2023 5:43 PM
To: PIKAL Petr ; r-help 
Subject: Re: [R] Rprofile.site and automatic installation of missing packages

On 21/03/2023 9:58 a.m., PIKAL Petr wrote:
> Hallo Duncan
>
> Tested but does not work so something other must be wrong.
>
> R version 4.2.2.
>> installed.packages()[,"Package"]
>base   boot  classcluster  codetools   
> compiler   datasetsforeign   graphics  grDevices  
>  grid KernSmooth
>  "base" "boot""class"  "cluster""codetools"   
>   "compiler" "datasets"  "foreign" "graphics""grDevices"  
>"grid"   "KernSmooth"
> lattice   MASS Matrixmethods   mgcv   
> nlme   nnet   parallel  rpartspatial  
>   splines  stats
>   "lattice" "MASS"   "Matrix"  "methods" "mgcv"   
>   "nlme" "nnet" "parallel""rpart"  "spatial"  
> "splines""stats"
>  stats4   survival  tcltk  tools   translations   
>utils
>"stats4" "survival""tcltk""tools" "translations"   
>  "utils"
>
> My Rprofile.site
> # Things you might want to change
> options(papersize="a4")
> options(help_type="html")
>
> library(utils)
> library(MASS)
>
> #**
> test <-(scan("pack.txt", character(), quote = ""))
> x<- utils::installed.packages()
> utils::install.packages(test[!test %in% x],
> repos="https://cloud.r-project.org")
>
> ##**
>
> Options are set and working.
> MASS should be loaded but is not
>
>> search()
> [1] ".GlobalEnv""package:stats" "package:graphics"  
> "package:grDevices" "package:utils" "package:datasets"  "package:methods" 
>   "Autoloads" "package:base"
>>
>
> Any suggestion where to look?

I'd add code to print the values of x and test to confirm that things are 
proceeding as you expect.  I don't know if print() or cat() will work there; 
you might need to use message().

For attaching packages, you should see ?Startup again:  this is done via
options() or an environment variable, not library() calls in the profile file.
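
A minimal sketch of that options() route for Rprofile.site (the defaultPackages 
mechanism described in ?Startup), so that e.g. MASS is attached in every session:

# in Rprofile.site: add MASS to the packages attached at startup
local({
  pkgs <- getOption("defaultPackages")
  options(defaultPackages = c(pkgs, "MASS"))
})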

Duncan Murdoch

>
> Best regards
> Petr
>
> -Original Message-
> From: Duncan Murdoch 
> Sent: Tuesday, March 21, 2023 1:55 PM
> To: PIKAL Petr ; r-help 
> Subject: Re: [R] Rprofile.site and automatic installation of missing
> packages
>
> ?Startup says:  "Note that when the site and user profile files are sourced 
> only the base package is loaded, so objects in other packages need to be 
> referred to by e.g. utils::dump.frames or after explicitly loading the 
> package concerned."
>
> So you need utils::installed.packages and utils::install.packages .
>
> Duncan Murdoch
>
> On 21/03/2023 8:04 a.m., PIKAL Petr wrote:
>> Dear all.
>>
>>
>>
>> I am trying to install missing (not installed) packages during
>> startup of R through code in Rprofile.site but I miserably failed and
>> I am not sure what I am doing wrong.
>>
>>
>>
>> R is installed to C:Program files but it i

Re: [R] Rprofile.site and automatic installation of missing packages

2023-03-21 Thread PIKAL Petr
Hallo Duncan

Tested, but it does not work, so something else must be wrong.

R version 4.2.2.
> installed.packages()[,"Package"]
  base   boot  classcluster  codetools  
 compiler   datasetsforeign   graphics  grDevices   
grid KernSmooth
"base" "boot""class"  "cluster""codetools" 
"compiler" "datasets"  "foreign" "graphics""grDevices" 
"grid"   "KernSmooth"
   lattice   MASS Matrixmethods   mgcv  
 nlme   nnet   parallel  rpartspatial
splines  stats
 "lattice" "MASS"   "Matrix"  "methods" "mgcv"  
   "nlme" "nnet" "parallel""rpart"  "spatial"  
"splines""stats"
stats4   survival  tcltk  tools   translations  
utils
  "stats4" "survival""tcltk""tools" "translations"  
  "utils"

My Rprofile.site
# Things you might want to change
options(papersize="a4")
options(help_type="html")

library(utils)
library(MASS)

#**
test <-(scan("pack.txt", character(), quote = ""))
x<- utils::installed.packages()
utils::install.packages(test[!test %in% x], repos="https://cloud.r-project.org")

##**

Options are set and working.
MASS should be loaded but is not

> search()
[1] ".GlobalEnv""package:stats" "package:graphics"  
"package:grDevices" "package:utils" "package:datasets"  "package:methods"   
"Autoloads" "package:base"
>

Any suggestion where to look?

Best regards
Petr

-Original Message-
From: Duncan Murdoch 
Sent: Tuesday, March 21, 2023 1:55 PM
To: PIKAL Petr ; r-help 
Subject: Re: [R] Rprofile.site and automatic installation of missing packages

?Startup says:  "Note that when the site and user profile files are sourced 
only the base package is loaded, so objects in other packages need to be 
referred to by e.g. utils::dump.frames or after explicitly loading the package 
concerned."

So you need utils::installed.packages and utils::install.packages .

Duncan Murdoch

On 21/03/2023 8:04 a.m., PIKAL Petr wrote:
> Dear all.
>
>
>
> I am trying to install missing (not installed) packages during startup
> of R through code in Rprofile.site but I miserably failed and I am not
> sure what I am doing wrong.
>
>
>
> R is installed to C:Program files but it is not writable for the
> users, therefore I cannot change Rprofile.site located in root etc
> directory. I however can put Rprofile.site in users home directory
> (Documents) and use it for R startup setting (partly).
>
> However I want for less experienced users to put a code here to check
> installed packages, check if some specified set of packages is
> installed and install them, but it is not working.
>
>
>
> The code in Rprofile.site is:
>
>
>
> #**
>
> test <- scan("pack.txt", character(), quote = "")
>
> inst <- installed.packages()
>
> install.packages(test[!test %in% inst],
> > repos="https://cloud.r-project.org")
>
> #**
>
>
>
> An example of pack.txt is e.g.
>
> ggplot2
>
> zoo
>
>
>
> but the code is not executed and packages are not installed. If I use
> this code after R starts, everything is OK and packages are installed
> to
>
>
>
>> Sys.getenv("R_LIBS_USER")
>
> [1] "C:\\Users\\PikalP\\AppData\\Local/R/win-library/4.2"
>
>>
>
> The same applies if I put e.g. library(MASS) in the Rprofile.site, the
> package is not loaded but after R is live, library(MASS) loads a package.
>
>
>
> So my question is What is the best way to check after fresh R
> installation if some predefined set of packages is installed and if
> not, perform an installation without user intervention in Windows environment?
>
>
>
> S pozdravem | Best Regards
>
> Petr
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rprofile.site and automatic installation of missing packages

2023-03-21 Thread PIKAL Petr
Thanks Duncan

I expected some silly mistake on my side.

Just for completeness, I finally tested the .Rprofile file with the code and it 
worked. So now the question is whether to use your suggestion or the .Rprofile 
way. I tend to use the Rprofile.site way, as it enables experienced users to 
modify their own .Rprofile to customise how R starts.

Best regards
Petr

> -Original Message-
> From: Duncan Murdoch 
> Sent: Tuesday, March 21, 2023 1:55 PM
> To: PIKAL Petr ; r-help 
> Subject: Re: [R] Rprofile.site and automatic installation of missing 
> packages
>
> ?Startup says:  "Note that when the site and user profile files are sourced 
> only
> the base package is loaded, so objects in other packages need to be referred 
> to
> by e.g. utils::dump.frames or after explicitly loading the package 
> concerned."
>
> So you need utils::installed.packages and utils::install.packages .
>
> Duncan Murdoch
>
> On 21/03/2023 8:04 a.m., PIKAL Petr wrote:
> > Dear all.
> >
> >
> >
> > I am trying to install missing (not installed) packages during startup
> > of R through code in Rprofile.site but I miserably failed and I am not
> > sure what I am doing wrong.
> >
> >
> >
> > R is installed to C:Program files but it is not writable for the
> > users, therefore I cannot change Rprofile.site located in root etc
> > directory. I however can put Rprofile.site in users home directory
> > (Documents) and use it for R startup setting (partly).
> >
> > However I want for less experienced users to put a code here to check
> > installed packages, check if some specified set of packages is
> > installed and install them, but it is not working.
> >
> >
> >
> > The code in Rprofile.site is:
> >
> >
> >
> > #**
> >
> > test <- scan("pack.txt", character(), quote = "")
> >
> > inst <- installed.packages()
> >
> > install.packages(test[!test %in% inst],
> > repos="https://cloud.r-project.org")
> >
> > #**
> >
> >
> >
> > An example of pack.txt is e.g.
> >
> > ggplot2
> >
> > zoo
> >
> >
> >
> > but the code is not executed and packages are not installed. If I use
> > this code after R starts, everything is OK and packages are installed
> > to
> >
> >
> >
> >> Sys.getenv("R_LIBS_USER")
> >
> > [1] "C:\\Users\\PikalP\\AppData\\Local/R/win-library/4.2"
> >
> >>
> >
> > The same applies if I put e.g. library(MASS) in the Rprofile.site, the
> > package is not loaded but after R is live, library(MASS) loads a package.
> >
> >
> >
> > So my question is What is the best way to check after fresh R
> > installation if some predefined set of packages is installed and if
> > not, perform an installation without user intervention in Windows
> environment?
> >
> >
> >
> > S pozdravem | Best Regards
> >
> > Petr
> >
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Rprofile.site and automatic installation of missing packages

2023-03-21 Thread PIKAL Petr
Dear all.

 

I am trying to install missing (not installed) packages during startup of R
through code in Rprofile.site but I miserably failed and I am not sure what
I am doing wrong.

 

R is installed to C:\Program Files but it is not writable for the users,
therefore I cannot change the Rprofile.site located in the root etc directory. I
can, however, put Rprofile.site in the user's home directory (Documents) and use
it for R startup settings (partly). 

However, for less experienced users I want to put code here that checks the
installed packages, checks whether a specified set of packages is present, and
installs any that are missing, but it is not working.

 

The code in Rprofile.site is:

 

#**

test <- scan("pack.txt", character(), quote = "")

inst <- installed.packages()

install.packages(test[!test %in% inst], repos="https://cloud.r-project.org")

#**

 

An example of pack.txt is e.g.

ggplot2

zoo

 

but the code is not executed and packages are not installed. If I use this
code after R starts, everything is OK and packages are installed to

 

> Sys.getenv("R_LIBS_USER")

[1] "C:\\Users\\PikalP\\AppData\\Local/R/win-library/4.2"

> 

The same applies if I put e.g. library(MASS) in the Rprofile.site: the package
is not loaded, but once R is running, library(MASS) loads the package.

 

So my question is: what is the best way, after a fresh R installation, to check
whether some predefined set of packages is installed and, if not, to install
them without user intervention in a Windows environment?

 

S pozdravem | Best Regards

Petr

ggplot2
zoo
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Shaded area

2023-03-01 Thread PIKAL Petr
Hallo

Excel attachments are not allowed here, but shading an area has been answered 
many times elsewhere. Use something like "shading area r" in Google.

See eg.
https://www.geeksforgeeks.org/how-to-shade-a-graph-in-r/
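
A minimal base-R sketch of the idea (the object names and exact values are 
placeholders, since the attachment did not come through):

# dates: Date vector for 2017-01-01 .. 2017-01-25; der: the series values
plot(dates, der, type = "n", xlab = "Date", ylab = "der")
usr <- par("usr")
rect(as.Date("2017-01-08"), usr[3], as.Date("2017-01-11"), usr[4],
     col = "grey90", border = NA)   # first positive period
rect(as.Date("2017-01-16"), usr[3], as.Date("2017-01-20"), usr[4],
     col = "grey90", border = NA)   # second positive period
abline(v = as.Date(c("2017-01-08", "2017-01-11",
                     "2017-01-16", "2017-01-20")), lty = 2)
lines(dates, der)
box()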

Cheers Petr

-Original Message-
From: R-help  On Behalf Of George Brida
Sent: Wednesday, March 1, 2023 3:21 PM
To: r-help@r-project.org
Subject: [R] Shaded area

Dear R users,

I have an xlsx file (attached to this mail) that shows the values of a "der" 
series observed on a daily basis from January 1, 2017 to January 25, 2017. This 
series is strictly positive during two periods: from January 8,
2017 to January 11, 2017 and from January 16, 2017 to January 20, 2017. I would 
like to plot the series with two shaded areas corresponding to the positivity 
of the series. Specifically, I would like to draw 4 vertical lines intersecting 
the x-axis in the 4 dates mentioned above and shade the two areas of 
positivity. Thanks for your help.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify the distribution of the data

2023-02-08 Thread PIKAL Petr
Hi

Others gave you more fundamental answers. To check the possible distribution
you could use this package:

https://cran.r-project.org/web/packages/fitdistrplus/index.html
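
A minimal sketch of how that package is typically used (x here is just the handful 
of example values from the question):

library(fitdistrplus)
x <- c(1, 9, 20, 51, 100)
descdist(x)                    # Cullen & Frey graph suggesting candidate families
fit_ln <- fitdist(x, "lnorm")  # fit one candidate, e.g. a log-normal
plot(fit_ln)                   # diagnostic plots
gofstat(fit_ln)                # goodness-of-fit statistics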

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Bogdan Tanasa
> Sent: Wednesday, February 8, 2023 5:35 PM
> To: r-help 
> Subject: [R] identify the distribution of the data
> 
> Dear all,
> 
> I do have dataframes with numerical values such as 1,9, 20, 51, 100 etc
> 
> Which way do you recommend to use in order to identify the type of the
> distribution of the data (normal, poisson, bernoulli, exponential,
log-normal etc
> ..)
> 
> Thanks so much,
> 
> Bogdan
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] preserve class in apply function

2023-02-07 Thread PIKAL Petr
Hi Naresh

If you want to automate the function a bit, you can use sapply to find the
numeric columns:
ind <- sapply(mydf, is.numeric)

and use it in the apply construct:
apply(mydf[,ind], 1, function(row) sum(row))
 [1]  2.13002569  0.63305300  1.48420429  0.13523859  1.17515873 -0.98531131
 [7]  0.47044467  0.23914494  0.26504430  0.02037657
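
Alternatives that avoid apply()'s coercion of the whole row to character, as a 
sketch:

rowSums(mydf[ind])  # same per-row sums, stays numeric
with(mydf, x + y)   # or address the numeric columns directly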

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Naresh Gurbuxani
> Sent: Tuesday, February 7, 2023 1:52 PM
> To: r-help@r-project.org
> Subject: [R] preserve class in apply function
> 
> 
> > Consider a data.frame whose different columns have numeric, character,
> > and factor data.  In apply function, R seems to pass all elements of a
> > row as character.  Is it possible to preserve numeric class?
> >
> >> mydf <- data.frame(x = rnorm(10), y = runif(10))
> >> apply(mydf, 1, function(row) {row["x"] + row["y"]})
> > [1]  0.60150197 -0.74201827  0.80476392 -0.59729280 -0.02980335
> 0.31351909
> > [7] -0.63575990  0.22670658  0.55696314  0.39587314
> >> mydf[, "z"] <- sample(letters[1:3], 10, replace = TRUE)
> >> apply(mydf, 1, function(row) {row["x"] + row["y"]})
> > Error in row["x"] + row["y"] (from #1) : non-numeric argument to binary
> operator
> >> apply(mydf, 1, function(row) {as.numeric(row["x"]) +
> as.numeric(row["y"])})
> > [1]  0.60150194 -0.74201826  0.80476394 -0.59729282 -0.02980338
> 0.31351912
> > [7] -0.63575991  0.22670663  0.55696309  0.39587311
> >> apply(mydf[,c("x", "y")], 1, function(row) {row["x"] + row["y"]})
> > [1]  0.60150197 -0.74201827  0.80476392 -0.59729280 -0.02980335
> 0.31351909
> > [7] -0.63575990  0.22670658  0.55696314  0.39587314
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to calculate the derivatives at each data point?

2023-01-31 Thread PIKAL Petr
Hi Konstantinos

Not exactly a derivative, but
> diff(df[,2])
[1] -0.01 -0.01 -0.01 -0.01  0.00  0.01 -0.02 -0.03 -0.02

Maybe that is enough for you.
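
If an actual derivative at each altitude is wanted, a small sketch (divide by the 
altitude step, or differentiate an interpolating spline):

# slope between consecutive points (one value fewer than the data)
dy <- diff(df$atm_values) / diff(df$altitude)

# or fit an interpolating spline and evaluate its first derivative everywhere
f <- splinefun(df$altitude, df$atm_values)
f(df$altitude, deriv = 1)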

Cheers
Petr

>
> -Original Message-
> From: R-help  On Behalf Of konstantinos
> christodoulou
> Sent: Tuesday, January 31, 2023 10:16 AM
> To: r-help mailing list 
> Subject: [R] How to calculate the derivatives at each data point?
> 
> Hi everyone,
> 
> I have a vector with atmospheric measurements (x-axis) that is
> obtained/calculated at different altitudes (y-axis). The altitude is
uniformly
> distributed every 7 meters.
> For example my dataframe is:
> df <- data.frame(
> *altitude* = c(1005, 1012, 1019, 1026, 1033, 1040, 1047, 1054, 1061,
1068),
> *atm_values* = c(1.41, 1.40, 1.39, 1.38, 1.37, 1.37, 1.38, 1.36, 1.33,
1.31)
>  )
> 
> How can I find the derivatives of the atmospheric measurements at each
> altitude?
> 
> I look forward to hearing from you!
> 
> Thanks,
> Kostas
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] question

2023-01-30 Thread PIKAL Petr
Hallo Carolyn

From what you describe you cannot calculate correlations.

You stated that you have two sets of data, one for December and one for March,
that the rows in one set are not related to the rows in the other set, and that
even the persons tested in both months do not have their values on the same row.
In that case cor is not appropriate. You should first adjust your data so that
the results of those 3 persons are on the same row, but even after that only
those 3 values could be evaluated by cor.

From what you wrote I think that t.test or a similar beast is the way you
should go.

But without a sample of the data I may be wrong.
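
A sketch of the kind of reshaping meant above (the data frames and column names 
are hypothetical, since no data were posted):

# align December and March by person before correlating / testing
both <- merge(dec[, c("id", "cortisol")], mar[, c("id", "cortisol")],
              by = "id", suffixes = c("_dec", "_mar"))
cor(both$cortisol_dec, both$cortisol_mar)       # only persons measured in both months
t.test(both$cortisol_dec, both$cortisol_mar, paired = TRUE)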

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Carolyn J Miller
via
> R-help
> Sent: Monday, January 30, 2023 7:16 PM
> To: r-help@r-project.org
> Subject: [R] question
> 
> Hi guys,
> 
> I am using the cor() function to see if there are correlations between
March
> cortisol levels and December cortisol levels and I'm trying to figure out
if the
> function is doing what I want it to do.
> 
> Each sample has it's own separate row in the CSV file that I'm working out
of.
> March Cort and December Cort are different columns and they come from
> separate samples, therefore their values would not be on the same row.
There
> are only 3 individuals that have both December cort values and March
cortisol
> values but they still have different sample ID values (from different
seasons) so
> they are also not on the same row.
> 
>  I ran the function twice: once as cor(cortphcor, use = "complete.obs")
first
> 
> and then cor(cortphcor, use = "pairwise.complete.obs", method =
"pearson").
> 
> I received the same output both times. I guess what I'm asking is, is the
output
> simply the correlation just for those 3 samples or is the second pairwise.
> complete.obs version giving me the correlation for all of the cort samples
for
> March against all of the samples for December despite not being on the
same
> row? I'm trying to figure out how many sample values are contributing to
the
> correlation results I'm getting.
> 
> Thanks,
> 
> Carolyn
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] akima interp results to zero with less than 10 values

2023-01-26 Thread PIKAL Petr
Hallo Duncan

Thanks, I was not aware of this package.  I will try.
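
For the archive, the switch would presumably look something like this with the
table from my original post (untested here):

library(interp)
interp::interp(mat[,1], mat[,2], mat[,3], nx = 5, ny = 5)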

Petr

> -Original Message-
> From: Duncan Murdoch 
> Sent: Thursday, January 26, 2023 3:44 PM
> To: PIKAL Petr ; r-help@r-project.org
> Subject: Re: [R] akima interp results to zero with less than 10 values
>
> The akima package has a problematic license (it doesn't allow commercial 
> use),
> so it's been recommended that people use the interp package instead.  When I
> use interp::interp instead of akima::interp, I get reasonable output from 
> your
> example.
>
> So that's another reason to drop akima...
>
> Duncan Murdoch
>
> On 26/01/2023 9:35 a.m., PIKAL Petr wrote:
> > Dear all
> >
> > I have this table
> >> dput(mat)
> > mat <- structure(c(2, 16, 9, 2, 16, 1, 1, 4, 7, 7, 44.52, 42.8, 43.54,
> > 40.26, 40.09), dim = c(5L, 3L))
> >
> > And I want to calculate result for contour or image plots as I did few
> > years ago.
> >
> > However interp does not compute the z values and gives me zeros in z 
> > matrix.
> > library(akima)
> >
> >> interp(mat[,1], mat[,2], mat[, 3], nx=5, ny=5)
> > $x
> > [1]  2.0  5.5  9.0 12.5 16.0
> >
> > $y
> > [1] 1.0 2.5 4.0 5.5 7.0
> >
> > $z
> >   [,1] [,2] [,3] [,4] [,5]
> > [1,]    0    0    0    0    0
> > [2,]    0    0    0    0    0
> > [3,]    0    0    0    0    0
> > [4,]    0    0    0    0    0
> > [5,]    0    0    0    0    0
> >
> > With the example from help page if less than 10 values are used, the
> > result is also zero interp(akima$x[1:9], akima$y[1:9], akima$z[1:9],
> > nx=5, ny=5)
> >
> > but with 10 or more values the result is correctly calculated
> > interp(akima$x[1:10], akima$y[1:10], akima$z[1:10], nx=5, ny=5)
> > $x
> > [1]  0.0000  6.1625 12.3250 18.4875 24.6500
> >
> > $y
> > [1]  1.24  5.93 10.62 15.31 20.00
> >
> > $z
> >   [,1] [,2] [,3] [,4] [,5]
> > [1,]   NA   NA   NA   NA 34.6
> > [2,]   NA   NA 27.29139 27.11807 26.60971
> > [3,]   NA 19.81371 19.63614 19.12778 18.61943
> > [4,]   NA 14.01443 10.66531 11.13750 10.62914
> > [5,]   NA   NA   NA   NA   NA
> >
> > Help page says
> > x, y, and z must be the same length (except if x is a
> > SpatialPointsDataFrame) and may contain no fewer than ***four*** points.
> >
> > So my understanding was that 5 points could be used, but I am obviously
> > wrong. Is it a bug in interp or in the documentation, or is it my poor
> > understanding of the whole matter?
> >
> > Best regards
> > Petr
> >
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] akima interp results to zero with less than 10 values

2023-01-26 Thread PIKAL Petr
Dear all

I have this table 
> dput(mat)
mat <- structure(c(2, 16, 9, 2, 16, 1, 1, 4, 7, 7, 44.52, 42.8, 43.54, 
40.26, 40.09), dim = c(5L, 3L))

And I want to calculate result for contour or image plots as I did few years
ago.

However interp does not compute the z values and gives me zeros in z matrix.
library(akima)

> interp(mat[,1], mat[,2], mat[, 3], nx=5, ny=5)
$x
[1]  2.0  5.5  9.0 12.5 16.0

$y
[1] 1.0 2.5 4.0 5.5 7.0

$z
 [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0    0    0
[5,]    0    0    0    0    0

With the example from help page if less than 10 values are used, the result
is also zero
interp(akima$x[1:9], akima$y[1:9], akima$z[1:9], nx=5, ny=5)

but with 10 or more values the result is correctly calculated
interp(akima$x[1:10], akima$y[1:10], akima$z[1:10], nx=5, ny=5)
$x
[1]  0.0000  6.1625 12.3250 18.4875 24.6500

$y
[1]  1.24  5.93 10.62 15.31 20.00

$z
 [,1] [,2] [,3] [,4] [,5]
[1,]   NA   NA   NA   NA 34.6
[2,]   NA   NA 27.29139 27.11807 26.60971
[3,]   NA 19.81371 19.63614 19.12778 18.61943
[4,]   NA 14.01443 10.66531 11.13750 10.62914
[5,]   NA   NA   NA   NA   NA

Help page says
x, y, and z must be the same length (except if x is a
SpatialPointsDataFrame) and may contain no fewer than ***four*** points.

So my understanding was that 5 points could be used, but I am obviously wrong.
Is it a bug in interp or in the documentation, or is it my poor understanding
of the whole matter?

Best regards
Petr

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package FactoMineR

2023-01-24 Thread PIKAL Petr
Hallo Sacha

AFAIK the functions in FactoMineR do not let you manipulate the label size.
The plot is produced by this part:

if (graph & (ncp > 1)) {
    print(plot(res, axes = axes))
    if (!is.null(quanti.sup))
        print(plot(res, choix = "quanti.sup", axes = axes,
                   new.plot = TRUE))

and the function is not designed to accept additional parameters for
manipulating the size of the labels.

I struggled with this a few years ago and decided to use biplot instead, as it
offers more options.

You can contact the maintainers and ask whether they would consider an
improvement in a future version. Probably a simple ... in the function
definition plus a print(plot(res, axes = axes, ...)) could do the trick, but I
am not sure.
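
If your version of FactoMineR accepts a cex argument in its plot method (check
?plot.CA), something like this untested sketch might already shrink the labels:

library(FactoMineR)
data(children)
res.ca <- CA(children, row.sup = 15:18, col.sup = 6:8, graph = FALSE)
plot(res.ca, cex = 0.7)   # smaller labels, if cex is supported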

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of varin sacha via R-
> help
> Sent: Monday, January 23, 2023 7:38 PM
> To: r-help@r-project.org
> Subject: [R] package FactoMineR
>
> Dear R-experts,
>
> Here below the R code working (page 8 http://www2.uaem.mx/r-
> mirror/web/packages/FactoMineR/FactoMineR.pdf).
>
> But I am trying to get all the labels (the writes) : comfort, university, 
> economic,
> world, ... smaller. How could I do that ?
>
> Many thanks.
>
> library(FactoMineR)
> data(children)
> res.ca <- CA (children, row.sup = 15:18, col.sup = 6:8)
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Printing special characters

2023-01-16 Thread PIKAL Petr
Hallo Dennis

Does the STRING in R still contain the ≥ character?
Or was it converted to something else when the file was read into R?

What does dput(STRING) return?
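
If the character was mangled on the way in, rebuilding the string with a
Unicode escape is one possible workaround (untested sketch; \u2265 is the >=
sign):

STRING <- "EVENT \u2265 30 sec"
plot(1:10)
mtext(STRING)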
 
Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Dennis Fisher
> Sent: Monday, January 16, 2023 9:19 AM
> To: r-help@r-project.org
> Subject: [R] Printing special characters
> 
> R 4.2.2
> OS X
> 
> Colleagues
> 
> A file that I have read includes strings like this:
>   "EVENT ≥ 30 sec"
> When I include the string in a graphic using:
>   mtext(STRING, …)
> it appears as:
>   "EVENT ... 30 sec"
> 
> Is there a simple work-around (short of reformatting all the strings, then 
> using
> plotmath)?
> 
> Dennis
> 
> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone / Fax: 1-866-PLessThan (1-866-753-7784) www.PLessThan.com
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reg: ggplot error

2023-01-11 Thread PIKAL Petr
Hallo

I am not familiar with any of the packages you use (except for MASS and
ggplot2), and the code is too complicated without any hint of where the error
could come from or what message you get. I wonder whether anybody would be
willing to go through your whole code.

1. data seems to be read correctly
ICUData <- read.csv(file = "ICUData.csv", stringsAsFactors = TRUE)
ICUData.neuro <- ICUData[ICUData$surgery == "neuro",]

2. ggplot(ICUData, aes(x=ICUData.neuro$LOS)) +
  geom_histogram(aes(y=after_stat(density)), binwidth = 5, 
 fill = "darkgrey")

gives me error

Error in `check_aesthetics()`:
! Aesthetics must be either length 1 or the same as the data (500): x
Run `rlang::last_error()` to see where the error occurred.

which, I believe, results from using the whole ICUData data frame in ggplot()
while using ICUData.neuro inside aes(). Compare:

ggplot(ICUData.neuro, aes(x=ICUData.neuro$LOS)) +
  geom_histogram(aes(y=after_stat(density)), binwidth = 5,
 fill = "darkgrey")

with

ggplot(ICUData, aes(x=ICUData.neuro$LOS)) +
  geom_histogram(aes(y=after_stat(density)), binwidth = 5,
 fill = "darkgrey")

Cheers
Petr

And I do not provide private consulting so keep your posts to R help. Others 
may have much more insightful answers.


From: Upananda Pani  
Sent: Wednesday, January 11, 2023 2:43 PM
To: PIKAL Petr 
Subject: Re: [R] Reg: ggplot error

Hi Respected Member,

Please find attached.

Regards,
Upananda Pani

On Wed, Jan 11, 2023 at 6:07 PM PIKAL Petr <petr.pi...@precheza.cz> wrote:
Hi

Attachments are mostly removed from emails so they probably will not reach
r-help.

You said you get an error, which is the first place you should look at. It
can navigate you to the source of the error if you read it carefully.

Anyway, if your code is complicated it is difficult to understand and
decipher. So think about simplification and maybe you will find the error
source yourself during this simplification.

Cheers
Petr

> -Original Message-
> From: R-help <r-help-boun...@r-project.org> On Behalf Of Upananda Pani
> Sent: Wednesday, January 11, 2023 1:06 PM
> To: Eric Berger <ericjber...@gmail.com>
> Cc: r-help <r-help@r-project.org>
> Subject: Re: [R] Reg: ggplot error
> 
> I am sorry.
> 
> On Wed, Jan 11, 2023 at 5:32 PM Eric Berger <ericjber...@gmail.com>
> wrote:
> 
> > No code or data came through.
> > Please read the posting guidelines.
> >
> >
> > On Wed, Jan 11, 2023 at 1:38 PM Upananda Pani
> > <upananda.p...@gmail.com>
> > wrote:
> > >
> > > Dear All,
> > >
> > > I am using roptest  function of package "ROptEst" (Kohl and
> > > Ruckdeschel
> > > (2019)) to find out the ML, CvM-MD, and the RMX estimator and their
> > > asymptotic confidence intervals. I am assuming 1-5% of erroneous
> > > data for the RMX estimator.
> > >
> > > Then I am trying to Plot the data in the form of a histogram and add
> > > the three Gamma distribution densities with the estimated parameters
> > > and validate the three models additionally with pp- and qq-plots.
> > >
> > > I have tried to code it. I have attached the code and data. I am
> > > getting error while fitting ggplot to plot the distribution densities.
> > >
> > > I am doing some error which I am not able to correct. Please help me
> > > to find out my error.
> > >
> > > With sincere regards,
> > > Upananda Pani
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reg: ggplot error

2023-01-11 Thread PIKAL Petr
Hi

Attachments are mostly removed from emails so they probably will not reach
r-help.

You said you get an error, which is the first place you should look at. It
can navigate you to the source of the error if you read it carefully.

Anyway, if your code is complicated it is difficult to understand and
decipher. So think about simplification and maybe you will find the error
source yourself during this simplification.

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Upananda Pani
> Sent: Wednesday, January 11, 2023 1:06 PM
> To: Eric Berger 
> Cc: r-help 
> Subject: Re: [R] Reg: ggplot error
> 
> I am sorry.
> 
> On Wed, Jan 11, 2023 at 5:32 PM Eric Berger  wrote:
> 
> > No code or data came through.
> > Please read the posting guidelines.
> >
> >
> > On Wed, Jan 11, 2023 at 1:38 PM Upananda Pani
> > 
> > wrote:
> > >
> > > Dear All,
> > >
> > > I am using roptest  function of package "ROptEst" (Kohl and
> > > Ruckdeschel
> > > (2019)) to find out the ML, CvM-MD, and the RMX estimator and their
> > > asymptotic confidence intervals. I am assuming 1-5% of erroneous
> > > data for the RMX estimator.
> > >
> > > Then I am trying to Plot the data in the form of a histogram and add
> > > the three Gamma distribution densities with the estimated parameters
> > > and validate the three models additionally with pp- and qq-plots.
> > >
> > > I have tried to code it. I have attached the code and data. I am
> > > getting error while fitting ggplot to plot the distribution densities.
> > >
> > > I am doing some error which I am not able to correct. Please help me
> > > to find out my error.
> > >
> > > With sincere regards,
> > > Upananda Pani
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Date order question

2023-01-04 Thread PIKAL Petr
Hallo Thomas

Similarly to what Rui suggested, you should change your date to a real date, e.g. by

library(lubridate)
date <- paste(date, c(rep(2022,2), 2023), sep="-")
date <- mdy(date)

and you also need to change the x coordinate in annotate().

ggplot(data, aes(x=date, y=PT, group=1)) +
  geom_point(size=4) +
  geom_line() +
  geom_hline(yintercept = c(1, .60, 0, .30, .25, .2)) +
  scale_y_continuous(label = scales::label_percent(),
                     breaks = c(1, 0.6, 0, .3, 0.25, 0.2)) +
  annotate("text", x=date[2], y=.1,   label="Very Good",   size=5, fontface="bold") +
  annotate("text", x=date[2], y=.225, label="Good",        size=5, fontface="bold") +
  annotate("text", x=date[2], y=.28,  label="Marginal",    size=5, fontface="bold") +
  annotate("text", x=date[2], y=.45,  label="Inadequate",  size=6, fontface="bold") +
  annotate("text", x=date[2], y=.8,   label="OOC",         size=6, fontface="bold") +
  annotate("text", x=date[2], y=-.05, label="PT Not Done", size=5, fontface="bold")

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Thomas Subia
> Sent: Wednesday, January 4, 2023 10:08 PM
> To: r-help@r-project.org
> Subject: [R] Date order question
> 
> Colleagues,
> 
> date<-c("12-29","12-30","01-01")
> PT <- c(.106,.130,.121)
> data <- data.frame(date,PT)
> ggplot(data, aes(x=date,y=PT,group=1))+
>   geom_point(size=4)+
>   geom_line()+
>   geom_hline(yintercept =c(1,.60,0,.30,.25,.2))+
> 
>   scale_y_continuous(label=scales::label_percent(), breaks=c(1,0.6,0,.3,0.25,0.2)) +
>   annotate("text", x=2.5, y=.1, label="Very Good",size=5,fontface="bold")+
>   annotate("text", x=2.5, y=.225, label="Good",size=5,fontface="bold")+
>   annotate("text", x=2.5, y=.28, label="Marginal",size=5,fontface="bold") +
>   annotate("text", x=2.5, y=.45, label="Inadequate",size=6,fontface="bold")+
>   annotate("text", x=2.5, y=.8, label="OOC",size=6,fontface="bold")+
>   annotate("text", x=2.5, y=-.05, label="PT Not Done",size=5,fontface="bold")+
>   theme_cowplot()
> 
> The plot has the wrong date order.
> What is desired is 12-29, 12-30 and 01-01.
> 
> Some feedback would be appreciated.
> 
> All the best,
> Thomas Subia
> 
> "De quoi devenir chevre? Des donnees"
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R Certification

2023-01-02 Thread PIKAL Petr
Hallo Mukesh

The R project is not Microsoft or Oracle, AFAIK. But if you need a certificate
you could take courses on Coursera; they offer certificates.

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Mukesh
> Ghanshyamdas Lekhrajani via R-help
> Sent: Monday, January 2, 2023 1:04 PM
> To: 'Jeff Newmiller' ; 'Mukesh Ghanshyamdas
> Lekhrajani via R-help' ; r-help@r-project.org
> Subject: Re: [R] R Certification
> 
> Hello Jeff !
> 
> Yes, you are right.. and that’s why I am asking this question - just like 
> other
> governing bodies that issue certification on their respective technologies, 
> does
> "r-project.org" also have a learning path ? and then a certification.
> 
> Say - Microsoft issues certificate for C#, .Net, etc..
> Then, Oracle issues certificates for Java, DB etc..
> 
> These are authentic governing bodies for learning and issuing certificates
> 
> On exactly similar lines -  "r-project.org" would also be having some learning
> path and then let "r-project" take the proctored exam and issue a 
> certificate...
> 
> I am not looking at any external institute for certifying me on "R" - but, the
> governing body itself..
> 
> So, the question again is - "does r-project provide a learning path and issue
> certificate after taking exams"
> 
> Thanks, Mukesh
> 9819285174
> 
> 
> 
> -Original Message-
> From: Jeff Newmiller 
> Sent: Monday, January 2, 2023 2:26 PM
> To: mukesh.lekhraj...@yahoo.com; Mukesh Ghanshyamdas Lekhrajani via R-
> help ; r-help@r-project.org
> Subject: Re: [R] R Certification
> 
> I think this request is like saying "I want a unicorn." There are many
> organizations that will enter your name into a certificate form for a fee, 
> possibly
> with some credibility... but if they put "r-project.org" down as the name of 
> the
> organization granting this "certificate" then you are probably getting fooled.
> 
> On December 30, 2022 8:33:09 AM PST, Mukesh Ghanshyamdas Lekhrajani via
> R-help  wrote:
> >Hello R Support Team,
> >
> >
> >
> >I want to do R certification, could you help me with the list of
> >certificates with their prices so it helps me to register.
> >
> >
> >
> >I want to do the certification directly from the governing body
> >"r-project.org" and not from any 3rd party.
> >
> >
> >
> >Please help.
> >
> >
> >
> >
> >
> >
> >
> >Mukesh
> >
> >+91 9819285174
> >
> >
> > [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> 
> --
> Sent from my phone. Please excuse my brevity.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] if documentation

2022-12-07 Thread PIKAL Petr
Hallo Martin

I understand. Anyway, it would not do any harm to add something like

x <- 11
if(x < 0) {print("weird")} else if(x >= 0 & x <= 9) {
print("too low")} else if (x > 9 & x <= 10) {
print("just enough")} else {
print("more than expected")}

to the help page for if.

Regards
Petr

> -Original Message-
> From: Martin Maechler 
> Sent: Wednesday, December 7, 2022 12:16 PM
> To: PIKAL Petr 
> Cc: R-help Mailing List 
> Subject: Re: [R] if documentation
> 
> >>>>> PIKAL Petr
> >>>>> on Wed, 7 Dec 2022 07:04:38 + writes:
> 
> > Hallo all Not sure if it is appropriate place but as I am
> > not involved in r-devel list I post here.
> 
> 
> 
> > Documentation for Control (if, for, while, .) is missing
> > "if else" command.  Although it can be find online
> > elsewhere I believe that adding it either as an example or
> > as a third entry and paragraph about nested if's could be
> > beneficial.
> 
> 
> 
> > if(cond) cons.expr else if (cond) alt.expr else alt2.expr
> 
> > Nested if expressions are better realized with "else if"
> > instead of sequence of plain "else" control statements
> > especially when using several of them.
> 
> I agree.  However there is no "else if" special.
> 
> As everything that *happens* in R is a function call (John Chambers),
indeed, `if`
> is a function too, with an unusual syntax which may involve `else`.
> 
> If you look more closely, `if` is a function with 3 arguments, where the
last (3rd)
> is optional and has a default of NULL :
> 
> Some code towards proving the above :
> 
> 
> (t1 <- if(TRUE)  1:3 ) # is identical to  if(TRUE) 1:3 else NULL
> f1  <- if(FALSE) 1:3   # is identical to
> f2  <- if(FALSE) 1:3 else NULL
> identical(f1,f2)
> f3  <- if(FALSE) 1:3 else 111
> 
> `if`(TRUE, 1:3)
> `if`(TRUE, 1:3, NULL)
> 
> `if`(FALSE, 1:3)   # returns invisibly
> `if`(FALSE, 1:3, NULL)
> `if`(FALSE, 1:3, 111)
> 
> --
> 
> So, 'if(.) else'  or 'if(...) else if(..) ..'
> etc are
> all just versions of calling the `if` function sometimes, in a nested way.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] if documentation

2022-12-06 Thread PIKAL Petr
Hallo all

 

Not sure if this is the appropriate place, but as I am not subscribed to the
r-devel list I post here.

 

The documentation for Control (if, for, while, ...) is missing the "if else"
construct. Although it can be found online elsewhere, I believe that adding
it, either as an example or as a third entry with a paragraph about nested
ifs, could be beneficial.

 

if(cond) cons.expr  else if (cond) alt.expr else alt2.expr

 

Nested if expressions are better realized with "else if" instead of a sequence
of plain "else" control statements, especially when using several of them.

Best regards

Petr

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data frame returned from sapply but vector expected

2022-11-04 Thread PIKAL Petr
Hallo Ivan

Thanks, yes it seems to be working. I also thought of removing the NULLs by

mylist2[sapply(mylist2, is.null)] <- NULL

but your approach is probably better (in any case simpler)
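
For the record, combining the two ideas would look roughly like this (untested):

mylist3 <- Filter(Negate(is.null), mylist2)   # drop the NULL leaves first
boxplot(lapply(mylist3, `[[`, "b"))           # lapply avoids unwanted simplification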

Thanks again.

Petr

> -Original Message-
> From: Ivan Krylov 
> Sent: Friday, November 4, 2022 1:37 PM
> To: PIKAL Petr 
> Cc: R-help Mailing List 
> Subject: Re: [R] data frame returned from sapply but vector expected
> 
> On Fri, 4 Nov 2022 15:30:27 +0300
> Ivan Krylov  wrote:
> 
> > sapply(mylist2, `[[`, 'b')
> 
> Wait, that would simplify the return value into a matrix when there are no
> NULLs. But lapply(mylist2, `[[`, 'b') should work in both cases, which in
my
> opinion goes to show the dangers of using simplifying functions in
to-be-library
> code.
> 
> Sorry for the double-post!
> 
> --
> Best regards,
> Ivan
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] data frame returned from sapply but vector expected

2022-11-04 Thread PIKAL Petr
Hallo all

 

I found a strange problem when coding for the case where part of a list is NULL.

In this case, the sapply result is a ***list of data frames***, but if there
is no NULL leaf, the result is a ***list of vectors***.

I tried the simplify option but it did not help, nor did I find anything in
the help page.

 

The code is part of a bigger project where I fill a list by reading in data
and, if that fails, the leaf is set to NULL. Then the boxplot is created
simply by

boxplot(sapply(mylist2, "[", "b")) and the user is asked to choose whether the
values should be rbind-ed or not.

 

Is it possible to perform some *apply without getting data frames as the
result when there is a NULL leaf?

 

Here is an example (without boxplot)

 

df1 <- data.frame(a=rnorm(5), b=runif(5), c=rlnorm(5))

df2 <- data.frame(a=rnorm(5), b=runif(5), c=rlnorm(5))

df3 <- data.frame(a=rnorm(5), b=runif(5), c=rlnorm(5))

mylist1 <- list(df1, df2, df3)

mylist2 <- list(NULL,df2, df3)

> str(sapply(mylist1, "[", "b"))

List of 3

$ b: num [1:5] 0.387 0.69 0.876 0.836 0.819

$ b: num [1:5] 0.01733 0.46055 0.19421 0.11609 0.00789

$ b: num [1:5] 0.593 0.478 0.299 0.185 0.847

> str(sapply(mylist2, "[", "b"))

List of 3

$ : NULL

$ :'data.frame':   5 obs. of  1 variable:

  ..$ b: num [1:5] 0.01733 0.46055 0.19421 0.11609 0.00789

$ :'data.frame':   5 obs. of  1 variable:

  ..$ b: num [1:5] 0.593 0.478 0.299 0.185 0.847

 

S pozdravem | Best Regards

RNDr. Petr PIKAL
Vedoucí Výzkumu a vývoje | Research Manager


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Associate a .R file with the RGui

2022-11-04 Thread PIKAL Petr
Hi 

another option is to use the SendTo folder. This was easy to find in older
Windows versions. Now the easiest way is:

Press the Windows Key + R to trigger the Run window. At the Open field in
the window, type shell:SendTo and then click OK

And the SendTo folder should open.

You can then add a link to any application here, so add an RGui link.

After that, right-click on a .R (or .RData) file and a menu with the Send To
option opens, with RGui as one of the choices.

I learned this ages ago and it is quite useful, not only in this case.

Here is a link to another explanation:

https://www.pcmag.com/how-to/how-to-customize-the-send-to-menu-in-windows

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Andrew Simmons
> Sent: Friday, November 4, 2022 10:09 AM
> To: Amarjit Chandhial 
> Cc: R-help Mailing List 
> Subject: Re: [R] Associate a .R file with the RGui
> 
> In an R session, run this:
> 
> writeLines(normalizePath(R.home("bin")))
> 
> Right click your .R file > Open with > Choose another app > Check the box
> "Always use this app to open .R files" > Look for another app on this PC
Paste
> the directory found above, then select "Rgui.exe"
> 
> On Fri, Nov 4, 2022, 04:49 Amarjit Chandhial via R-help < r-help@r-
> project.org> wrote:
> 
> >
> > Hi,
> >
> >
> > My OS is Windows 11 Pro 64-Bit, I have R 4.2.2 and RStudio installed.
> >
> > If I double-click on a .R file in File Explorer the OS gives me the
> > option of opening the .R in RStudio, or Look for an app in the
> > Microsoft Store, or More Apps. Similarly with a right-click.
> >
> > I would like to associate a .R file with the RGui, not RStudio, thus
> > when I double-click on a .R file in File Explorer the .R file opens in
> > the R Editor in RGui.
> >
> > On my PC R 4.2.2 is located in "C:/Program Files/R/R-4.2.2/etc"
> >
> > Please can someone provide step-by-step instructions on how to
> > associate?
> >
> >
> > thanks,
> > Amarjit
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ids with matching number combinations?

2022-10-07 Thread PIKAL Petr
Hallo Marine

Could you please make your example more reproducible by using set.seed (and
maybe make it smaller)?

If I understand correctly, you want to know whether, let's say, the row 1
items from df2 (8, 16) are both in the item column of a specific id?

If my guess is correct, I cannot find another solution than to split your df
according to id
x <- split(df, df$id)[[1]]

and for each row of df2 test if within the specified id you can find both
numbers.
sum(is.element(df2[1,], x$item))==2
[1] FALSE

So basically 2 cycles, one for df ids and the other for df2 rows.
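
Spelled out, the two loops could look roughly like this (untested sketch;
adjust the column names to your data -- I use "item" as above, your posted df
calls it "item1", and a/b are the df2 columns from your example):

sp  <- split(df, df$id)
out <- NULL
for (i in names(sp)) {
  for (r in seq_len(nrow(df2))) {
    if (all(unlist(df2[r, c("a", "b")]) %in% sp[[i]]$item))
      out <- rbind(out, data.frame(Id = i, Rownr = r))
  }
}
out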

But maybe somebody will give you a more ingenious answer.

Cheers
Petr


> -Original Message-
> From: R-help  On Behalf Of Marine Andersson
> Sent: Friday, October 7, 2022 1:58 PM
> To: r-help@r-project.org
> Subject: [R] Ids with matching number combinations?
> 
> Hi,
> 
> If I have two datasets like this:
> df=data.frame("id"=rep(1:10,10, each=10), "item1"=sample(1:20, 100,
> replace=T)
> df2=data.frame("a"=c(8, 8,10,9, 5, 1,2,1), "b"=c(16,18,11, 19,18,
11,17,12))
> 
> How do I find out which ids in the df dataset that has a match for both
the
> numbers occuring in the same row in the df2 dataframe? In the output I
would
> like to get the matching id and the rownumber from the df2.
> 
> Output something like this
> IdRownr
> 2 1
> 5 1
> 7 4
> 
> My actual problem is more complex with even more columns to be matched and
> the datasets are large, hence the solution needs to be efficient.
> 
> Kind regards,
> 
> 
> 
> 
> 
> N?r du skickar e-post till Karolinska Institutet (KI) inneb?r detta att KI
kommer
> att behandla dina personuppgifter. H?r finns information om hur KI
behandlar
> personuppgifter.
> 
> 
> Sending email to Karolinska Institutet (KI) will result in KI processing
your
> personal data. You can read more about KI's processing of personal data
> here.
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R version 4.2.1 install.packages does not work with IE proxy setting

2022-10-06 Thread PIKAL Petr
Hallo Howard

Thanks, for the time being the previous workaround with wininet options
works for me. I will wait for the new R version and then try to persuade our
IT to make relevant changes.
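
For reference, the change Tim describes amounts to defining the environment
variable before R starts, e.g. one line in .Renviron or in the system
environment (sketch):

R_LIBCURL_SSL_REVOKE_BEST_EFFORT=TRUE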

Cheers
Petr

> -Original Message-
> From: Howard, Tim G (DEC) 
> Sent: Thursday, October 6, 2022 1:09 PM
> To: r-help@r-project.org; PIKAL Petr 
> Subject: Re: [R]  R version 4.2.1 install.packages does not work with IE
proxy
> setting
> 
> Petr,
> You also might want to check out the bug reported here:
> 
> https://bugs.r-project.org/show_bug.cgi?id=18379
> 
> it was fixed and the last two comments discuss how to handle it in
Windows:
> 
> you add a new User Environment Variable:
> R_LIBCURL_SSL_REVOKE_BEST_EFFORT and set it to TRUE
> 
> This fix is in R-4.2.1 Patched (I don't know if it has made it out to the
full
> distribution) and works in my 'corporate' environment.  Perhaps it also
applies
> to your environment.
> 
> Tim
> 
> 
> Date: Wed, 5 Oct 2022 10:34:02 +
> From: PIKAL Petr 
> To: Ivan Krylov 
> Cc: r-help mailing list 
> Subject: Re: [R]  R version 4.2.1 install.packages does not work with
>     IE proxy setting
> Message-ID:
>     <9b38aacf51d746bb87a9cc3765a16...@srvexchcm1302.precheza.cz>
> Content-Type: text/plain; charset="us-ascii"
> 
> Thanks,
> 
> the workaround works but we need try the "permanent" solution with
> Renviron.site file in future.
> 
> Cheers
> Petr
> 
> > -Original Message-
> > From: Ivan Krylov 
> > Sent: Tuesday, October 4, 2022 5:43 PM
> > To: PIKAL Petr 
> > Cc: r-help mailing list 
> > Subject: Re: [R] R version 4.2.1 install.packages does not work with
> > IE
> proxy
> > setting
> >
> > On Tue, 4 Oct 2022 11:01:14 +
> > PIKAL Petr  wrote:
> >
> > > After we installed new R version R 4.2.1 installing packages through
> > > IE proxy setting is compromised with warning that R could not
> > > connect to server (tested in vanilla R).
> >
> > R 4.1 deprecated the use of download.file(method = 'wininet'). R 4.2
> switched
> > the default download method to 'libcurl' and started giving warnings
> > for 'wininet' and http[s]:// URLs.
> >
> > A workaround to get it working right now would be to set
> > options(download.file.method = 'wininet') and live with the resulting
> warnings
> > while R downloads the files, but a next version of R may remove
'wininet'
> > support altogether.
> >
> > In order to get it working with the 'libcurl' method, you'll need to
> provide some
> > environment variables to curl:
> > https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat
> > .ethz.ch%2Fpipermail%2Fr-help%2F2022-
> September%2F475917.htmldata=
> >
> 05%7C01%7Ctim.howard%40dec.ny.gov%7C1b8601d6e602486b347508daa7
> 81d26e%7
> >
> Cf46cb8ea79004d108ceb80e8c1c81ee7%7C0%7C0%7C63800647328501003
> 4%7CUnkno
> >
> wn%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> WwiL
> >
> CJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=7w1P6Sfu7V3DUM4a3Ez
> gmf87bXn8Cn
> > C%2FipTvXfBpI0c%3Dreserved=0
> >
> > Not sure if libcurl would accept a patch to discover the Windows proxy
> settings
> > automatically, but I don't think it does that now.
> >
> > --
> > Best regards,
> > Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R version 4.2.1 install.packages does not work with IE proxy setting

2022-10-05 Thread PIKAL Petr
Thanks, 

the workaround works, but we will need to try the "permanent" solution with an
Renviron.site file in the future.
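
Something like the following lines in <R_HOME>/etc/Renviron.site is what I have
in mind (sketch only; the proxy address is a placeholder for our real one):

http_proxy=http://our.proxy.example:8080
https_proxy=http://our.proxy.example:8080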

Cheers
Petr

> -Original Message-
> From: Ivan Krylov 
> Sent: Tuesday, October 4, 2022 5:43 PM
> To: PIKAL Petr 
> Cc: r-help mailing list 
> Subject: Re: [R] R version 4.2.1 install.packages does not work with IE
proxy
> setting
> 
> On Tue, 4 Oct 2022 11:01:14 +
> PIKAL Petr  wrote:
> 
> > After we installed new R version R 4.2.1 installing packages through
> > IE proxy setting is compromised with warning that R could not connect
> > to server (tested in vanilla R).
> 
> R 4.1 deprecated the use of download.file(method = 'wininet'). R 4.2
switched
> the default download method to 'libcurl' and started giving warnings for
> 'wininet' and http[s]:// URLs.
> 
> A workaround to get it working right now would be to set
> options(download.file.method = 'wininet') and live with the resulting
warnings
> while R downloads the files, but a next version of R may remove 'wininet'
> support altogether.
> 
> In order to get it working with the 'libcurl' method, you'll need to
provide some
> environment variables to curl:
> https://stat.ethz.ch/pipermail/r-help/2022-September/475917.html
> 
> Not sure if libcurl would accept a patch to discover the Windows proxy
settings
> automatically, but I don't think it does that now.
> 
> --
> Best regards,
> Ivan
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R version 4.2.1 install.packages does not work with IE proxy setting

2022-10-04 Thread PIKAL Petr
Dear all

After we installed the new R version 4.2.1, installing packages through the IE
proxy setting is broken, with a warning that R could not connect to the server
(tested in vanilla R).

> chooseCRANmirror()
Warning: failed to download mirrors file (cannot open URL
'https://cran.r-project.org/CRAN_mirrors.csv'); using local file
'D:/programy/R/doc/CRAN_mirrors.csv'
Warning message:
In download.file(url, destfile = f, quiet = TRUE) :
  URL 'https://cran.r-project.org/CRAN_mirrors.csv': status was 'Couldn't
connect to server'

So install.packages("whatever") also ends with similar warnings.

When used on a direct connection (no proxy needed), install.packages works as
expected.

In R 4.1.0 it works as expected, without any warnings, and packages can be
installed smoothly.

Is it a known issue with this version?
Could it be due to a somehow corrupted installation?
Should we set something differently for this new R version?

Best regards
Petr
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combine two dataframe with different row number and interpolation between values

2022-08-31 Thread PIKAL Petr
Hallo

And missing-value interpolation is a rather tricky business, depending on what
the underlying process is.

Maybe na.locf from zoo package?

Or approxfun?, splinefun?
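
A rough sketch of the approx() route, building on the merge idea from my other
reply (untested; assumes df1 and df2 as defined in your post):

grid <- df2[, c("y", "d", "h")]              # hourly grid without the "NA" ws column
df3  <- merge(grid, df1, all.x = TRUE)       # ws present only every 3 hours
df3  <- df3[order(df3$y, df3$d, df3$h), ]
idx  <- which(!is.na(df3$ws))
df3$ws <- approx(idx, df3$ws[idx], xout = seq_len(nrow(df3)), rule = 2)$y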

Cheers
Petr


> -Original Message-
> From: R-help  On Behalf Of javad bayat
> Sent: Wednesday, August 31, 2022 8:09 AM
> To: r-help@r-project.org
> Subject: [R] Combine two dataframe with different row number and
> interpolation between values
> 
>  Dear all,
> I am trying to combine two large dataframe in order to make a dataframe
with
> exactly the dimension of the second dataframe.
> The first df is as follows:
> 
> df1 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 2920), d =
> rep(c(1:365,1:365,1:365,1:365,1:365),each=8),
>   h = rep(c(seq(3,24, by = 3),seq(3,24, by = 3),seq(3,24, by =
3),seq(3,24, by =
> 3),seq(3,24, by = 3)),365),
>   ws = rnorm(1:14600, mean=20))
> > head(df1)
>  y   d   hws
> 1  2010  1  3 20.71488
> 2  2010  1  6 19.70125
> 3  2010  1  9 21.00180
> 4  2010  1 12 20.29236
> 5  2010  1 15 20.12317
> 6  2010  1 18 19.47782
> 
> The data in the "ws" column were measured with 3 hours frequency and I
need
> data with one hour frequency. I have made a second df as follows with one
hour
> frequency for the "ws" column.
> 
> df2 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 8760), d =
> rep(c(1:365,1:365,1:365,1:365,1:365),each=24),
>   h = rep(c(1:24,1:24,1:24,1:24,1:24),365), ws = "NA")
> > head(df2)
>   y  dh   ws
> 1  2010  11   NA
> 2  2010  12   NA
> 3  2010  13   NA
> 4  2010  14   NA
> 5  2010  15   NA
> 6  2010  16   NA
> 
> What I am trying to do is combine these two dataframes so as to the rows
in
> df1 (based on the values of "y", "d", "h" columns) that have values
exactly
> similar to df2's rows copied in its place in the new df (df3).
> For example, in the first dataframe the first row was measured at 3
o'clock on
> the first day of 2010 and this row must be placed on the third row of the
second
> dataframe which has a similar value (2010, 1, 3). Like the below
> table:
>   y  dh   ws
> 1  2010  11   NA
> 2  2010  12   NA
> 3  2010  13   20.71488
> 4  2010  14   NA
> 5  2010  15   NA
> 6  2010  16   19.70125
> 
> But regarding the values of the "ws" column for df2 that do not have value
(at 4
> and 5 o'clock), I need to interpolate between the before and after values
to fill in
> the missing data of the "ws".
> I have tried the following codes but they did not work correctly.
> 
> > df3 = merge(df1, df2, by = "y")
> Error: cannot allocate vector of size 487.9 Mb or
> > library(dplyr)
> > df3<- df1%>% full_join(df2)
> 
> 
> Is there any way to do this?
> Sincerely
> 
> 
> 
> 
> 
> --
> Best Regards
> Javad Bayat
> M.Sc. Environment Engineering
> Alternative Mail: bayat...@yahoo.com
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combine two dataframe with different row number and interpolation between values

2022-08-31 Thread PIKAL Petr
Hallo

Merge

df3 <- merge(df1, df2, all=T)
> head(df3)
 y d h   ws
1 2010 1 1   NA
2 2010 1 2   NA
3 2010 1 3 20.4631367884005
4 2010 1 3   NA
5 2010 1 4   NA
6 2010 1 5   NA
Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of javad bayat
> Sent: Wednesday, August 31, 2022 8:09 AM
> To: r-help@r-project.org
> Subject: [R] Combine two dataframe with different row number and
> interpolation between values
> 
>  Dear all,
> I am trying to combine two large dataframe in order to make a dataframe
> with exactly the dimension of the second dataframe.
> The first df is as follows:
> 
> df1 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 2920), d =
> rep(c(1:365,1:365,1:365,1:365,1:365),each=8),
>   h = rep(c(seq(3,24, by = 3),seq(3,24, by = 3),seq(3,24, by =
> 3),seq(3,24, by = 3),seq(3,24, by = 3)),365),
>   ws = rnorm(1:14600, mean=20))
> > head(df1)
>  y   d   hws
> 1  2010  1  3 20.71488
> 2  2010  1  6 19.70125
> 3  2010  1  9 21.00180
> 4  2010  1 12 20.29236
> 5  2010  1 15 20.12317
> 6  2010  1 18 19.47782
> 
> The data in the "ws" column were measured with 3 hours frequency and I
need
> data with one hour frequency. I have made a second df as follows with one
> hour frequency for the "ws" column.
> 
> df2 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 8760), d =
> rep(c(1:365,1:365,1:365,1:365,1:365),each=24),
>   h = rep(c(1:24,1:24,1:24,1:24,1:24),365), ws = "NA")
> > head(df2)
>   y  dh   ws
> 1  2010  11   NA
> 2  2010  12   NA
> 3  2010  13   NA
> 4  2010  14   NA
> 5  2010  15   NA
> 6  2010  16   NA
> 
> What I am trying to do is combine these two dataframes so as to the rows
in
> df1 (based on the values of "y", "d", "h" columns) that have values
exactly
> similar to df2's rows copied in its place in the new df (df3).
> For example, in the first dataframe the first row was measured at 3
o'clock
> on the first day of 2010 and this row must be placed on the third row of
> the second dataframe which has a similar value (2010, 1, 3). Like the
below
> table:
>   y  dh   ws
> 1  2010  11   NA
> 2  2010  12   NA
> 3  2010  13   20.71488
> 4  2010  14   NA
> 5  2010  15   NA
> 6  2010  16   19.70125
> 
> But regarding the values of the "ws" column for df2 that do not have value
> (at 4 and 5 o'clock), I need to interpolate between the before and after
> values to fill in the missing data of the "ws".
> I have tried the following codes but they did not work correctly.
> 
> > df3 = merge(df1, df2, by = "y")
> Error: cannot allocate vector of size 487.9 Mb
> or
> > library(dplyr)
> > df3<- df1%>% full_join(df2)
> 
> 
> Is there any way to do this?
> Sincerely
> 
> 
> 
> 
> 
> --
> Best Regards
> Javad Bayat
> M.Sc. Environment Engineering
> Alternative Mail: bayat...@yahoo.com
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need to insert various rows of data from a data frame after particular rows from another dataframe

2022-07-27 Thread PIKAL Petr
Hi.

 

"Is not working" is extremely vague.

 

1.

What do you expect this code to do?

 

for(cr in seq_along(dacnet_17$district)){

match(arhar_18$district, dacnet_17$district)

}

 

See ?match and maybe also ?“for“ and try this

 

x <- letters[1:5]

y <- sample(letters, 100, replace=T)

match(x,y)

[1] 45 16 24 13 71

for(i in 1:3) match(x,y)

 

2.

rbind works as expected

 

arhar_18 <- data.frame(a=1:10, b=50, c=letters[1:10])

dacnet_17 <- data.frame(a=11:20, b=100, c=sample(letters,10))

df3=rbind(arhar_18,dacnet_17)

 

if your data frames have a common column order and type.

 

To get a more specific answer you need to ask a specific question, preferably
with some data included (ideally via the dput command) and the error message.

 

Cheers

Petr

 

 

From: Ranjeet Kumar Jha  
Sent: Wednesday, July 27, 2022 8:35 AM
To: PIKAL Petr 
Cc: R-help 
Subject: Re: [R] Need to insert various rows of data from a data frame after 
particular rows from another dataframe

 

Hi Petr,

 

I used r-bind but it's not working.

Here is the code:

 

arhar_18<-read.csv("D:/Ranjeet/IAMV6/input/yield/kharif_18-19_yield/Kharif_2018/arhar_18.csv")

dacnet_17<-read.csv("D:/Ranjeet/IAMV6/input/yield/dacnet_yield_update till 
2019.csv")

 

for(cr in seq_along(dacnet_17$district)){

match(arhar_18$district, dacnet_17$district)

}

 

df3=rbind(arhar_18,dacnet_17)

df3=df3[order(df3$district,df3$year),]

x<-write.csv(df3,"df3.csv")

view(x)

 

On Wed, Jul 27, 2022 at 12:00 PM PIKAL Petr mailto:petr.pi...@precheza.cz> > wrote:

Hi.

>From what you say, plain "rbind" could be used, if the columns in both sets
are the same and in the same order. After that you can reorder the resulting
data frame as you wish by "order". AFAIK for most functions row order in
data frame does not matter.

Cheers
Petr

> -Original Message-
> From: R-help  <mailto:r-help-boun...@r-project.org> > On Behalf Of Ranjeet Kumar Jha
> Sent: Monday, July 25, 2022 3:03 PM
> To: R-help mailto:r-help@r-project.org> >
> Subject: [R] Need to insert various rows of data from a data frame after
> particular rows from another dataframe
> 
> Hello Everyone,
> 
> I have dataset in a particular format in "dacnet_yield_update till
2019.xlsx" file,
> where I need to insert the data of rows 2018-2019 and
> 2019-2020 for the districts those data are available in "Kharif crops
yield_18-
> 19.xlsx".  I need to insert these two rows of data belonging to every
district, if
> data is available in a later excel file, just after the particular crop
group data for
> the particular district.
> 
> I have put the data file in the given link.
> https://drive.google.com/drive/u/0/folders/1dNmGTI8_c9PK1QqmfIjnpbyzuiC
> XgxFC
> 
> Please help solving this problem.
> 
> Regards and Thanks,
> Ranjeet
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org <mailto:R-help@r-project.org>  mailing list -- To 
> UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.




 

-- 

Ranjeet  Kumar Jha, M.Tech. (IIT Kharagpur), Ph.D. (USA)

https://www.linkedin.com/in/ranjeet-kumar-jha-ph-d-usa-73a5aa56

---
Email:  <mailto:ranjeetjhaiit...@gmail.com> ranjeetjhaiit...@gmail.com





"Simple Heart, Humble Attitude and Surrender to Supreme Being make our lives 
beautiful!"



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need to insert various rows of data from a data frame after particular rows from another dataframe

2022-07-27 Thread PIKAL Petr
Hi.

>From what you say, plain "rbind" could be used, if the columns in both sets
are the same and in the same order. After that you can reorder the resulting
data frame as you wish by "order". AFAIK for most functions row order in
data frame does not matter.

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Ranjeet Kumar Jha
> Sent: Monday, July 25, 2022 3:03 PM
> To: R-help 
> Subject: [R] Need to insert various rows of data from a data frame after
> particular rows from another dataframe
> 
> Hello Everyone,
> 
> I have dataset in a particular format in "dacnet_yield_update till
2019.xlsx" file,
> where I need to insert the data of rows 2018-2019 and
> 2019-2020 for the districts those data are available in "Kharif crops
yield_18-
> 19.xlsx".  I need to insert these two rows of data belonging to every
district, if
> data is available in a later excel file, just after the particular crop
group data for
> the particular district.
> 
> I have put the data file in the given link.
> https://drive.google.com/drive/u/0/folders/1dNmGTI8_c9PK1QqmfIjnpbyzuiC
> XgxFC
> 
> Please help solving this problem.
> 
> Regards and Thanks,
> Ranjeet
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] split apply on multiple variables

2022-07-01 Thread PIKAL Petr
Hi

Maybe
lapply(mydf.split, function(x) aggregate(x[,4:5], list(x$date), sum))

That fits your particular case, but it lacks overall generality.
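
A variant that avoids the split and does not depend on column positions might
be (untested):

aggregate(cbind(profit, sales) ~ account + date, data = mydf, FUN = sum)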

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Naresh Gurbuxani
> Sent: Friday, July 1, 2022 1:08 PM
> To: r-help@r-project.org
> Subject: [R] split apply on multiple variables
> 
> 
> I am looking for a more general solution to below exercise.
> 
> Thanks,
> Naresh
> 
> library(plyr)
> mydf <- data.frame(
> date = rep(seq.Date(from = as.Date("2022-06-01"), by = 1, length.out =
> 10), 4),
> account = c(rep("ABC", 20), rep("XYZ", 20)),
> client = c(rep("P", 10), rep("Q", 10), rep("R", 10), rep("S", 10)),
> profit = round(runif(40, 2, 5), 2), sales = round(runif(40, 10, 20), 2))
> 
> mydf.split <- split(mydf, mydf$account)
> 
> # if there are 10 variables like sales, profit, etc., need 10 lines
> myres <- lapply(mydf.split, function(df) {
> sales.ts <- aggregate(sales ~ date, FUN = sum, data = df) #one step for
both?
> profit.ts <- aggregate(profit ~ date, FUN = sum, data = df)
> merge(profit.ts, sales.ts, by = "date")})
> 
> myres.df <- ldply(myres)
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dplyr question

2022-06-23 Thread PIKAL Petr
Hallo all

Various suggestions were made, but for this simple task I immediately thought
of the reshape2 melt function.

x <- "Time_stampP1A0B0D P190-90D
'Jun-10 10:34'  -0.000208   -0.000195
'Jun-10 10:51'  -0.000228   -0.000188
'Jun-10 11:02'  -0.000234   -0.000204
'Jun-10 11:17'  -0.00022-0.000205
'Jun-10 11:25'  -0.000238   -0.000195"
df1 <- read.table(textConnection(x), header = TRUE, check.names = FALSE)

library(reshape2)
melt(df1)
Using Time_stamp as id variables
 Time_stamp variable value
1  Jun-10 10:34  P1A0B0D -0.000208
2  Jun-10 10:51  P1A0B0D -0.000228
3  Jun-10 11:02  P1A0B0D -0.000234
4  Jun-10 11:17  P1A0B0D -0.000220
5  Jun-10 11:25  P1A0B0D -0.000238
6  Jun-10 10:34 P190-90D -0.000195
7  Jun-10 10:51 P190-90D -0.000188
8  Jun-10 11:02 P190-90D -0.000204
9  Jun-10 11:17 P190-90D -0.000205
10 Jun-10 11:25 P190-90D -0.000195

You only need to rename the columns if necessary.
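Or name them directly in the melt call (sketch):

melt(df1, id.vars = "Time_stamp",
     variable.name = "Location", value.name = "Measurement")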
Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Richard O'Keefe
> Sent: Thursday, June 23, 2022 2:29 AM
> To: Thomas Subia 
> Cc: r-help@r-project.org
> Subject: Re: [R] Dplyr question
> 
> Why do you want to use dplyr?
> It's easy using base R.
> 
> original <- ...
> a <- cbind(original[,-3], Location=colnames(original)[2])
> colnames(a)[2] <- "Measurement"
> b <- cbind(original[,-2], Location=colnames(original)[3])
> colnames(b)[2] <- "Measurement"
> result <- rbind(a, b)[,c(1,3,2)]
> 
> 
> 
> 
> On Wed, 22 Jun 2022 at 04:23, Thomas Subia
> 
> wrote:
> 
> > Colleagues:
> >
> > The header of my data set is:
> > Time_stamp  P1A0B0D P190-90D
> > Jun-10 10:34    -0.000208   -0.000195
> > Jun-10 10:51    -0.000228   -0.000188
> > Jun-10 11:02    -0.000234   -0.000204
> > Jun-10 11:17    -0.00022    -0.000205
> > Jun-10 11:25    -0.000238   -0.000195
> >
> > I want my data set to resemble:
> >
> > Time_stamp      Location    Measurement
> > Jun-10 10:34    P1A0B0D     -0.000208
> > Jun-10 10:51    P1A0B0D     -0.000228
> > Jun-10 11:02    P1A0B0D     -0.000234
> > Jun-10 11:17    P1A0B0D     -0.00022
> > Jun-10 11:25    P1A0B0D     -0.000238
> > Jun-10 10:34    P190-90D    -0.000195
> > Jun-10 10:51    P190-90D    -0.000188
> > Jun-10 11:02    P190-90D    -0.000204
> > Jun-10 11:17    P190-90D    -0.000205
> > Jun-10 11:25    P190-90D    -0.000195
> >
> > I need some advice on how to do this using dplyr.
> >
> > V/R
> > Thomas Subia
> >
> > FM Industries, Inc. - NGK Electronics, USA | www.fmindustries.com
> > 221 Warren Ave, Fremont, CA 94539
> >
> > "En Dieu nous avons confiance, tous les autres doivent apporter des
> > donnees"
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] missing values error in if statement

2022-05-20 Thread PIKAL Petr
Hm,

what do **you** mean by fraction

This is what you posted

> >> >Error in if (fraction <= 1) { : missing value where TRUE/FALSE
> >> >needed
On May 19, 2022 2:30:58 PM PDT, Neha gupta

I just showed that if object **fraction** is NA, it results exactly in the 
error you posted. From where is this object I do not know.
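
The mechanics can be reproduced and guarded against like this (a small sketch
only; the real fix is to find out why fraction becomes NA in the first place):

fraction <- NA
if (isTRUE(fraction <= 1)) print(5) else print("fraction is NA or > 1")
# isTRUE() turns the NA comparison into FALSE instead of stopping with an error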

From the help page:
protected
factor, protected variable (also called sensitive attribute), containing 
privileged and unprivileged groups

should be a factor, which yours is not. Maybe it does not matter, but you could
try to change it to one with the as.factor function.

You probably modified the code from the help page, but in that case you should
check whether all your objects have the same mode and structure as the objects
in the help page code.

Cheers
Petr

From: Neha gupta 
Sent: Friday, May 20, 2022 3:22 PM
To: PIKAL Petr 
Cc: r-help mailing list 
Subject: Re: [R] missing values error in if statement

What do you mean by "fraction" ?

traceback()
4: readable_number(max_value - min_value, FALSE)
3: get_nice_ticks(lower_bound, upper_bound)
2: plot.fairness_object(fc)
1: plot(fc)

On Fri, May 20, 2022 at 3:18 PM PIKAL Petr <mailto:petr.pi...@precheza.cz> 
wrote:
Hallo

From what you say, the error comes from

> fraction <- NA
> if(fraction <= 1) print(5)
Error in if (fraction <= 1) print(5) :
  missing value where TRUE/FALSE needed
>
so somewhere fraction is set to NA during your code.

I would consult traceback, you could try debug used functions but maybe you
should start with explainer, prot and privileged, if they are as expected by
fairness_check

> > fc= fairness_check(explainer,
> >   protected = prot,
> >privileged = privileged)

Cheers
Petr

> -Original Message-
> From: R-help <mailto:r-help-boun...@r-project.org> On Behalf Of Neha gupta
> Sent: Friday, May 20, 2022 3:03 PM
> To: Jeff Newmiller <mailto:jdnew...@dcn.davis.ca.us>
> Cc: r-help mailing list <mailto:r-help@r-project.org>
> Subject: Re: [R] missing values error in if statement
>
> Actually I am not very sure where exactly the error raised but when I run
the
> plot(fc) , it shows the error.
>
> I checked it online and people suggested that it may come with missing
> values in 'if' or 'while; statements etc.
>
> I do not know how your code works and mine not.
>
> Best regards
>
> On Fri, May 20, 2022 at 10:16 AM Neha gupta
> <mailto:neha.bologn...@gmail.com>
> wrote:
>
> > I am sorry.. The code is here and data is provided at the end of this
> > email.
> >
> > data = readARFF("aho.arff")
> >
> > index= sample(1:nrow(data), 0.7*nrow(data)) train= data[index,] test=
> > data[-index,]
> >
> > task = TaskClassif$new("data", backend = train, target = "isKilled")
> > learner= lrn("classif.randomForest", predict_type = "prob") model=
> > learner$train(task )
> >
> > ///explainer is created to identify a bias in a particular feature
> > i.e. CE feature in this case
> >
> > explainer = explain_mlr3(model,
> >  data = test[,-15],
> >  y = as.numeric(test$isKilled)-1,
> >  label="RF")
> > prot <- ifelse(test$CE == '2', 1, 0)   /// Error comes here
> > privileged <- '1'
> >
> >
> > fc= fairness_check(explainer,
> >   protected = prot,
> >privileged = privileged)
> > plot(fc)
> >
> >
> > // my data is
> >
> > dput(test)
> > structure(list(DepthTree = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> > 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 2, 1), NumSubclass = c(0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 2), McCabe = c(1, 1, 1, 3, 3, 3, 3, 1, 2, 3, 3, 3,
> > 3, 3,

Re: [R] missing values error in if statement

2022-05-20 Thread PIKAL Petr
Hi 

Strange, you say

> prot <- ifelse(test$CE == '2', 1, 0)   /// Error comes here

but with your data

ifelse(test$CE == '2', 1, 0)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[112] 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 1 1

the code runs smoothly without error.

Cheers
Petr


> -Original Message-
> From: R-help  On Behalf Of Neha gupta
> Sent: Friday, May 20, 2022 10:16 AM
> To: Jeff Newmiller 
> Cc: r-help mailing list 
> Subject: Re: [R] missing values error in if statement
> 
> I am sorry.. The code is here and data is provided at the end of this
> email.
> 
> data = readARFF("aho.arff")
> 
> index= sample(1:nrow(data), 0.7*nrow(data))
> train= data[index,]
> test= data[-index,]
> 
> task = TaskClassif$new("data", backend = train, target = "isKilled")
> learner= lrn("classif.randomForest", predict_type = "prob")
> model= learner$train(task )
> 
> ///explainer is created to identify a bias in a particular feature i.e. CE
> feature in this case
> 
> explainer = explain_mlr3(model,
>  data = test[,-15],
>  y = as.numeric(test$isKilled)-1,
>  label="RF")
> prot <- ifelse(test$CE == '2', 1, 0)   /// Error comes here
> privileged <- '1'
> 
> 
> fc= fairness_check(explainer,
>   protected = prot,
>privileged = privileged)
> plot(fc)
> 
> 
> // my data is
> 
> dput(test)
> structure(list(DepthTree = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2,
> 2, 2, 2, 1, 1, 1, 1, 2, 1), NumSubclass = c(0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2), McCabe = c(1, 1, 1,
> 3, 3, 3, 3, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 1, 2, 2, 1,
> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5,
> 5, 5, 5, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 5, 5,
> 5, 5, 5, 5, 5, 5, 2, 2, 2, 2, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
> 5, 5, 5, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 4, 4, 1, 1, 2, 2, 2, 2,
> 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 1, 1), LOC = c(3,
> 3, 4, 10, 10, 10, 10, 4, 5, 22, 22, 22, 22, 22, 22, 22, 22, 3,
> 3, 3, 3, 8, 8, 4, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
> 23, 23, 23, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 8, 8,
> 16, 16, 16, 16, 16, 16, 16, 16, 16, 20, 20, 20, 20, 20, 20, 20,
> 20, 20, 20, 20, 20, 7, 7, 7, 7, 18, 18, 18, 18, 18, 18, 15, 15,
> 15, 15, 15, 15, 15, 15, 6, 6, 6, 15, 15, 15, 15, 15, 15, 9, 9,
> 9, 9, 9, 9, 9, 4, 4, 3, 3, 3, 3, 4, 4, 4, 5, 8, 8, 3, 3, 3, 7,
> 7, 3, 3, 15, 15, 15, 15, 15, 15, 15, 15, 3, 3, 3, 4, 4, 4, 4,
> 8, 8, 8, 8, 4, 3), DepthNested = c(1, 1, 1, 2, 2, 2, 2, 1, 2,
> 4, 4, 4, 4, 4, 4, 4, 4, 1, 1, 1, 1, 2, 2, 1, 3, 3, 3, 3, 3, 3,
> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1,
> 1, 1, 1, 1, 1, 2, 2, 2, 2, 1, 1), CA = c(1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 1, 1), CE = c(2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0,
> 

Re: [R] ggplot pointrange from other df and missing legend

2022-05-16 Thread PIKAL Petr
Hallo Rui

Thanks. After I sent the first mail I noticed the missing tablePal.

palette("Tableau 10")
tablePal <- palette()

ggplot() +
   geom_col(
 data=test,
 aes(x=Sample, y=value, fill=SSAtype),
 position="dodge"
   ) +
   geom_pointrange(
 data=test.ag,
 aes(x=Sample, y=avg, ymin=avg-odch, ymax=avg+odch),
 size = 2
   ) +
   scale_fill_manual(
values=tablePal,
 name="Calculation\n performed\n according to",
 labels=c("bla", "blabla")
   ) +
   scale_shape_manual(name = "Measured", values = 19, labels = NULL)

You are correct that the legend for the bars is there, but the legend for the
points is missing. If I use only geom_point with values from the "test" data, a
legend for the points appears as well. Is it possible that the second legend can
only be created if all data come from one data frame?
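
Perhaps the cause is that ggplot2 only builds a legend for aesthetics mapped
inside aes(); an untested sketch that maps the shape from the identity column of
test.ag in the pointrange layer (the rest of the plot stays as above):

geom_pointrange(
  data = test.ag,
  aes(x = Sample, y = avg, ymin = avg - odch, ymax = avg + odch, shape = identity),
  size = 2
) +
scale_shape_manual(name = "Measured", values = 19)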

Best regards
Petr

> -Original Message-
> From: Rui Barradas 
> Sent: Monday, May 16, 2022 1:53 PM
> To: PIKAL Petr ; r-help@r-project.org
> Subject: Re: [R] ggplot pointrange from other df and missing legend
> 
> Hello,
> 
> In the code below the legend is not missing, I couldn't reproduce that error.
> As for the size of the points, include argument size. I have size=2.
> 
> Also:
>   - tablePal is missing, I've chosen colors 2:3;
>   - I have commented out the guides(.), you are setting the shape after so it
> makes no difference.
> 
> 
> ggplot() +
>geom_col(
>  data=test,
>  aes(x=Sample, y=value, fill=SSAtype),
>  position="dodge"
>) +
>geom_pointrange(
>  data=test.ag,
>  aes(x=Sample, y=avg, ymin=avg-odch, ymax=avg+odch),
>  size = 2
>) +
>scale_fill_manual(
>  #values=tablePal,
>  values = 2:3,
>  name="Calculation\n performed\n according to",
>  labels=c("bla", "blabla")
>) +
>    #guides(fill = guide_legend(override.aes = list(shape = NA))) +
>scale_shape_manual(name = "Measured", values = 19, labels = NULL)
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> 
> Às 12:15 de 16/05/2022, PIKAL Petr escreveu:
> > Hallo all
> >
> >
> >
> > Here are the data from dput
> >
> >
> >
> > test <- structure(list(Sample = c("A", "A", "A", "A", "A", "A", "B",
> >
> > "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C"), SSAtype = c("one",
> >
> > "one", "one", "two", "two", "two", "one", "one", "one", "two",
> >
> > "two", "two", "one", "one", "one", "two", "two", "two"), value =
> > c(8.587645149,
> >
> > 8.743793651, 8.326440422, 9.255940687, 8.971931555, 8.856323865,
> >
> > 9.650809096, 9.725504448, 9.634449367, 9.69485369, 9.526758476,
> >
> > 10.03758001, 10.76845392, 10.66891602, 10.34894497, 10.76284989,
> >
> > 10.53074081, 11.16464528), SSAmeasuredP = c(8.3, 8.3, 8.3, 8.3,
> >
> > 8.3, 8.3, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 11, 11, 11, 11, 11, 11
> >
> > ), identity = c("point", "point", "point", "point", "point",
> >
> > "point", "point", "point", "point", "point", "point", "point",
> >
> > "point", "point", "point", "point", "point", "point")), row.names = c(NA,
> >
> > 18L), class = "data.frame")
> >
> >
> >
> > test.ag <- structure(list(Sample = c("A", "B", "C"), avg = c(8.3, 9.5, 11
> >
> > ), odch = c(0.2, 0.4, 0.3), identity = c("point", "point", "point"
> >
> > )), row.names = c(NA, 3L), class = "data.frame")
> >
> >
> >
> > I try to make some relatively simple barplot and I wanted to add points to
> > it which I somehow did.
> >
> >
> >
> > library(ggplot2)
> >
> >
> >
> > p <- ggplot(test, aes(x=Sample, y=value, fill=SSAtype))
> >
> > p+geom_col(position="dodge")+geom_point(aes(y=SSAmeasuredP,
> shape=identity),
> > size=5)+
> >
> > scale_fill_manual(values=tablePal, name="Calculation\n performed\n
> according
> > to",
> >
> >

[R] ggplot pointrange from other df and missing legend

2022-05-16 Thread PIKAL Petr
Hallo all

 

Here are the data from dput

 

test <- structure(list(Sample = c("A", "A", "A", "A", "A", "A", "B",
"B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C"), SSAtype = c("one",
"one", "one", "two", "two", "two", "one", "one", "one", "two",
"two", "two", "one", "one", "one", "two", "two", "two"), value = c(8.587645149,
8.743793651, 8.326440422, 9.255940687, 8.971931555, 8.856323865,
9.650809096, 9.725504448, 9.634449367, 9.69485369, 9.526758476,
10.03758001, 10.76845392, 10.66891602, 10.34894497, 10.76284989,
10.53074081, 11.16464528), SSAmeasuredP = c(8.3, 8.3, 8.3, 8.3,
8.3, 8.3, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 11, 11, 11, 11, 11, 11
), identity = c("point", "point", "point", "point", "point",
"point", "point", "point", "point", "point", "point", "point",
"point", "point", "point", "point", "point", "point")), row.names = c(NA,
18L), class = "data.frame")

test.ag <- structure(list(Sample = c("A", "B", "C"), avg = c(8.3, 9.5, 11
), odch = c(0.2, 0.4, 0.3), identity = c("point", "point", "point"
)), row.names = c(NA, 3L), class = "data.frame")

 

I am trying to make a relatively simple barplot and I wanted to add points to
it, which I somehow managed.

 

library(ggplot2)

 

p <- ggplot(test, aes(x=Sample, y=value, fill=SSAtype))

p + geom_col(position="dodge") +
  geom_point(aes(y=SSAmeasuredP, shape=identity), size=5) +
  scale_fill_manual(values=tablePal,
                    name="Calculation\n performed\n according to",
                    labels=c("bla", "blabla")) +
  guides(fill = guide_legend(override.aes = list(shape=NA))) +
  scale_shape_manual(name = "Measured", values=19, labels=NULL)

 

But instead of points I want to use pointrange with data from another data
frame. I found some help and all is good except the size of the points and the
missing legend.

 

ggplot() +
  geom_col(data=test, aes(x=Sample, y=value, fill=SSAtype), position="dodge") +
  scale_fill_manual(values=tablePal,
                    name="Calculation\n performed\n according to",
                    labels=c("bla", "blabla")) +
  geom_pointrange(data=test.ag,
                  aes(x=Sample, y=avg, ymin=avg-odch, ymax=avg+odch)) +
  guides(fill = guide_legend(override.aes = list(shape=NA))) +
  scale_shape_manual(name = "Measured", values=19, labels=NULL)

 

Although I will try to find some workable way myself, I would also like to ask
the R gurus for help; maybe I have overlooked some simple way to do it.

 

Best regards

Petr

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] customizing edit.data.frame

2022-05-02 Thread PIKAL Petr
Hallo

I do not have much experience with Linux, RStudio and ESS, but you can customise
R startup with the Rprofile.site and Rconsole files, which are situated in the
etc directory of your installation.

You can find some info about it in the R-intro manual, chapter
10.8 Customizing the environment.

Cheers
Petr
> -Original Message-
> From: R-help  On Behalf Of Jeremie Juste
> Sent: Saturday, April 30, 2022 11:54 AM
> To: R-help@R-project.org
> Subject: [R] customizing edit.data.frame
> 
> Hello,
> 
> I was wondering how to customize the grid color of the GUI from the
> following command?
> 
> edit(data.frame())
> 
> The default grid color is red while on linux it is black. I also found out
that
> one can customize the mentioned color using the native R GUI in the menu
> preferences. After saving the preferences a file named Rconsole is
created.
> 
> I would like to know if there is a way to set the grid color directly when
> launching R from the terminal or RStudio or ESS?
> 
> Best regards,
> Jeremie
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Violin plots in R

2022-04-01 Thread PIKAL Petr
Hi.

Besides the advice of others, at least
str(yourdata)
and the ggplot code are the minimum we need to be able to offer some advice.

The preferable way to send your data is the output from

dput(yourdata) or dput(head(yourdata))

Just copy and paste it into your email.
Just copy paste to your email.

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Ebert,Timothy
> Aaron
> Sent: Friday, April 1, 2022 11:35 AM
> To: pooja sinha ; r-help mailing list  project.org>
> Subject: Re: [R] Violin plots in R
> 
> No data. Attachments are removed. Send only text. Include program. Did you
> look at geom_violin() ? try http://www.sthda.com/english/wiki/ggplot2-
> violin-plot-quick-start-guide-r-software-and-data-visualization
> 
> If geom_point() is working then geom_violin() should also work. If neither
> work then the problem is earlier in the program. Possibly in the ggplot()
> statement, or earlier.
> Tim
> 
> -Original Message-
> From: R-help  On Behalf Of pooja sinha
> Sent: Thursday, March 31, 2022 7:00 PM
> To: r-help mailing list 
> Subject: [R] Violin plots in R
> 
> [External Email]
> 
> Hi All,
> 
> I need your help in making the violin plot in R using the data which is
> attached herewith. I am new to R and having issues in tidying my data for
R. I
> am trying the code but I am not able to tidy my data for violin plot in
ggplot.
> 
> Any help will be highly appreciated.
> 
> Thanks,
> Puja
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__stat.ethz.ch_mailman_listinfo_r-2Dhelp=DwICAg=sJ6xIWYx-
> zLMB3EPkvcnVg=9PEhQh2kVeAsRzsn7AkP-g=ewsSBA7yD-
> EGNgEo5uwx8ypGde2s8cN0NqRJcjUeat_oxJOzxc4u1RoS4APbTp-G=5RT-
> EbmPWAiqxd-tCyTCK3tUPOmNomYvRoE-pgEKXMk=
> PLEASE do read the posting guide
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.R-
> 2Dproject.org_posting-2Dguide.html=DwICAg=sJ6xIWYx-
> zLMB3EPkvcnVg=9PEhQh2kVeAsRzsn7AkP-g=ewsSBA7yD-
> EGNgEo5uwx8ypGde2s8cN0NqRJcjUeat_oxJOzxc4u1RoS4APbTp-
> G=FhxhDOcldxc5mIg1wiE5kq4D_sxIm1Ho0PbejAHl8xY=
> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pcalg library : estimated DAG graph

2022-03-31 Thread PIKAL Petr
Hallo

It is not obvious what your problem is. Maybe you should go through some
articles about the functions in the package.
https://pdfs.semanticscholar.org/aee5/cca63aad422c96ee637b5561bb051724d76c.pdf?_ga=2.241710466.1872226166.1648790330-1600227915.1617009045

or maybe you do not have Rgraphviz package installed as required by plotting 
functions of the package.

R> stopifnot(require(Rgraphviz))# needed for all our graph plots
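
If installing Rgraphviz is a problem, a rough sketch using igraph instead
(untested; check the pcalg documentation for the edge direction convention of
the adjacency matrix):

library(igraph)
amat <- as(res1, "amat")
g <- graph_from_adjacency_matrix(1 * (t(amat) != 0), mode = "directed")
plot(g)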

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of varin sacha via
> R-help
> Sent: Friday, April 1, 2022 12:09 AM
> To: r-help@r-project.org
> Subject: [R] pcalg library : estimated DAG graph
>
> Dear R-experts,
>
> Here below my R code working but I don't know how I can get the graph
> (estimated DAG).
> If you could help me to get the graph, many thanks.
>
>
> ###
> library(pcalg)
>
> x1<-
> c(508,413,426,500,568,372,484,512,529,322,544,586,480,561,567,488,450,
> 548,526,561,435,567,537,521,516,407,531,374,406,595,460,420,453,562,53
> 0)
>
> x2<-
> c(531,491,353,522,341,493,431,565,530,441,403,498,552,513,513,403,445,
> 424,529,486,519,492,397,579,479,511,535,504,465,520,517,528,542,483,49
> 9)
>
> x3<-
> c(541,451,510,649,329,464,511,609,643,530,568,366,371,442,611,437,445,
> 589,605,456,437,179,540,580,587,540,505,310,542,488,525,483,200,517,51
> 3)
>
> x4<-
> c(488,473,449,564,447,593,420,685,597,534,608,389,557,385,564,449,530,
> 615,502,510,412,321,509,480,469,594,506,431,555,567,491,414,359,418,46
> 8)
>
> x5<-
> c(487,430,419,583,469,369,540,637,563,328,498,448,356,552,521,417,513,
> 570,530,594,372,537,469,454,554,518,550,384,533,594,467,471,590,552,55
> 6)
>
> x6<-
> c(511,452,432,563,431,458,477,601,572,431,524,458,463,491,555,439,477,
> 549,539,521,435,419,490,523,521,514,525,401,500,553,492,463,429,506,51
> 3)
>
> X=cbind(x1,x2,x3,x4,x5,x6)
>
> res1=lingam(X,verbose=TRUE)
> cat("estimated DAG:/n")
> as(res1,"amat")
> res1
> ###
>
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] NA's in segmented

2022-03-10 Thread PIKAL Petr
Hi

1. Do not use HTML formatted mail, your message gets scrambled.
2. With only your code, the result all of us get is an error:

> Mod = lm(as.numeric(data[,"meanMSD25"])~(as.numeric(data[,"Slice"])), 
> weights=1*sqrt(as.numeric(data[,"Detec"])))
Error in data[, "meanMSD25"] : 
  object of type 'closure' is not subsettable
Try to provide a reproducible example.

3. With lm it is not necessary to use data[,"meanMSD25"]; you should pass your
data object to the data argument of the lm call (and it is unwise to call your
data "data" - fortune("dog") applies here).
4. Why do you use as.numeric(data[,"meanMSD25"])? Shouldn't the column already
be numeric?

So my wild guess is that you read the data into R the wrong way, and instead of
numeric the columns are character, which could be one source of your problem.
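
A sketch of what the call could look like once the data are read in properly
(the file name here is hypothetical, adjust to however the data really arrive):

dat <- read.csv("msd_data.csv")   # hypothetical file name
str(dat)                          # meanMSD25, Slice and Detec should all be numeric
Mod <- lm(meanMSD25 ~ Slice, data = dat, weights = sqrt(Detec))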

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Mélina Cointe
> Sent: Tuesday, March 8, 2022 9:45 PM
> To: r-help@r-project.org
> Subject: Re: [R] NA's in segmented
> 
> Hi,
> 
> I'm contacting you because I have some trouble with the slope() function of
> segmented. When I plot the predicted value everything seems to have worked
> well. But when I use slope(), the slope for the first segment is a line with 0
> and NAs... Also I can get a negative slope whereas on the graph it's clearly a
> positive slope...
> 
> Here are the lines I'm using:
> 
> Mod = lm(as.numeric(data[,"meanMSD25"])~(as.numeric(data[,"Slice"])),
> weights=1*sqrt(as.numeric(data[,"Detec"])))
>   x = (as.numeric(data[,"Slice"]))
>   o <- segmented(Mod, seg.Z=~x, psi=NA, control=seg.control(display=FALSE,
> K=2))
> 
> Thanks in advance for your help,
> 
> Best regards,
> 
> Mélina COINTE
> 
> 
> 
>   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question About lm()

2022-02-09 Thread PIKAL Petr
Hi

Is this enough of an explanation?

https://stats.stackexchange.com/questions/26176/removal-of-statistically-sig
nificant-intercept-term-increases-r2-in-linear-mo

https://stackoverflow.com/questions/57415793/r-squared-in-lm-for-zero-interc
ept-model
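
In short, with the intercept removed summary.lm() computes R-squared against a
zero baseline instead of against mean(y), so the two values are not comparable.
A small sketch of the effect with a built-in data set (my example, not taken
from the original post):

fit1 <- lm(Sepal.Length ~ Species + Sepal.Width, data = iris)      # with intercept
fit2 <- lm(Sepal.Length ~ Species + Sepal.Width - 1, data = iris)  # cell means coding
all.equal(fitted(fit1), fitted(fit2))  # identical fitted values
summary(fit1)$r.squared
summary(fit2)$r.squared                # much larger, SS total is taken about 0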

Cheers
Petr
> -Original Message-
> From: R-help  On Behalf Of Bromaghin,
Jeffrey
> F via R-help
> Sent: Wednesday, February 9, 2022 11:01 PM
> To: r-help@r-project.org
> Subject: [R] Question About lm()
> 
> Hello,
> 
> I was constructing a simple linear model with one categorical (3-levels)
and one
> quantitative predictor variable for a colleague. I estimated model
parameters
> with and without an intercept, sometimes called reference cell coding and
cell
> means coding.
> 
> Model 1: yResp ~ -1 + xCat + xCont
> Model 2: yResp ~ xCat + xCont
> 
> These models are equivalent and the estimated coefficients come out fine,
but
> the R-squared and F statistics returned by summary() differ markedly. I
spent
> some time looking at the code for both lm() and summary.lm() but did not
find
> the source of the difference. aov() and anova() results also differ, so I
suspect
> the issue involves how the sums of squares are being computed. I've also
spent
> some time trying to search online for information on this, without
success. I
> haven't used lm() for quite a while, but my memory is that these
differences
> didn't occur in the distant past when I was teaching.
> 
> Thanks in advance for any insights you might have, Jeff
> 
> Jeffrey F. Bromaghin
> Research Statistician
> USGS Alaska Science Center
> 907-786-7086
> Jeffrey Bromaghin, Ph.D. | U.S. Geological Survey
> (usgs.gov)
> Ecosystems Analytics | U.S. Geological Survey
> (usgs.gov) center/science/ecosystems-analytics>
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed

2022-01-26 Thread PIKAL Petr
Hi

Actually you did not. Your original question was:

> Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed
> I used this:
> var <- ifelse(test$operator == 'T14', 1, 0)
> operator has several values like T1, T3, T7, T15, T31, T37
> For some values like T3, T7 it works fine but for majority of values
> it gives error.
> When I use: is.na(ts$operator), it shows all false values so no NAs.

Only now could we inspect your whole code, and it was already pointed out that
the error does not originate from ifelse.

With the same data and ifelse code I did not get any error.

test <- structure(list(DepthTree = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,


> str(test)
'data.frame':   146 obs. of  15 variables:
 $ DepthTree: num  1 1 1 1 1 1 1 1 1 1 ...
 
$ numCovered   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ operator : Factor w/ 16 levels "T0","T1","T2",..: 4 4 7 8 8 8 11 4 10 7 
...
 $ methodReturn : Factor w/ 22 levels "I","V","Z","method",..: 2 2 2 2 2 2 2 4 
4 2 ...
 $ numTestsCover: num  16 15 15 16 15 15 15 4 4 16 ...
 $ mutantAssert : num  55 55 55 55 55 55 55 13 13 55 ...
 $ classAssert  : num  3 3 3 3 3 3 3 3 3 3 ...
 $ isKilled : Factor w/ 2 levels "yes","no": 2 2 2 2 2 2 2 2 2 2 ...
>
prot <- ifelse(test$operator == 'T13', 1, 0)

the most probable source of the error is

fc= fairness_check(explainer,
  protected = prot,
   privileged = privileged)

so you should check explainer and privileged
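
A few quick sanity checks before the fairness_check() call might help (just a
sketch, assuming the explainer stores y the way a DALEX explainer does; I cannot
tell from the posted code which of these, if any, is the culprit):

str(explainer$y)                # should be numeric 0/1 without NA
table(prot, useNA = "ifany")    # watch for NA or an empty group
length(prot) == nrow(test)      # one protected value per test row
privileged %in% unique(prot)    # privileged must match one of the groups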

Cheers
Petr

From: javed khan 
Sent: Wednesday, January 26, 2022 3:45 PM
To: PIKAL Petr 
Cc: R-help 
Subject: Re: [R] Error in if (fraction <= 1) { : missing value where 
TRUE/FALSE needed

Hi Pikal, why would I hide something? I provided just a code where error is.

Full code is:

index= sample(1:nrow(data), 0.7*nrow(data))
train= data[index,]
test= data[-index,]


task = TaskClassif$new("data", backend = train, target = "isKilled")

learner= lrn("classif.gbm", predict_type = "prob")

model= learner$train(task )

explainer = explain_mlr3(model,
 data = test[,-15],
 y = as.numeric(test$isKilled)-1,
 label="GBM")

prot <- ifelse(test$operator == 'T13', 1, 0)
privileged <- '1'

fc= fairness_check(explainer,
  protected = prot,
   privileged = privileged)
plot(fc)


And my data is the following:

str(test)
'data.frame': 146 obs. of  15 variables:
 $ DepthTree: num  1 1 1 1 1 1 1 1 1 1 ...
 $ NumSubclass  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ McCabe   : num  1 3 3 3 3 3 3 1 1 2 ...
 $ LOC  : num  3 10 10 10 10 10 10 4 4 5 ...
 $ DepthNested  : num  1 2 2 2 2 2 2 1 1 2 ...
 $ CA   : num  1 1 1 1 1 1 1 1 1 1 ...
 $ CE   : num  2 2 2 2 2 2 2 2 2 2 ...
 $ Instability  : num  0.667 0.667 0.667 0.667 0.667 0.667 0.667 0.667 0.667 
0.667 ...
 $ numCovered   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ operator : Factor w/ 16 levels "T0","T1","T2",..: 4 4 7 8 8 8 11 4 10 7 
...
 $ methodReturn : Factor w/ 22 levels "I","V","Z","method",..: 2 2 2 2 2 2 2 4 
4 2 ...
 $ numTestsCover: num  16 15 15 16 15 15 15 4 4 16 ...
 $ mutantAssert : num  55 55 55 55 55 55 55 13 13 55 ...
 $ classAssert  : num  3 3 3 3 3 3 3 3 3 3 ...
 $ isKilled : Factor w/ 2 levels "yes","no": 2 2 2 2 2 2 2 2 2 2 ...
> dput(test)
structure(list(DepthTree = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 2, 2, 1, 2, 1, 1, 1), NumSubclass = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2), McCabe = c(1, 3, 3,
3, 3, 3, 3, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 1, 1, 1, 2,
2, 2, 1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 2, 2, 2,
2, 5, 5, 5, 5, 5, 5, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 2, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1,
2, 2, 1,

Re: [R] Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed

2022-01-26 Thread PIKAL Petr
Hi

It seems that you are hiding what you really do. 

This
> options(error = NULL)
works fine without any error. So please, if you want a reasonable answer, post
your question with the data and code which are causing the error.

My wild guess is that you have some objects in your environment and you do not
know that they are used in your commands. Try to start a fresh R session and
inspect your environment with

ls()

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of javed khan
> Sent: Wednesday, January 26, 2022 3:05 PM
> To: Ivan Krylov 
> Cc: R-help 
> Subject: Re: [R] Error in if (fraction <= 1) { : missing value where
TRUE/FALSE
> needed
> 
> Ivan, thanks
> 
> When I use options(error = NULL)
> 
> it says: Error during wrapup: missing value where TRUE/FALSE needed
> Error: no more error handlers available (recursive errors?); invoking
'abort'
> restart
> 
> With traceback(), I get
> 
> 4: readable_number(max_value - min_value, FALSE)
> 3: get_nice_ticks(lower_bound, upper_bound)
> 
> On Wed, Jan 26, 2022 at 2:53 PM Ivan Krylov  wrote:
> 
> > On Wed, 26 Jan 2022 14:47:16 +0100
> > javed khan  wrote:
> >
> > > Error in if (fraction <= 1) { : missing value where TRUE/FALSE
> > > needed
> >
> > > var <- ifelse(test$operator == 'T14', 1, 0)
> >
> > The error must be in a place different from your test$operator
> > comparison. Have you tried traceback() to get the call stack leading
> > to the error? Or options(error = recover) to land in a debugger
> > session the moment an uncaught error happens? (Use options(error =
> > NULL) to go back to the default behaviour.)
> >
> > Unrelated: var <- test$operator == 'T14' will also give you an
> > equivalent logical vector with a bit less work.
> >
> > --
> > Best regards,
> > Ivan
> >
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed

2022-01-26 Thread PIKAL Petr
Hi

Do not post in HTML, please.
Try to show your real data - use str(test), or preferably dput(test). If test
is big, use only a fraction of it.
The problem is most probably in your data.

x <- sample(1:20, 100, replace=T)
fake <- paste("T", x, sep="")
ifelse(fake=="T14", 1,0) 
  [1] 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
 [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
 
head(fake)
[1] "T7"  "T9"  "T3"  "T9"  "T12" "T9" 
> str(fake)
 chr [1:100] "T7" "T9" "T3" "T9" "T12" "T9" "T19" "T19" "T12" "T2" "T17" ...
>
Cheers
Petr


> -Original Message-
> From: R-help  On Behalf Of javed khan
> Sent: Wednesday, January 26, 2022 2:47 PM
> To: R-help 
> Subject: [R] Error in if (fraction <= 1) { : missing value where
TRUE/FALSE
> needed
> 
> I get this error:
> 
> Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed
> 
> I used this:
> 
> var <- ifelse(test$operator == 'T14', 1, 0)
> 
> operator has several values like T1, T3, T7, T15, T31, T37
> 
> For some values like T3, T7 it works fine but for majority of values it
gives error.
> 
> When I use: is.na(ts$operator), it shows all false values so no NAs.
> 
> Where could be the problem?
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] function problem: multi selection in one argument

2022-01-25 Thread PIKAL Petr
Hallo

You should explain better what you want, as many people here do not use
tidyverse functions.

I am not sure what the function should do.
table(iris$Species)
gives the same result, and whatever you do in the tidyverse part it always ends
in table(...$Species).
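
If the intent really is to pass more than one column in the second argument, an
untested sketch using across() (so the embraced selection also works inside
arrange() and group_by(); assuming the tidyverse is loaded as in your code):

f2 <- function(indata, subgrp1){
  temp <- indata %>%
    select({{ subgrp1 }}) %>%
    arrange(across({{ subgrp1 }})) %>%
    group_by(across({{ subgrp1 }})) %>%
    mutate(numbering = row_number(), max = max(numbering))
  table(temp$Species)
}
f2(iris, c(Petal.Width, Species))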

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Kai Yang via R-help
> Sent: Tuesday, January 25, 2022 1:14 AM
> To: R-help Mailing List 
> Subject: [R] function problem: multi selection in one argument
>
> Hello Team,
> I can run the function below:
>
> library(tidyverse)
>
> f2 <- function(indata, subgrp1){
>   indata0 <- indata
>   temp<- indata0 %>% select({{subgrp1}}) %>% arrange({{subgrp1}}) %>%
> group_by({{subgrp1}}) %>%
> mutate(numbering =row_number(), max=max(numbering))
>   view(temp)
>   f_table <- table(temp$Species)
>   view(f_table)
>   return(f_table)
> }
> f2(iris, Species)
>
> You can see the second argument I use Species only, and it works fine. But 
> If I
> say, I want the 2nd argument = Petal.Width, Species , how should I write the
> argument? I did try f2(iris, c(Petal.Width, Species)), but I got error 
> message:
> Error: arrange() failed at implicit mutate() step.
> * Problem with `mutate()` column `..1`.
> i `..1 = c(Petal.Width, Species)`.
> i `..1` must be size 150 or 1, not 300.
>
> I'm not sure how to fix the problem either in function or can fix it when 
> using the
> function.
> Thank you,
> Kai
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (Off-Topic) Time for a companion mailing list for R packages?

2022-01-13 Thread PIKAL Petr
Hallo all

I do not consider answers here unresponsive or unfriendly. Most answers point to
a way to proceed and solve the problem. Although RTFM is sometimes the best
anybody can do (and I did it myself around 1997 when I started with R), hardly
anybody here is flamed for asking simple questions.

Creating some other list where questions about the broad range of available
packages would be answered is, IMHO, the wrong way forward. A list needs a
critical mass of people answering for it to be worth attending.

You state
>it can be very hard for novice users to get the help they need.<
but can we prove it? How many questions go unanswered in a year? How many
threads have "this is off topic" as the only response? How many have "contact
the maintainer" as the only response? And most importantly, are these numbers
increasing?

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Kevin Thorpe
> Sent: Thursday, January 13, 2022 1:45 PM
> To: Jeff Newmiller 
> Cc: R Help Mailing List 
> Subject: Re: [R] (Off-Topic) Time for a companion mailing list for R packages?
> 
> This is an interesting issue and something I have been thinking about raising 
> with
> my fellow volunteer moderators.
> 
> I honestly don’t know what the best solution is. Personally, I would loathe
> having to check multiple web-forums/mailing lists to find an answer. New users
> often do not appreciate the subtleties (i.e. RStudio is not R) and will 
> continue to
> post here. The frequent reply to questions outside base R that inform them 
> they
> are off-topic could come across as unfriendly. That could have the side 
> effect of
> making the community appear elitist. Folks are also often referred to package
> maintainers but not all maintainers are equally responsive to queries about 
> their
> packages. In summary, it can be very hard for novice users to get the help 
> they
> need.
> 
> I appreciate the desire of many to keep the focus of this list narrow, yet 
> despite
> the narrow mandate there are many readers who can answer non-base R
> questions, which is probably one of the reasons we see the questions. I wonder
> if there would be an appetite to create a new list, R-package-help, that has a
> broad mandate (as suggested by Avi). Naturally there is no guarantee that
> specific questions about some esoteric package will be answered, but that’s a
> different problem. On the other hand, why not expand the mandate of R-help
> rather than going to the trouble of creating a new list? Like I said, I don’t 
> know.
> 
> Thanks for raising the issue.
> 
> Kevin
> 
> 
> > On Jan 12, 2022, at 11:24 PM, Jeff Newmiller 
> wrote:
> >
> > TL;DR The people responsible for tidyverse don't think much of mailing 
> > lists.
> >
> > IANAMLA (I am not mailing list admin) and I know some people get kind of
> heated about these things, but my take is that this list _is_ about R so to 
> be on
> topic the question needs to be about R and how to get things done in R. Since
> contributed packages are almost by definition creating capabilities linked 
> with
> specific problem domains or domain-specific-languages (DSLs), and there are
> thousands of these, it isn't practical to support questions framed within 
> those
> DSLs here. It seems perfectly legitimate IMHO to mention such packages here, 
> as
> long as the question does not hinge on that package, and even to offer small
> solutions to posed R problems using such packages. Others may disagree with
> my perspective on this. Unfortunately all of this this subtlety is usually 
> lost upon
> newbies, much to the detriment of this list's reputation.
> >
> > The responsibility to setup and manage support for contributed packages
> belongs to the package maintainer. In the case of tidyverse, the general 
> opinion
> of those people seems to be that web forums avoid the "only unformatted info
> can be shared" nature of traditional mailing lists, so mailing lists have 
> AFAIK not
> been built or tended.
> >
> > Unfortunately, they also try to "allow all topics" as much as possible in 
> > those
> forums to minimize the appearance of unfriendliness to beginners, but my
> impression is that this leads to such a wide range of topics that many posts 
> don't
> get answered. I have certainly found it to be just too much quantity to sift
> through, and I really am selective about which portions of the tidyverse I 
> work
> with anyway, so I don't hang out there much at all.
> >
> > On January 12, 2022 7:27:20 PM PST, Avi Gross via R-help  project.org> wrote:
> >> Respectfully, this forum gets lots of questions that include non-base R
> components and especially packages in the tidyverse. Like it or not, the
> extended R language is far more useful and interesting for many people and
> especially those who do not wish to constantly reinvent the wheel.
> >> And repeatedly, we get people reminding (and sometimes chiding) others for
> daring to 

Re: [R] ggtree node labels

2022-01-09 Thread PIKAL Petr
Hi

If you ask Google for "ggtree node label" you will get many hits which tell you
how to label nodes.

E.g.
https://bioc.ism.ac.jp/packages/3.3/bioc/vignettes/ggtree/inst/doc/treeAnnot
ation.html

And you will get it much more quickly than by asking here.
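
For a quick start, something along these lines (an untested sketch; geom_text2()
and geom_nodelab() come from the ggtree package, and "tree" is a hypothetical
tree object):

library(ggtree)
ggtree(tree) +
  geom_text2(aes(subset = !isTip, label = node), hjust = -0.3)  # node numbers
# or, if the tree object already carries node labels:
ggtree(tree) + geom_nodelab()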

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of April Ettington
> Sent: Monday, January 10, 2022 1:27 AM
> To: r-help@r-project.org
> Subject: [R] ggtree node labels
> 
> Hello,
> 
> Is there a way to add nodelabels to a ggtree plot in R?  Thanks in advance
> :)
> 
> April
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] unexpected (?) behavior of box()

2022-01-07 Thread PIKAL Petr
Hi Ani

No need to apologise. I may be wrong, as I did not dig into the map code and do
not use the package, so I am only guessing. You could check how mar changes by

> par("mar")
[1] 5.1 4.1 4.1 2.1

So
par("mar")
par(mar=c(2, 6, 5, 4))
par("mar")
m<-map('world', xlim = c(91, 142), ylim = c(25, 40), lwd=1.5, col = 
"grey",border=NA, fill = T,  bg="white")
par("mar")
box()

should give you info on what mar looks like before each step and which one is
to blame.

Jim gave you also some possible workaround.

Cheers
Petr

> -Original Message-
> From: ani jaya 
> Sent: Friday, January 7, 2022 9:12 AM
> To: PIKAL Petr ; r-help 
> Subject: Re: [R] unexpected (?) behavior of box()
>
> Hi Petr,
>
> Thank you for pointing that out! Silly newbie here.
> So, just want to make sure my mind,
> using my example:
>
> par(mar=c(2, 6, 5, 4))
> m<-map('world', xlim = c(91, 142), ylim = c(25, 40),
>lwd=1.5, col = "grey",border=NA, fill = T,  bg="white")
> box()
>
> first, map use the first mar=c(2,6,5,4), and then defines the new mar that 
> is
> mar=c(4.1, 4.1, par("mar")[3], 0.1)=c(4.1, 4.1, 5, 0.1).
> And then box using the new mar=c(4.1, 4.1, 5, 0.1). Is that right?
>
> I am sorry if out of topic. Maybe further I will post at r-sig-geo. Thank 
> you.
>
> On Fri, Jan 7, 2022 at 4:39 PM PIKAL Petr  wrote:
> >
> > Hi.
> >
> > Why do you consider it unexpected?
> >
> > see
> >
> > map(database = "world", regions = ".", exact = FALSE, boundary = TRUE,
> >   interior = TRUE, projection = "", parameters = NULL, orientation = NULL,
> >   fill = FALSE, col = 1, plot = TRUE, add = FALSE, namesonly = FALSE,
> >   xlim = NULL, ylim = NULL, wrap = FALSE, resolution = if (plot) 1 else 0,
> >   type = "l", bg = par("bg"), mar = c(4.1, 4.1, par("mar")[3], 0.1),
> >   myborder = 0.01, namefield="name", lforce="n", ...)
> >
> > map function redefines mar so your first par is probably changed
> > during plotting map and after you define it again box use new mar values.
> >
> > Cheers
> > Petr
> >
> > > -Original Message-
> > > From: R-help  On Behalf Of ani jaya
> > > Sent: Friday, January 7, 2022 8:25 AM
> > > To: r-help 
> > > Subject: [R] unexpected (?) behavior of box()
> > >
> > > Dear R expert,
> > >
> > > I try to box a figure using box(). However it box the default
> > > margin, not the specified margin.
> > >
> > > #working as expected
> > > barplot(1:20)
> > > box()
> > >
> > > #working as expected, the box follow the margin par(mar=c(2, 6, 5,
> > > 4))
> > > barplot(1:20)
> > > box()
> > >
> > > #not working
> > > install.packages("maps")
> > > library(maps)
> > > par(mar=c(2, 6, 5, 4))
> > > m<-map('world', xlim = c(91, 142), ylim = c(25, 40),
> > >lwd=1.5, col = "grey",border=NA, fill = T,  bg="white")
> > > box()
> > >
> > > #the turnaround
> > > par(mar=c(2, 6, 5, 4))
> > > m<-map('world', xlim = c(91, 142), ylim = c(25, 40),
> > >lwd=1.5, col = "grey",border=NA, fill = T,  bg="white")
> > > par(mar=c(2, 6, 5, 4))
> > > box()
> > >
> > > I just curious with this behavior. Is it the problem with the
> > > package "map" or box() function?
> > > Thank you.
> > >
> > > > sessionInfo()
> > > R version 4.0.2 (2020-06-22)
> > > Platform: i386-w64-mingw32/i386 (32-bit) Running under: Windows 10
> > > x64 (build 19043)
> > >
> > >
> > >
> > > Ani
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] unexpected (?) behavior of box()

2022-01-06 Thread PIKAL Petr
Hi.

Why do you consider it unexpected?

see

map(database = "world", regions = ".", exact = FALSE, boundary = TRUE,
  interior = TRUE, projection = "", parameters = NULL, orientation = NULL,
  fill = FALSE, col = 1, plot = TRUE, add = FALSE, namesonly = FALSE,
  xlim = NULL, ylim = NULL, wrap = FALSE, resolution = if (plot) 1 else 0,
  type = "l", bg = par("bg"), mar = c(4.1, 4.1, par("mar")[3], 0.1),
  myborder = 0.01, namefield="name", lforce="n", ...)

The map function redefines mar, so your first par setting is probably changed
while the map is plotted; after you set it again, box() uses the new mar values.
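
Since map() exposes mar as an argument, an untested sketch of a workaround is to
hand the desired margins to map() itself, so that box() afterwards sees the same
setting:

library(maps)
par(mar = c(2, 6, 5, 4))
map('world', xlim = c(91, 142), ylim = c(25, 40), lwd = 1.5,
    col = "grey", border = NA, fill = TRUE, bg = "white",
    mar = c(2, 6, 5, 4))   # keep the same margins inside map()
box()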

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of ani jaya
> Sent: Friday, January 7, 2022 8:25 AM
> To: r-help 
> Subject: [R] unexpected (?) behavior of box()
> 
> Dear R expert,
> 
> I try to box a figure using box(). However it box the default margin,
> not the specified margin.
> 
> #working as expected
> barplot(1:20)
> box()
> 
> #working as expected, the box follow the margin
> par(mar=c(2, 6, 5, 4))
> barplot(1:20)
> box()
> 
> #not working
> install.packages("maps")
> library(maps)
> par(mar=c(2, 6, 5, 4))
> m<-map('world', xlim = c(91, 142), ylim = c(25, 40),
>lwd=1.5, col = "grey",border=NA, fill = T,  bg="white")
> box()
> 
> #the turnaround
> par(mar=c(2, 6, 5, 4))
> m<-map('world', xlim = c(91, 142), ylim = c(25, 40),
>lwd=1.5, col = "grey",border=NA, fill = T,  bg="white")
> par(mar=c(2, 6, 5, 4))
> box()
> 
> I just curious with this behavior. Is it the problem with the package
> "map" or box() function?
> Thank you.
> 
> > sessionInfo()
> R version 4.0.2 (2020-06-22)
> Platform: i386-w64-mingw32/i386 (32-bit)
> Running under: Windows 10 x64 (build 19043)
> 
> 
> 
> Ani
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] splitting data matrix into submatrices

2022-01-05 Thread PIKAL Petr
Hi

As Jeff said, for data frames you can use split.
If you insist on working with matrices, the principle is similar but you cannot
use split.

mat <- matrix(1:20, 5,4)
mat
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    2    7   12   17
[3,]    3    8   13   18
[4,]    4    9   14   19
[5,]    5   10   15   20

# factor for splitting
fac <- factor(c(1,2,1,2,1))
> fac
[1] 1 2 1 2 1
Levels: 1 2

#splitting matrix according to the factor levels
> mat[which(fac==1), ]
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    3    8   13   18
[3,]    5   10   15   20
> mat[which(fac==2), ]
     [,1] [,2] [,3] [,4]
[1,]    2    7   12   17
[2,]    4    9   14   19
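
If the 1826 rows are consecutive calendar days, the grouping can be derived from
real dates instead of a fixed repeating pattern (a sketch; the start date is an
assumption and must be adjusted to the actual data):

dates <- seq(as.Date("2017-01-01"), by = "day", length.out = nrow(mat))
wk <- ifelse(format(dates, "%u") %in% c("6", "7"), "weekend", "workday")
working <- mat[wk == "workday", ]
nonworking <- mat[wk == "weekend", ]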

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Jeff Newmiller
> Sent: Wednesday, January 5, 2022 8:57 AM
> To: Faheem Jan ; R-help 
> Subject: Re: [R] splitting data matrix into submatrices
>
> Please reply all so the mailing list is included in the discussion. I don't 
> do 1:1
> tutoring and others can chime in if I make a mistake.
>
> I would say you don't understand what my example did, since it doesn't care
> how many columns are in your data frame. If you are in fact working with a
> matrix, then convert it to a data frame.
>
> As for continuing to help you... you need to provide a minimal reproducible
> example with a small sample data set if what I showed you isn't helping.
>
> [1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-
> reproducible-example
>
> [2] http://adv-r.had.co.nz/Reproducibility.html
>
> On January 4, 2022 10:44:39 PM PST, Faheem Jan
>  wrote:
> >I understand what you have done, it can easily apply in the case of vector
> information mean each day having a single observation. But in my case I have
> 24 observations each day, I want to convert the matrix into two submatrices
> for weekdays and the other for weekends. So please suggest to me anyway
> that how I do this?
> >
> >On Wednesday, January 5, 2022, 11:10:34 AM GMT+5, Jeff Newmiller
>  wrote:
> >
> > A lot of new R users fail to grasp what makes data frames more useful than
> matrices, or use data frames without even realizing they are not using
> matrices.
> >
> >This is important because there are more tools for manipulating data frames
> than matrices. One tool is the split function... if you have a vector of 
> values
> identifying how each row should be identified you can give that to the split
> function with your data frame and it will return a list of data frames (2 in 
> this
> case).
> >
> >v <- rep( 0:6, length=1826 )
> >wkv <- ifelse( v < 5, "Weekday", "Weekend" ) ans <- split( DF, wkv )
> >ans$Weekday ans$Weekend
> >
> >Note that this is a fragile technique for generating wkv though... usually
> there will be a column of dates that can be used to generate wkv more
> consistently if your data changes.
> >
> >Please read the Posting Guide... using formatted email can cause readers to
> not see what you sent. Use plain text email format... it is a setting in 
> your
> email client.
> >
> >On January 4, 2022 7:52:34 PM PST, Faheem Jan via R-help  project.org> wrote:
> >>I have data in a matrix form of order 1826*24 where 1826 represents the
> days and 24 hourly observations on each data. My objective is to split the
> matrix into working (Monday to Friday) and non-working (Saturday and
> Sunday) submatrices. Can anyone help me that how I will do that splitting
> using R?
> >>
> >>
> >>[[alternative HTML version deleted]]
> >>
> >>__
> >>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide
> >>http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Sent from my phone. Please excuse my brevity.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to run r biotools boxM terst on multiple groups?

2022-01-04 Thread PIKAL Petr
Hi.

Not sure if statistically correct but what about

iris$int<- interaction(iris$bin, iris$Species)
boxM(iris[,1:4], iris[,7])
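
Spelled out on the iris example from the question (a sketch only; drop = TRUE
guards against empty bin x Species combinations, and iris$int is the column 7
used above):

library(biotools)
data(iris)
iris$bin <- factor(findInterval(iris$Petal.Width, c(1, 2)))    # column 6
iris$int <- interaction(iris$bin, iris$Species, drop = TRUE)   # column 7
boxM(iris[, 1:4], iris$int)    # one test across all bin x Species groups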

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Luigi Marongiu
> Sent: Tuesday, January 4, 2022 11:56 AM
> To: r-help 
> Subject: [R] how to run r biotools boxM terst on multiple groups?
> 
> I have a data frame containing a half dozen continuous measurements and
> over a dozen ordinal variables (such as, death, fever, symptoms etc).
> I would like to run a box matrix test and I am using biotools' boxM, but
it
> allows to run only one ordinal group at the time. For instance:
> ```
> >data(iris)
> >boxM(iris[,1:4], iris[,5])
> 
> Box's M-test for Homogeneity of Covariance Matrices
> 
> data:  iris[, 1:4]
> Chi-Sq (approx.) = 140.94, df = 20, p-value < 2.2e-16
> 
> >bins <- c(1,2); iris$bin <- findInterval(iris$Petal.Width, bins)
> >iris$bin = factor(iris$bin)
> >boxM(iris[,1:4], iris[,6])
> 
> Box's M-test for Homogeneity of Covariance Matrices
> 
> data:  iris[, 1:4]
> Chi-Sq (approx.) = 140.94, df = 20, p-value < 2.2e-16
> 
> >boxM(iris[,1:4], iris[,5:6])
> Error in boxM(iris[, 1:4], iris[, 5:6]) : incompatible dimensions!
> ```
> Is there a way to check for equality of variance-covariance on multiple
groups
> simultaneously?
> Thanks
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiple Data Import Excel

2022-01-04 Thread PIKAL Petr
Hi

Try the readxl package, which has the possibility to limit the range of cells
read in and to skip leading rows.

https://cran.r-project.org/web/packages/readxl/readxl.pdf
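
A rough sketch of how it could look for the files described below (untested;
the folder name and the assumption that cell A1 holds the "IceTag ID: ..."
string are mine):

library(readxl)

files <- list.files("data", pattern = "\\.xlsx$", full.names = TRUE)

read_one <- function(f) {
  # cell A1 is assumed to contain e.g. "IceTag ID: 61409782"; keep the number
  id_cell <- read_excel(f, range = "A1", col_names = FALSE, col_types = "text")[[1]]
  dat <- read_excel(f, skip = 7)      # rows 1-7 dropped, row 8 becomes the header
  dat$IceTagID <- sub("^IceTag ID:\\s*", "", id_cell)
  dat
}

all_data <- do.call(rbind, lapply(files, read_one))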

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Bradley Heins via
> R-help
> Sent: Tuesday, January 4, 2022 4:20 AM
> To: r-help@r-project.org
> Subject: [R] Multiple Data Import Excel
> 
> Hello,
> I have 2100 Excel files (.xlsx) that I need to read and combine into 1
file.
> I am perplexed because the first 6 lines are header information and the
8th
> line are the columns that are needed with the data in columns.
> 
> I need to save the first last (IceTag ID) because that number becomes the
ID
> for all of the data in each specific Excel file.  The ID can be different
for
> spreadsheets.
> 
> Line 2 to 7 are not needed.
> Line 8 are the column headers.
> 
> The columns are Date, Time, Motion Standing (in time format), Lying (In
time
> format), Steps and bouts.  See example below.
> 
> Any help in reading in multiple files and discarding some lines would be
> appreciated.
> Regards,
> Brad
> 
> 
> 
> IceTag ID: 61409782
> Site ID: n/a
> Animal ID: n/a
> First Record: 05/18/2021 14:04:27
> Last Record: 05/25/2021 14:00:51
> File Time Zone: Central Standard Time
> 
> Date        Time      Motion  Standing  Lying    Steps  Bouts
> 05/18/2021  14:04:27  65      0:10:29   0:00:04  20     1
> 05/18/2021  14:15:00  69      0:08:52   0:06:08  15     1
> 
> 
> --
> Bradley J. Heins
> Extension Specialist, Dairy Management | Extension | extension.umn.edu
> Associate Professor, Dairy Management | West Central ROC |
> wcroc.cfans.umn.edu University of Minnesota | umn.edu
> 46352 State Hwy 329, Morris, MN 56267
> hein0...@umn.edu | 320-589-1711, Ext. 2118
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mixture univariate distributions fit

2022-01-03 Thread PIKAL Petr
Hallo Bert

The discussion starts to be more off topic here, as you already pointed out. There 
is probably no package (function) in R designed for easy fitting of overlapping peaks 
(distributions). With the original data one could use mixtools; with density or 
cumulative density values Ivan's suggestion seems to work reasonably well.

To your questions
1. No, the peak locations are not known. If I decided to code the function (package) 
myself I would start with a plot and let the user select the possible locations with 
locator() (a small sketch follows point 3 below).

2. No, but one should restrict the number of components to some reasonable 
value.

3. In particle size measurement it is usually a lognormal or normal 
distribution, for which the way suggested by Ivan is a workable solution. 
However, in the other case I have in mind the function could be more variable 
(Fraser-Suzuki, Cauchy, Pseudo-Voigt, ...) and I would need to program the 
curves myself. A possible way is to make a plot with the starting values and let 
the user change them until the fit is reasonably close to the measured values.
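
Regarding point 1, a minimal sketch of the locator idea (using x and ymix as
in the example further below):

plot(x, ymix, type = "l")
pk <- locator(n = 3)      # click near the three suspected peak maxima
start_mu <- pk$x          # the clicked x positions become starting means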

So unless somebody can point me to an R package for such peak shape mixture 
evaluation I do not consider further discussion necessary. I first need to do 
my homework if I decide to code such a function myself.

Thank you again and best regards.
Petr

> -Original Message-
> From: Bert Gunter 
> Sent: Friday, December 31, 2021 6:57 PM
> To: PIKAL Petr 
> Cc: Ivan Krylov ; r-help mailing list  project.org>
> Subject: Re: [R] mixture univariate distributions fit
>
> Petr:
> Please feel free to ignore and not reply if you think the following 
> questions
> are unhelpful.
>
> 1. Do you want to know the location of peaks (local modes) or the
> parameters of the/a mixture distribution? Peaks do not have to be located at
> the modes of the individual components of the mixture.
>
> 2. Do you know the number of components in the mixture? This would
> simplify the problem (a lot, I believe; though those more knowledgeable
> should comment on that).
>
> 3. Do you know that the points on the fitted density you get are obtained as
> a mixture of normals? Or  at least of symmetric distributions? ... or 
> whether
> they are obtained by some sort of
> (algorithmic) density estimation procedure?
>
> Best and New Year's greeting to all,
> Bert
>
>
>
> On Fri, Dec 31, 2021 at 1:49 AM PIKAL Petr  wrote:
> >
> > Hallo Ivan
> >
> > Thanks. Yes, this approach seems to be viable. I did not consider
> > using dnorm in fitting procedure. But as you pointed
> >
> > > (Some nonlinear least squares problems will be much harder to solve
> > > though.)
> >
> > This simple example is quite easy. The more messy are data and the
> > more distributions are mixed in them the more problematic could be the
> > correct starting values selection. Errors could be quite common.
> >
> > x <- (0:200)/100
> > y1 <- dnorm(x, mean=.3, sd=.1)
> > y2 <- dnorm(x, mean=.7, sd=.2)
> > y3 <- dnorm(x, mean=.5, sd=.1)
> >
> > ymix <- ((y1+2*y2+y3)/max(y1+2*y2+y3))+rnorm(201, sd=.001) plot(x,
> > ymix)
> >
> > With just sd1 and sd2 slightly higher, the fit results to error.
> > > fit <- minpack.lm::nlsLM(
> > +  ymix ~ a1 * dnorm(x, mu1, sd1) + a2 * dnorm(x, mu2, sd2)+
> > +  a3 * dnorm(x, mu3, sd3),
> > +  start = c(a1 = 1, mu1 = .3, sd1=.3, a2 = 2, mu2 = .7, sd2 =.3,
> > +  a3 = 1, mu3 = .5, sd3 = .1),
> > +  lower = rep(0, 9) # help minpack avoid NaNs
> > + )
> > Error in nlsModel(formula, mf, start, wts) :
> >   singular gradient matrix at initial parameter estimates
> >
> > If sd1 and sd2 are set to lower value, the function is no longer
> > singular and arrives with result.
> >
> > Well, it seems that the  only way how to procced is to code such
> > function by myself and take care of suitable starting values.
> >
> > Best regards.
> > Petr
> >
> > > -Original Message-
> > > From: Ivan Krylov 
> > > Sent: Friday, December 31, 2021 9:26 AM
> > > To: PIKAL Petr 
> > > Cc: r-help mailing list 
> > > Subject: Re: [R] mixture univariate distributions fit
> > >
> > > On Fri, 31 Dec 2021 07:59:11 +
> > > PIKAL Petr  wrote:
> > >
> > > > x <- (0:100)/100
> > > > y1 <- dnorm((x, mean=.3, sd=.1)
> > > > y2 <- dnorm((x, mean=.7, sd=.1)
> > > > ymix <- ((y1+2*y2)/max(y1+2*y2))
> > >
> > > > My question is if there is some package or function which could
> > > > get those values ***directly from x and ymix values***, which is
> > > > basically what is measured in my case.

Re: [R] mixture univariate distributions fit

2021-12-31 Thread PIKAL Petr
Hallo Ivan

Thanks. Yes, this approach seems to be viable. I did not consider using
dnorm in fitting procedure. But as you pointed

> (Some nonlinear least squares problems will be much harder to solve
> though.)

This simple example is quite easy. The messier the data and the more
distributions are mixed in them, the more problematic the selection of correct
starting values can be. Errors could be quite common.

x <- (0:200)/100
y1 <- dnorm(x, mean=.3, sd=.1)
y2 <- dnorm(x, mean=.7, sd=.2)
y3 <- dnorm(x, mean=.5, sd=.1)

ymix <- ((y1+2*y2+y3)/max(y1+2*y2+y3))+rnorm(201, sd=.001)
plot(x, ymix)

With sd1 and sd2 just slightly higher, the fit results in an error.
> fit <- minpack.lm::nlsLM(
+  ymix ~ a1 * dnorm(x, mu1, sd1) + a2 * dnorm(x, mu2, sd2)+
+  a3 * dnorm(x, mu3, sd3),
+  start = c(a1 = 1, mu1 = .3, sd1=.3, a2 = 2, mu2 = .7, sd2 =.3,
+  a3 = 1, mu3 = .5, sd3 = .1),
+  lower = rep(0, 9) # help minpack avoid NaNs
+ )
Error in nlsModel(formula, mf, start, wts) : 
  singular gradient matrix at initial parameter estimates

If sd1 and sd2 are set to a lower value, the gradient is no longer singular
and the fit arrives at a result.

Well, it seems that the only way to proceed is to code such a function by
myself and take care of suitable starting values.

Best regards.
Petr

> -Original Message-
> From: Ivan Krylov 
> Sent: Friday, December 31, 2021 9:26 AM
> To: PIKAL Petr 
> Cc: r-help mailing list 
> Subject: Re: [R] mixture univariate distributions fit
> 
> On Fri, 31 Dec 2021 07:59:11 +
> PIKAL Petr  wrote:
> 
> > x <- (0:100)/100
> > y1 <- dnorm((x, mean=.3, sd=.1)
> > y2 <- dnorm((x, mean=.7, sd=.1)
> > ymix <- ((y1+2*y2)/max(y1+2*y2))
> 
> > My question is if there is some package or function which could get
> > those values ***directly from x and ymix values***, which is
> > basically what is measured in my case.
> 
> Apologies if I'm missing something, but, this being a peak fitting
> problem, shouldn't nls() (or something from the minpack.lm or nlsr
> packages) work for you here?
> 
> minpack.lm::nlsLM(
>  ymix ~ a1 * dnorm(x, mu1, sigma1) + a2 * dnorm(x, mu2, sigma2),
>  start = c(a1 = 1, mu1 = 0, sigma1 = 1, a2 = 1, mu2 = 1, sigma2 = 1),
>  lower = rep(0, 6) # help minpack avoid NaNs
> )
> # Nonlinear regression model
> #  model: ymix ~ a1 * dnorm(x, mu1, sigma1) + a2 * dnorm(x, mu2, sigma2)
> #  data: parent.frame()
> #  a1mu1 sigma1 a2mu2 sigma2
> #  0.1253 0.3000 0.1000 0.2506 0.7000 0.1000
> # residual sum-of-squares: 1.289e-31
> #
> # Number of iterations to convergence: 23
> # Achieved convergence tolerance: 1.49e-08
> 
> (Some nonlinear least squares problems will be much harder to solve
> though.)
> 
> --
> Best regards,
> Ivan
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mixture univariate distributions fit

2021-12-30 Thread PIKAL Petr
Hallo Bert

Sorry for the confusion, I may not have used correct wording in describing 
what I wanted to do. Consider this example

# 2 different distribution densities mixed together in 1:2 proportion
x <- (0:100)/100
y1 <- dnorm((x, mean=.3, sd=.1)
y2 <- dnorm((x, mean=.7, sd=.1)
ymix <- ((y1+2*y2)/max(y1+2*y2))
plot((x, ymix)

# sampling  x based on mixed distribution density
dist <- sample((x, size=1, replace=TRUE, prob=ymix)
library(mixtools)
fit <- normalmixEM(dist)
summary(fit)

summary of normalmixEM object:
           comp 1    comp 2
lambda   0.334926 0.6650742
mu       0.304862 0.7036926
sigma    0.098673 0.0994382
loglik at estimate:  30935.34

I can get estimates of mu and sigma together with the proportion (lambda) of each 
distribution in the mixture. My question is whether there is some package or function 
which could get those values ***directly from the x and ymix values***, which is 
basically what is measured in my case.

I know I could code some optimisation procedure and adapt what was done in 
Python (https://chrisostrouchov.com/post/peak_fit_xrd_python/) for peak fitting, 
or this one 
https://stats.stackexchange.com/questions/92748/multi-peak-gaussian-fit-in-r 
but I do not like to reinvent the wheel.

I went through the Chemometrics and Computational Physics task view, which I 
consider the most likely place for such functions, but what is there is usually 
quite specific, especially in the required input data format.

Anyway, thanks for your effort.

I wish you best for new year 2022.
Petr


> -Original Message-
> From: Bert Gunter 
> Sent: Thursday, December 30, 2021 5:10 PM
> To: PIKAL Petr 
> Cc: r-help mailing list 
> Subject: Re: [R] mixture univariate distributions fit
>
> Petr:
>
> 1. I now am somewhat confused by your goals. Any curve fitting procedure
> can fit curves to data, including discrete points on a smooth curve. What
> cannot be done is to recover the exact parameters of the smooth curve from
> which your data derive, at least not without knowing how the curve was
> fitted to that underlying (particle size) data, i.e. the nature of the
> parameterization (if even there was one
> -- it may just be some sort of empirical smoother like kernel density
> estimation or some such).
>
> 2. mixtools uses EM, so the similarity you noted is probably not surprising.
>
> 3. Given my evident confusion, I hope that this email may prompt someone
> more knowledgeable than I to correct or clarify my "advice."
> You may wish to further clarify what you wish to do with the parameters that
> you hope to get to encourage such comments. For example, do you wish to
> compare the densities you get via their parameterizations? If so, others may
> be able to offer better strategies to do this, perhaps on SO --
> https://stats.stackexchange.com/  -- rather than here.
>
> Cheers,
> Bert Gunter
>
> On Thu, Dec 30, 2021 at 12:36 AM PIKAL Petr 
> wrote:
> >
> > Thank you Bert
> >
> > The values are results from particle size measurement by sedimentation
> > and they are really available only as these cumulative or density
> distributions.
> > What I thought about was that there is some package which could fit
> > data of such curves and deliver parameters of fitted curves.
> >
> > Something like
> > https://chrisostrouchov.com/post/peak_fit_xrd_python/
> >
> > I found package EMpeaksR which results close to values estimated from
> > mixtools package.
> >
> > test <- spect_em_gmm(temp1$velik, temp1$proc, mu=c(170, 220),
> > mix_ratio=c(1,1), sigma=c(5,5), maxit=2000, conv.cri=1e-8)
> > print(cbind(test$mu, test$sigma, test$mix_ratio))
> >  [,1]  [,2]  [,3]
> > [1,] 170.7744  7.200109 0.5759867
> > [2,] 229.1815 10.831626 0.4240133
> >
> > But it is probably in stage of intensive development as it is limited
> > in data visualisation
> >
> > Any further hint is appreciated.
> >
> > Regards
> > Petr
> >
> > > -Original Message-
> > > From: Bert Gunter 
> > > Sent: Wednesday, December 29, 2021 5:01 PM
> > > To: PIKAL Petr 
> > > Cc: r-help mailing list 
> > > Subject: Re: [R] mixture univariate distributions fit
> > >
> > > No.
> > >
> > > However, if the object returned is the "Value" structure of whatever
> > > density function you use, it probably contains the original data.
> > > You need to check the docs to see. But this does not appear to be your
> situation.
> > >
> > > Bert Gunter
> > >
> > > "The trouble with having an open mind is that people keep coming
> > > along and sticking things into it."
> > > -- 

Re: [R] mixture univariate distributions fit

2021-12-30 Thread PIKAL Petr
Thank you Bert

The values are results from particle size measurement by sedimentation and 
they are really available only as these cumulative or density distributions. 
What I thought about was that there is some package which could fit data of 
such curves and deliver parameters of fitted curves.

Something like
https://chrisostrouchov.com/post/peak_fit_xrd_python/

I found package EMpeaksR which results close to values estimated from mixtools 
package.

test <- spect_em_gmm(temp1$velik, temp1$proc, mu=c(170, 220), 
mix_ratio=c(1,1), sigma=c(5,5), maxit=2000, conv.cri=1e-8)
print(cbind(test$mu, test$sigma, test$mix_ratio))
         [,1]      [,2]      [,3]
[1,] 170.7744  7.200109 0.5759867
[2,] 229.1815 10.831626 0.4240133

But it is probably at a stage of intensive development, as it is limited in data 
visualisation.

Any further hint is appreciated.

Regards
Petr

> -Original Message-
> From: Bert Gunter 
> Sent: Wednesday, December 29, 2021 5:01 PM
> To: PIKAL Petr 
> Cc: r-help mailing list 
> Subject: Re: [R] mixture univariate distributions fit
>
> No.
>
> However, if the object returned is the "Value" structure of whatever density
> function you use, it probably contains the original data. You need to check
> the docs to see. But this does not appear to be your situation.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Wed, Dec 29, 2021 at 3:05 AM PIKAL Petr 
> wrote:
> >
> > Dear all
> >
> > I have data which are either density distribution estimate or
> > cummulative density distribution estimate (temp1, temp2 below). I
> > would like to get values (mu, sd) for underlaying original data but they 
> > are
> not available.
> >
> > I found mixtools package which calculate what I need but it requires
> > original data (AFAIK). They could be generated from e.g. temp1 by
> >
> > set.seed(111)
> > x<- sample(temp1$velik, size=10, replace=TRUE, prob=temp1$proc)
> >
> > library(mixtools)
> > fit <- normalmixEM(x)
> > plot(fit, which=2)
> > summary(fit)
> > summary of normalmixEM object:
> >comp 1 comp 2
> > lambda   0.576346   0.423654
> > mu 170.784520 229.192823
> > sigma7.203491  10.793461
> > loglik at estimate:  -424062.7
> > >
> >
> > Is there any way how to get such values directly from density or
> > cummulative density estimation without generating fake data by sample?
> >
> > Best regards
> > Petr
> >
> >
> > temp1 <- structure(list(velik = c(155, 156.8, 157.9, 158.8, 159.6,
> > 160.4, 161.2, 161.9, 162.5, 163.1, 163.8, 164.3, 164.7, 165.3, 165.8,
> > 166.2, 166.7, 167.2, 167.7, 168.2, 168.7, 169.1, 169.6, 170.1, 170.6,
> > 171.1, 171.6, 172, 172.5, 173, 173.5, 174, 174.5, 175.1, 175.7, 176.3,
> > 177, 177.6, 178.3, 179.1, 179.9, 180.6, 181.4, 182.4, 183.5, 184.7,
> > 186.1, 187.9, 189.8, 192, 194.4, 197, 200.1, 203.5, 206.7, 209.2,
> > 211.3, 213.1, 214.8, 216.3, 217.4, 218.5, 219.5, 220.4, 221.3, 222.1,
> > 223, 223.7, 224.5, 225.2, 225.9, 226.7, 227.5, 228.2, 228.9, 229.6,
> > 230.4, 231.2, 231.9, 232.6, 233.4, 234.2, 235, 235.9, 236.8, 237.7,
> > 238.6, 239.7, 241, 242.3, 243.6, 245.2, 247.1, 249.3, 251.9, 255.3,
> > 260, 266, 274.9, 323.4 ), proc = c(0.6171, 1.583, 1.371, 2.13, 1.828,
> > 2.095, 1.994, 2.694, 2.824, 2.41, 2.909, 3.768, 3.179, 3.029, 3.798,
> > 3.743, 3.276, 3.213, 3.579, 2.928, 4.634, 3.415, 3.473, 3.135, 3.476,
> > 3.759, 3.726, 3.9, 3.593, 2.89, 3.707, 4.08, 2.846, 2.685, 3.394,
> > 2.737, 2.693, 2.878, 2.248, 2.368, 2.258, 2.662, 1.866, 1.895, 1.457,
> > 1.513, 1.181, 1.008, 0.9641, 0.799, 0.7878, 0.7209, 0.5869, 0.5778,
> > 0.7313, 0.9531, 1.053, 1.317, 1.247, 1.739, 2.064, 1.99, 2.522, 2.401,
> > 2.48, 2.687, 2.797, 2.918, 3.243, 3.055, 3.009, 2.89, 3.037, 3.25,
> > 3.349, 3.141, 2.771, 2.985, 3.203, 3.298, 3.215, 2.637, 2.683, 2.782,
> > 2.632, 2.625, 2.475, 2.014, 1.781, 1.987, 1.627, 1.374, 1.352, 0.9441,
> > 1.01, 0.5737, 0.5265, 0.3794, 0.2513, 0.0351)), row.names = 2:101,
> > class = "data.frame")
> >
> > temp2 <- structure(list(velik = c(153.8, 156.3, 157.3, 158.4, 159.2,
> > 160.1, 160.8, 161.6, 162.2, 162.8, 163.5, 164, 164.5, 165, 165.5, 166,
> > 166.4, 166.9, 167.5, 167.9, 168.5, 168.9, 169.4, 169.8, 170.4, 170.9,
> > 171.3, 171.8, 172.2, 172.7, 173.3, 173.8, 174.2, 174.8, 175.5, 176,
> > 176.6, 177.3, 177.9, 178.7, 179.5, 180.3, 180.9, 181.9, 182.9, 184.1,
> > 185.3, 186.9, 188.8, 190.8,

[R] mixture univariate distributions fit

2021-12-29 Thread PIKAL Petr
Dear all

I have data which are either a density distribution estimate or a cumulative
density distribution estimate (temp1, temp2 below). I would like to get the
values (mu, sd) of the underlying original data, but those are not available.

I found the mixtools package which calculates what I need, but it requires the
original data (AFAIK). They could be generated from e.g. temp1 by

set.seed(111)
x<- sample(temp1$velik, size=10, replace=TRUE, prob=temp1$proc)

library(mixtools)
fit <- normalmixEM(x)
plot(fit, which=2)
summary(fit)
summary of normalmixEM object:
           comp 1     comp 2
lambda   0.576346   0.423654
mu     170.784520 229.192823
sigma    7.203491  10.793461
loglik at estimate:  -424062.7 
>

Is there any way to get such values directly from the density or cumulative
density estimate, without generating fake data by sample()? 

Best regards
Petr


temp1 <- structure(list(velik = c(155, 156.8, 157.9, 158.8, 159.6, 160.4, 
161.2, 161.9, 162.5, 163.1, 163.8, 164.3, 164.7, 165.3, 165.8, 
166.2, 166.7, 167.2, 167.7, 168.2, 168.7, 169.1, 169.6, 170.1, 
170.6, 171.1, 171.6, 172, 172.5, 173, 173.5, 174, 174.5, 175.1, 
175.7, 176.3, 177, 177.6, 178.3, 179.1, 179.9, 180.6, 181.4, 
182.4, 183.5, 184.7, 186.1, 187.9, 189.8, 192, 194.4, 197, 200.1, 
203.5, 206.7, 209.2, 211.3, 213.1, 214.8, 216.3, 217.4, 218.5, 
219.5, 220.4, 221.3, 222.1, 223, 223.7, 224.5, 225.2, 225.9, 
226.7, 227.5, 228.2, 228.9, 229.6, 230.4, 231.2, 231.9, 232.6, 
233.4, 234.2, 235, 235.9, 236.8, 237.7, 238.6, 239.7, 241, 242.3, 
243.6, 245.2, 247.1, 249.3, 251.9, 255.3, 260, 266, 274.9, 323.4
), proc = c(0.6171, 1.583, 1.371, 2.13, 1.828, 2.095, 1.994, 
2.694, 2.824, 2.41, 2.909, 3.768, 3.179, 3.029, 3.798, 3.743, 
3.276, 3.213, 3.579, 2.928, 4.634, 3.415, 3.473, 3.135, 3.476, 
3.759, 3.726, 3.9, 3.593, 2.89, 3.707, 4.08, 2.846, 2.685, 3.394, 
2.737, 2.693, 2.878, 2.248, 2.368, 2.258, 2.662, 1.866, 1.895, 
1.457, 1.513, 1.181, 1.008, 0.9641, 0.799, 0.7878, 0.7209, 0.5869, 
0.5778, 0.7313, 0.9531, 1.053, 1.317, 1.247, 1.739, 2.064, 1.99, 
2.522, 2.401, 2.48, 2.687, 2.797, 2.918, 3.243, 3.055, 3.009, 
2.89, 3.037, 3.25, 3.349, 3.141, 2.771, 2.985, 3.203, 3.298, 
3.215, 2.637, 2.683, 2.782, 2.632, 2.625, 2.475, 2.014, 1.781, 
1.987, 1.627, 1.374, 1.352, 0.9441, 1.01, 0.5737, 0.5265, 0.3794, 
0.2513, 0.0351)), row.names = 2:101, class = "data.frame")

temp2 <- structure(list(velik = c(153.8, 156.3, 157.3, 158.4, 159.2, 160.1, 
160.8, 161.6, 162.2, 162.8, 163.5, 164, 164.5, 165, 165.5, 166, 
166.4, 166.9, 167.5, 167.9, 168.5, 168.9, 169.4, 169.8, 170.4, 
170.9, 171.3, 171.8, 172.2, 172.7, 173.3, 173.8, 174.2, 174.8, 
175.5, 176, 176.6, 177.3, 177.9, 178.7, 179.5, 180.3, 180.9, 
181.9, 182.9, 184.1, 185.3, 186.9, 188.8, 190.8, 193.2, 195.6, 
198.4, 201.8, 205.3, 208.1, 210.3, 212.3, 213.9, 215.7, 216.9, 
218, 219.1, 219.9, 220.8, 221.7, 222.6, 223.4, 224.1, 224.8, 
225.6, 226.3, 227.1, 227.8, 228.5, 229.2, 230, 230.8, 231.6, 
232.3, 233, 233.7, 234.6, 235.5, 236.3, 237.2, 238.1, 239.1, 
240.3, 241.6, 242.9, 244.4, 246.1, 248, 250.6, 253.1, 257.6, 
262.5, 269.5, 280.4, 372.9), proc = c(0, 1, 2, 3, 4, 5, 6, 7, 
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 
24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 
56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 
72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 
88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100)), row.names = c(NA, 
101L), class = "data.frame")
>

>
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about Rfast colMins and colMaxs

2021-12-01 Thread PIKAL Petr
Hi.

It is always worth consulting the excellent R help (see ?max):

max and min return the maximum or minimum of all the values present in their 
arguments, as integer if all are logical or integer, as double if all are 
numeric, and character otherwise.

Character versions are sorted lexicographically, and this depends on the 
collating sequence of the locale in use: the help for ‘Comparison’ gives 
details. The max/min of an empty character vector is defined to be character 
NA. (One could argue that as "" is the smallest character element, the maximum 
should be "", but there is no obvious candidate for the minimum.)

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Stephen H.
> Dawson, DSL via R-help
> Sent: Wednesday, December 1, 2021 5:11 PM
> To: r-help@r-project.org
> Subject: Re: [R] Question about Rfast colMins and colMaxs
> 
> Jeff,
> 
> Can you use max and min evaluations on any other data type then numeric?
> If so, how do you evaluate max or min of text content? String length?
> Ascii values of text characters?
> 
> 
> *Stephen Dawson, DSL*
> /Executive Strategy Consultant/
> Business & Technology
> +1 (865) 804-3454
> http://www.shdawson.com 
> 
> 
> On 11/30/21 5:23 PM, Stephen H. Dawson, DSL via R-help wrote:
> > Well, no it is not. The email list stripped off the attachment.
> >
> > The data is numeric, happens to be all whole numbers.
> >
> >
> > Kindest Regards,
> > *Stephen Dawson, DSL*
> > /Executive Strategy Consultant/
> > Business & Technology
> > +1 (865) 804-3454
> > http://www.shdawson.com 
> >
> >
> > On 11/30/21 5:14 PM, Stephen H. Dawson, DSL via R-help wrote:
> >> Hi Jeff,
> >>
> >>
> >> Thanks for the data review offer. Attached is the CSV.
> >>
> >>
> >> *Stephen Dawson, DSL*
> >> /Executive Strategy Consultant/
> >> Business & Technology
> >> +1 (865) 804-3454
> >> http://www.shdawson.com 
> >>
> >>
> >> On 11/30/21 3:29 PM, Jeff Newmiller wrote:
> >>> I don't know anything about this package, but read.csv returns a
> >>> data frame. How you go about forming a matrix using that data frame
> >>> depends what is in it. If it is all numeric then as.matrix may be
> >>> all you need.
> >>>
> >>> Half of any R data analysis is data... and the details are almost
> >>> always crucial. Since you have told us nothing useful about the
> >>> data, it is up to you to inspect your data and figure out what to do
> >>> with it.
> >>>
> >>> On November 30, 2021 10:55:13 AM PST, "Stephen H. Dawson, DSL via
> >>> R-help"  wrote:
>  Hi,
> 
> 
>  I am working to understand the Rfast functions of colMins and
>  colMaxs. I worked through the example listed on page 54 of the PDF.
> 
>  https://cran.r-project.org/web/packages/Rfast/index.html
> 
>  https://cran.r-project.org/web/packages/Rfast/Rfast.pdf
> 
>  My data is in a CSV file. So, I bring it into R Studio using:
>  Data <- read.csv("./input/DataSet05.csv", header=T)
> 
>  However, I read the instructions listed on page 54 of the PDF
>  saying I need to bring data into R using a matrix. I think read.csv
>  brings the data in as a dataframe. I think colMins is failing
>  because it is looking for a matrix but finds a dataframe.
> 
> > colMaxs(Data)
>  Error in colMaxs(Data) :
> Not compatible with requested type: [type=list; target=double].
> > colMins(Data, na.rm = TRUE)
>  Error in colMins(Data, na.rm = TRUE) :
> unused argument (na.rm = TRUE)
> > colMins(Data, value = FALSE, parallel = FALSE)
>  Error in colMins(Data, value = FALSE, parallel = FALSE) :
> Not compatible with requested type: [type=list; target=double].
> 
>  QUESTION
>  What is the best practice to bring a csv file into R Studio so it
>  can be accessed by colMaxs and colMins, please?
> 
> 
>  Thanks,
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org 

Re: [R] converting to POSIXct

2021-11-30 Thread PIKAL Petr
Hi, 

that is what I tried to show to Stefano: the issue was only with printing the 
values on the console or, as you explained in more depth, with the default 
formatting when all time values (HMS) are zero.

So Stefano did use the correct syntax.

Cheers
Petr

> -Original Message-
> From: Duncan Murdoch 
> Sent: Tuesday, November 30, 2021 12:05 PM
> To: Jim Lemon ; PIKAL Petr
> 
> Cc: r-help mailing list ; Stefano Sofia
> 
> Subject: Re: [R] converting to POSIXct
> 
> On 30/11/2021 3:41 a.m., Jim Lemon wrote:
> > Hi,
> > Petr is right. Apparently as.POSIXct drops the smallest increments if
> > all are zero:
> 
> That's not as.POSIXct doing anything:  there's no way to drop increments, the
> POSIXct format records a number of seconds and that can't be changed.
> 
> What is happening is simply the default formatting.
> 
> Be explicit about the format if you want to see the seconds, e.g.
> 
>  > format(ssdf$data_POSIX, format = '%Y-%m-%d %H:%M:%S') [1] "2002-11-
> 01 00:00:00" "2002-11-01 00:00:00"
> 
> Duncan Murdoch
> 
> >
> > ssdf<-read.csv(text="data_POSIX,Sensor_code,value
> > 2002-11-01 00:00:01,1694,7.2
> > 2002-11-01 00:00:00,1723,10.8",
> > stringsAsFactors=FALSE)
> > ssdf$data_POSIX<-as.POSIXct(ssdf$data_POSIX,"%Y-%m-%d HH:MM:SS")
> ssdf
> >
> > data_POSIX Sensor_code value
> > 1 2002-11-01 00:00:011694   7.2
> > 2 2002-11-01 00:00:001723  10.8
> >
> > but if there is a single small increment, they all show up.
> >
> > Jim
> >
> > On Tue, Nov 30, 2021 at 7:33 PM PIKAL Petr 
> wrote:
> >>
> >> Hi
> >>
> >> You probably has zero hours in all your data
> >>
> >> see
> >>> temp
> >> data_POSIX Sensor_code value
> >> 1 2002-11-01 00:00:001694   7.2
> >> 2 2002-11-01 00:00:001723  10.8
> >>
> >> without hours
> >>> as.POSIXct(temp$data_POSIX, format = "%Y-%m-%d %H:%M:%S",
> >>> tz="Etc/GMT-1")
> >> [1] "2002-11-01 +01" "2002-11-01 +01"
> >>
> >> add value to hours
> >>> fix(temp)
> >>> temp
> >> data_POSIX Sensor_code value
> >> 1 2002-11-01 00:01:001694   7.2
> >> 2 2002-11-01 00:00:001723  10.8
> >>
> >> Voila, hours are back.
> >>> as.POSIXct(temp$data_POSIX, format = "%Y-%m-%d %H:%M:%S",
> >>> tz="Etc/GMT-1")
> >> [1] "2002-11-01 00:01:00 +01" "2002-11-01 00:00:00 +01"
> >>
> >> So nothing wrong in uyour code, hours are there but they are probably
> not printed to console and hours are there but hidden.
> >>
> >> Cheers
> >> Petr
> >>
> >>> -Original Message-
> >>> From: R-help  On Behalf Of Stefano
> >>> Sofia
> >>> Sent: Tuesday, November 30, 2021 9:20 AM
> >>> To: r-help mailing list 
> >>> Subject: [R] converting to POSIXct
> >>>
> >>> Dear R-list users,
> >>> I thought I was able to manage easily POSIXct, but this is not true.
> >>> I am not going to load the input txt file because I know that
> >>> attachments are not allowed. The structure of my input txt file is
> >>>
> >>> data_POSIX,Sensor_code,value
> >>> 2002-11-01 00:00:00,1694,7.2
> >>> 2002-11-01 00:00:00,1723,10.8
> >>> ...
> >>>
> >>> I load it with
> >>> myfile <- read.table(file="mypath/myfile.txt", header = TRUE,
> >>> sep=",", dec = ".", stringsAsFactors = FALSE)
> >>>
> >>> When I try to convert the data_POSIX column (which is a character)
> >>> to POSIXct with
> >>>
> >>> myfile$data_POSIX <- as.POSIXct(myfile$data_POSIX, format =
> >>> "%Y-%m-%d %H:%M:%S", tz="Etc/GMT-1")
> >>>
> >>> the outupt is
> >>>
> >>> 2002-11-01 1694 7.2
> >>> 2002-11-01 1723 10.8
> >>> ...
> >>>
> >>> Why I keep loosing hours, minutes and seconds? Wher eis my mistake
> >>> or my misunderstanding?
> >>>
> >>> Sorry again if I have not been able to reproduce the R code, and
> >>> thank you for your support.
> >>> Stefano
> >>>
> >>>   (oo)
> >>> --oOO--( )--OOo--

Re: [R] converting to POSIXct

2021-11-30 Thread PIKAL Petr
Hi

You probably have zero hours (00:00:00) in all your data.

see
> temp
           data_POSIX Sensor_code value
1 2002-11-01 00:00:00        1694   7.2
2 2002-11-01 00:00:00        1723  10.8

without hours
> as.POSIXct(temp$data_POSIX, format = "%Y-%m-%d %H:%M:%S", tz="Etc/GMT-1")
[1] "2002-11-01 +01" "2002-11-01 +01"

add value to hours
> fix(temp)
> temp
           data_POSIX Sensor_code value
1 2002-11-01 00:01:00        1694   7.2
2 2002-11-01 00:00:00        1723  10.8

Voila, hours are back.
> as.POSIXct(temp$data_POSIX, format = "%Y-%m-%d %H:%M:%S", tz="Etc/GMT-1")
[1] "2002-11-01 00:01:00 +01" "2002-11-01 00:00:00 +01"

So nothing is wrong in your code; the hours are there, they are just hidden 
because the default printing to the console drops them when they are all zero.
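
If you want the zero times printed anyway, give format() an explicit format
string, e.g. (with the same temp data):

format(as.POSIXct(temp$data_POSIX, format = "%Y-%m-%d %H:%M:%S",
                  tz = "Etc/GMT-1"), "%Y-%m-%d %H:%M:%S")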

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Stefano Sofia
> Sent: Tuesday, November 30, 2021 9:20 AM
> To: r-help mailing list 
> Subject: [R] converting to POSIXct
> 
> Dear R-list users,
> I thought I was able to manage easily POSIXct, but this is not true.
> I am not going to load the input txt file because I know that attachments are
> not allowed. The structure of my input txt file is
> 
> data_POSIX,Sensor_code,value
> 2002-11-01 00:00:00,1694,7.2
> 2002-11-01 00:00:00,1723,10.8
> ...
> 
> I load it with
> myfile <- read.table(file="mypath/myfile.txt", header = TRUE, sep=",", dec =
> ".", stringsAsFactors = FALSE)
> 
> When I try to convert the data_POSIX column (which is a character) to
> POSIXct with
> 
> myfile$data_POSIX <- as.POSIXct(myfile$data_POSIX, format = "%Y-%m-%d
> %H:%M:%S", tz="Etc/GMT-1")
> 
> the outupt is
> 
> 2002-11-01 1694 7.2
> 2002-11-01 1723 10.8
> ...
> 
> Why I keep loosing hours, minutes and seconds? Wher eis my mistake or my
> misunderstanding?
> 
> Sorry again if I have not been able to reproduce the R code, and thank you
> for your support.
> Stefano
> 
>  (oo)
> --oOO--( )--OOo--
> Stefano Sofia PhD
> Civil Protection - Marche Region - Italy Meteo Section Snow Section Via del
> Colle Ameno 5
> 60126 Torrette di Ancona, Ancona (AN)
> Uff: +39 071 806 7743
> E-mail: stefano.so...@regione.marche.it
> ---Oo-oO
> 
> 
> 
> AVVISO IMPORTANTE: Questo messaggio di posta elettronica può contenere
> informazioni confidenziali, pertanto è destinato solo a persone autorizzate
> alla ricezione. I messaggi di posta elettronica per i client di Regione Marche
> possono contenere informazioni confidenziali e con privilegi legali. Se non 
> si è
> il destinatario specificato, non leggere, copiare, inoltrare o archiviare 
> questo
> messaggio. Se si è ricevuto questo messaggio per errore, inoltrarlo al
> mittente ed eliminarlo completamente dal sistema del proprio computer. Ai
> sensi dell’art. 6 della DGR n. 1394/2008 si segnala che, in caso di necessità 
> ed
> urgenza, la risposta al presente messaggio di posta elettronica può essere
> visionata da persone estranee al destinatario.
> IMPORTANT NOTICE: This e-mail message is intended to be received only by
> persons entitled to receive the confidential information it may contain. 
> E-mail
> messages to clients of Regione Marche may contain information that is
> confidential and legally privileged. Please do not read, copy, forward, or 
> store
> this message unless you are an intended recipient of it. If you have received
> this message in error, please forward it to the sender and delete it
> completely from your computer system.
> 
> --
> Questo messaggio  stato analizzato da Libraesva ESG ed  risultato non infetto.
> This message was scanned by Libraesva ESG and is believed to be clean.
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] vectorization of loops in R

2021-11-18 Thread PIKAL Petr
Hi

Besides tapply and aggregate, split plus an *apply could be used:

sapply(with(df, split(z, y)), mean)
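
or, the same result with tapply (no split needed; it returns a named vector of
means by group):

with(df, tapply(z, y, mean))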

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Luigi Marongiu
> Sent: Wednesday, November 17, 2021 2:21 PM
> To: r-help 
> Subject: [R] vectorization of loops in R
> 
> Hello,
> I have a dataframe with 3 variables. I want to loop through it to get
> the mean value of the variable `z`, as follows:
> ```
> df = data.frame(x = c(rep(1,5), rep(2,5), rep(3,5)),
> y = rep(letters[1:5],3),
> z = rnorm(15),
> stringsAsFactors = FALSE)
> m = vector()
> for (i in unique(df$y)) {
> s = df[df$y == i,]
> m = append(m, mean(s$z))
> }
> names(m) = unique(df$y)
> > (m)
> a  b  c  d  e
> -0.6355382 -0.4218053 -0.7256680 -0.8320783 -0.2587004
> ```
> The problem is that I have one million `y` values, so the work takes
> almost a day. I understand that vectorization will speed up the
> procedure. But how shall I write the procedure in vectorial terms?
> Thank you
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Date

2021-11-04 Thread PIKAL Petr
Hi

Not sure why the date format would change, but if I am correct R does not read 
dates as dates but as a character vector. You need to convert such columns to 
dates with as.Date. The error probably comes from your use of two nested 
as.Date calls.
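
A small sketch of the whole round trip (assuming the file really holds
mm/dd/yyyy strings, as your str() output shows):

dat <- read.csv("myfile.csv", stringsAsFactors = FALSE)
str(dat$mydate)                                          # chr "09/16/2019" ...
dat$mydate <- as.Date(dat$mydate, format = "%m/%d/%Y")   # one as.Date call is enough
str(dat$mydate)                                          # now of class Date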

Cheers
Petr
-Original Message-
From: R-help  On Behalf Of Val
Sent: Thursday, November 4, 2021 10:43 PM
To: r-help@R-project.org (r-help@r-project.org) 
Subject: [R] Date

IHi All, l,

I am  reading a csv file  and one of the columns is named as  "mydate"
 with this form, 2019-09-16.

I am reading this file as

dat=read.csv("myfile.csv")
 the structure of the data looks like as follow

str(dat)
mydate : chr  "09/16/2019" "02/21/2021" "02/22/2021" "10/11/2017" ...

Please note the format has changed from yyyy-mm-dd to mm/dd/yyyy.
When I tried to change this to a Date using

as.Date(as.Date(mydate, format="%m/%d/%Y"))
I am getting this error message
Error in charToDate(x) :
  character string is not in a standard unambiguous format

My question is,
1. how can I read the file as it is (i.e., without changing the date format) ?
2. why does R change the date format?

Thank you,

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Osobní údaje: Informace o zpracování a ochraně osobních údajů obchodních 
partnerů PRECHEZA a.s. jsou zveřejněny na: 
https://www.precheza.cz/zasady-ochrany-osobnich-udaju/ | Information about 
processing and protection of business partner’s personal data are available on 
website: https://www.precheza.cz/en/personal-data-protection-principles/
Důvěrnost: Tento e-mail a jakékoliv k němu připojené dokumenty jsou důvěrné a 
podléhají tomuto právně závaznému prohláąení o vyloučení odpovědnosti: 
https://www.precheza.cz/01-dovetek/ | This email and any documents attached to 
it may be confidential and are subject to the legally binding disclaimer: 
https://www.precheza.cz/en/01-disclaimer/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] by group

2021-11-02 Thread PIKAL Petr
Hi

Although you got several answers, simple aggregate was omitted.

> (res <- with(dat, aggregate(wt, list(Year=Year, Sex=Sex), mean)))
  Year Sex    x
1 2001   F 12.0
2 2002   F 13.3
3 2003   F 12.0
4 2001   M 15.0
5 2002   M 16.3
6 2003   M 15.0

you can reshape the result 
> library(reshape2)
Warning message:
package 'reshape2' was built under R version 4.0.4 
> dcast(res, Year~Sex)
Using x as value column: use value.var to override.
  Year    F    M
1 2001 12.0 15.0
2 2002 13.3 16.3
3 2003 12.0 15.0
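
If you do not mind a matrix instead of a data frame, tapply gives the wide
Year x Sex layout in one step:

with(dat, tapply(wt, list(Year, Sex), mean))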

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Val
> Sent: Monday, November 1, 2021 10:08 PM
> To: r-help@R-project.org (r-help@r-project.org) 
> Subject: [R] by group
> 
> Hi All,
> 
> How can I generate mean by group. The sample data looks like as follow,
> dat<-read.table(text="Year Sex wt
> 2001 M 15
> 2001 M 14
> 2001 M 16
> 2001 F 12
> 2001 F 11
> 2001 F 13
> 2002 M 14
> 2002 M 18
> 2002 M 17
> 2002 F 11
> 2002 F 15
> 2002 F 14
> 2003 M 18
> 2003 M 13
> 2003 M 14
> 2003 F 15
> 2003 F 10
> 2003 F 11  ",header=TRUE)
> 
> The desired  output  is,
>  MF
> 20011512
> 200216.33   13.33
> 200315  12
> 
> Thank you,
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] customize the step value

2021-10-29 Thread PIKAL Petr
Hi

One has to be careful when using fractions in seq step.

Although it works for 0.5
> (seq(0,10, .5) - round(seq(0,10,.5),2))==0
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
TRUE
[16] TRUE TRUE TRUE TRUE TRUE TRUE

in case of 0.3 (or others) it does not always result in expected values (see
FAQ 7.31 for explanation)

> (seq(0,10, .3) - round(seq(0,10,.3),2))==0
 [1]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE
[13] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE
[25] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE
>
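
One way around it (my own habit, not the only option) is to build the sequence
on integers and scale afterwards:

x <- seq(0L, 100L, by = 3L) / 10   # multiples of 0.3 without accumulated error
all((x - round(x, 2)) == 0)        # should now be TRUE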
Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Erich
Subscriptions
> Sent: Friday, October 29, 2021 9:17 AM
> To: Catherine Walt 
> Cc: R mailing list 
> Subject: Re: [R] customize the step value
> 
> seq(1.5,3.5,0.5)
> 
> The docs for seq will show you many more options.
> 
> > On 29.10.2021, at 09:06, Catherine Walt  wrote:
> >
> > dear members,
> >
> > Sorry I am newbie on R.
> > as we saw below:
> >
> >> 1.5:3.5
> > [1] 1.5 2.5 3.5
> >
> > How can I make the step to 0.5?
> > I want the result:
> >
> > 1.5 2.0 2.5 3.0 3.5
> >
> > Thanks.
> > Cathy
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Replacing NA s with the average

2021-10-18 Thread PIKAL Petr
Hi.

sometimes it is worth trying google first

R fill NA with average

resulted in

https://stackoverflow.com/questions/25835643/replace-missing-values-with-column-mean

and from that

library(zoo)
na.aggregate(DF)

will replace all numeric NA values with column averages.
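
If you prefer staying in base R (and to sidestep the problems the original
loop runs into on the non-numeric column hd17employ), a sketch that touches
only the numeric columns:

num <- vapply(wbpractice, is.numeric, logical(1))
wbpractice[num] <- lapply(wbpractice[num], function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
})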

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Admire Tarisirayi
> Chirume
> Sent: Monday, October 18, 2021 2:39 PM
> To: Jim Lemon 
> Cc: r-help mailing list 
> Subject: [R] Replacing NA s with the average
> 
> Good day colleagues. Below is a csv file attached which i am using in my
> > analysis.
> >
> >
> > household.id  hd17.perm  hd17employ  health.exp  total.food.exp  total.nfood.exp
> > 1             2          yes         1654        23654           23655
> > 2             2          yes         NA          NA              65984
> > 3             6          no          2547        123311          52416
> > 4             8          NA          2365        13648           12544
> > 5             6          NA          1254        36549           12365
> > 6             8          yes         1236        236541          26522
> > 7             8          no          NA          13264           23698
> >
> > So I created a df using the above and its a csv file as follows
> >
> > wbpractice <- read.csv("world_practice.csv")
> >
> > Now i am doing data cleaning and trying to replace all missing values
> > with the averages of the respective columns.
> >
> > the dimension of the actual dataset is;
> >
> > dim(wbpractice)
> [1] 319986
> 
> I used the following script which I executed, but I got some error messages
> 
> for(i in 1:ncol(wbpractice)){
>   wbpractice[is.na(wbpractice[,i]), i] <- mean(wbpractice[,i], na.rm = TRUE)
> }
> 
> Any help to replace all NAs with average values in my dataframe?
> 
> 
> 
> >
> >>
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a new variable and merging it on the dataframe

2021-10-18 Thread PIKAL Petr
Hi

I cannot say anything about mutate, but

read.csv results in a data frame,

so you can then use

wbpractice$gap <- with(wbpractice, total.food.exp - total.nfood.exp)
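
As for why the dplyr attempt seemed to do nothing (my guess from the code
shown): mutate() returns a new data frame rather than changing wbpractice in
place, so its result has to be assigned back, e.g.

library(dplyr)
wbpractice <- wbpractice %>%
  mutate(gap = total.food.exp - total.nfood.exp)
names(wbpractice)   # "gap" should now be listed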

Cheers
Petr

BTW, do not use HTML formatting; your email is a mess.


> -Original Message-
> From: R-help  On Behalf Of Admire Tarisirayi
> Chirume
> Sent: Monday, October 18, 2021 1:26 PM
> To: Jim Lemon 
> Cc: r-help mailing list 
> Subject: [R] creating a new variable and merging it on the dataframe
> 
> Good day colleagues. Below is a csv file attached which i am using in my
> analysis.
> 
> 
> hh.id  hd17.perm  hd17employ  health.exp  total.food.exp  total.nfood.exp
> 1      2          yes        1654        23654           23655
> 2      2          yes        2564        265897          65984
> 3      6          no         2547        123311          52416
> 4      8          no         5698        13648           12544
> 5      6          no         1254        36549           12365
> 6      8          yes        1236        236541          26522
> 7      8          no         4521        13264           23698
> 
> So I created a df using the above csv file as follows
> 
> wbpractice <- read.csv("world_practice.csv")
> 
> Now, I wanted to create a new variable called gap and scripted and executed
> the following command:
> 
> wbpractice %>%
>   mutate(gap = total.food.exp - total.nfood.exp)  # gen a variable
> 
> 
> 
> By recalling  wbpractice, I could not see the new variable created.
Running
> the command;
> 
> names(wbpractice)
> 
> 
> 
> shows the old variables only. Any help on how to append the newly created
> variable on my data?
> 
> 
> Alternative email: addtar...@icloud.com/tchir...@rbz.co.zw
> Skype: admirechirume
> Call: +263773369884
> whatsapp: +818099861504
> 
> 
> 
> >
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] complicated sequence with preset length

2021-10-12 Thread PIKAL Petr
Dear all

I know it is quite easy to get a simple sequence with the rep function
> c(rep(1:3, 2), rep(4:6,2))
 [1] 1 2 3 1 2 3 4 5 6 4 5 6

I could easily get vector of length 24 or 36 using another rep

> rep(c(rep(1:3, 2), rep(4:6,2)),2)
 [1] 1 2 3 1 2 3 4 5 6 4 5 6 1 2 3 1 2 3 4 5 6 4 5 6

> length(rep(c(rep(1:3, 2), rep(4:6,2)),2))
[1] 24
> length(rep(c(rep(1:3, 2), rep(4:6,2)),3))
[1] 36

But what about vector of length 30 i.e. 

> length(c(rep(c(rep(1:3, 2), rep(4:6,2)),2), rep(1:3,2)))
[1] 30

I know I could make some if construction based on the known vector length, but
is there a way to use "vector recycling" if I know the desired length and want
to "fill in" the values to get a vector of the required length?

Here is my complicated solution

len <- 30
vec1 <- rep(1:3, 2)
vec2 <- rep(4:6, 2)
base.vec <- c(vec1, vec2)
base.len <- length(c(vec1, vec2))
part <- (len/base.len-trunc(len/base.len))
result <- if (part==0) rep(c(vec1,vec2), len/base.len) else
c(rep(c(vec1,vec2), len/base.len), base.vec[1:(base.len*part)])

Is there any way to achieve the result in a simpler way?
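
For the record, one candidate I should probably check against the long version
above (rep with length.out, or rep_len, recycles to the requested length):

rep(base.vec, length.out = len)                     # or rep_len(base.vec, len)
identical(result, rep(base.vec, length.out = len))  # TRUE for len = 30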

Best regards
Petr
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] unexpected behavior in apply

2021-10-11 Thread PIKAL Petr
Hi

it is not surprising at all.

from apply documentation

Arguments
X   
an array, including a matrix.

A data.frame is not a matrix or an array (even if it rather resembles one).

So if you put a cake into the oven you cannot expect to get fried potatoes out
of it.

For data frames sapply or lapply is preferable, as they are designed for lists
and a data frame is (again from the documentation)

A data frame is a list of variables of the same number of rows with unique
row names, given class "data.frame".

> sapply(d,function(x) all(x[!is.na(x)]<=3))
   d1d2d3 
FALSE  TRUE FALSE 

Cheers
Petr


> -Original Message-
> From: R-help  On Behalf Of Jiefei Wang
> Sent: Friday, October 8, 2021 8:22 PM
> To: Derickson, Ryan, VHA NCOD 
> Cc: r-help@r-project.org
> Subject: Re: [R] unexpected behavior in apply
> 
> Ok, it turns out that this is documented, even though it looks surprising.
> 
> First of all, the apply function will try to convert any object with the
dim
> attribute to a matrix(my intuition agrees with you that there should be no
> conversion), so the first step of the apply function is
> 
> > as.matrix.data.frame(d)
>  d1  d2  d3
> [1,] "a" "1" NA
> [2,] "b" "2" NA
> [3,] "c" "3" " 6"
> 
> Since the data frame `d` is a mixture of character and non-character
values,
> the non-character value will be converted to the character using the
function
> `format`. However, the problem is that the NA value will also be formatted
to
> the character
> 
> > format(c(NA, 6))
> [1] "NA" " 6"
> 
> That's where the space comes from. It is purely for making the result
pretty...
> The character NA will be removed later, but the space is not stripped. I
would
> say this is not a good design, and it might be worth not including the NA
value
> in the format function. At the current stage, I will suggest using the
function
> `lapply` to do what you want.
> 
> > lapply(d, FUN=function(x)all(x[!is.na(x)] <= 3))
> $d1
> [1] FALSE
> $d2
> [1] TRUE
> $d3
> [1] FALSE
> 
> Everything should work as you expect.
> 
> Best,
> Jiefei
> 
> On Sat, Oct 9, 2021 at 2:03 AM Jiefei Wang  wrote:
> >
> > Hi,
> >
> > I guess this can tell you what happens behind the scene
> >
> >
> > > d<-data.frame(d1 = letters[1:3],
> > +   d2 = c(1,2,3),
> > +   d3 = c(NA,NA,6))
> > > apply(d, 2, FUN=function(x)x)
> >  d1  d2  d3
> > [1,] "a" "1" NA
> > [2,] "b" "2" NA
> > [3,] "c" "3" " 6"
> > > "a"<=3
> > [1] FALSE
> > > "2"<=3
> > [1] TRUE
> > > "6"<=3
> > [1] FALSE
> >
> > Note that there is an additional space in the character value " 6",
> > that's why your comparison fails. I do not understand why but this
> > might be a bug in R
> >
> > Best,
> > Jiefei
> >
> > On Sat, Oct 9, 2021 at 1:49 AM Derickson, Ryan, VHA NCOD via R-help
> >  wrote:
> > >
> > > Hello,
> > >
> > > I'm seeing unexpected behavior when using apply() compared to a for
> loop when a character vector is part of the data subjected to the apply
> statement. Below, I check whether all non-missing values are <= 3. If I
> include a character column, apply incorrectly returns TRUE for d3. If I
only
> pass the numeric columns to apply, it is correct for d3. If I use a for
loop, it is
> correct.
> > >
> > > > d<-data.frame(d1 = letters[1:3],
> > > +   d2 = c(1,2,3),
> > > +   d3 = c(NA,NA,6))
> > > >
> > > > d
> > >   d1 d2 d3
> > > 1  a  1 NA
> > > 2  b  2 NA
> > > 3  c  3  6
> > > >
> > > > # results are incorrect
> > > > apply(d, 2, FUN=function(x)all(x[!is.na(x)] <= 3))
> > >d1d2d3
> > > FALSE  TRUE  TRUE
> > > >
> > > > # results are correct
> > > > apply(d[,2:3], 2, FUN=function(x)all(x[!is.na(x)] <= 3))
> > >d2d3
> > >  TRUE FALSE
> > > >
> > > > # results are correct
> > > > for(i in names(d)){
> > > +   print(all(d[!is.na(d[,i]),i] <= 3)) }
> > > [1] FALSE
> > > [1] TRUE
> > > [1] FALSE
> > >
> > >
> > > Finally, if I remove the NA values from d3 and include the character
> column in apply, it is correct.
> > >
> > > > d<-data.frame(d1 = letters[1:3],
> > > +   d2 = c(1,2,3),
> > > +   d3 = c(4,5,6))
> > > >
> > > > d
> > >   d1 d2 d3
> > > 1  a  1  4
> > > 2  b  2  5
> > > 3  c  3  6
> > > >
> > > > # results are correct
> > > > apply(d, 2, FUN=function(x)all(x[!is.na(x)] <= 3))
> > >d1d2d3
> > > FALSE  TRUE FALSE
> > >
> > >
> > > Can someone help me understand what's happening?
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and 

Re: [R] adding results to plot

2021-10-07 Thread PIKAL Petr
Hallo Rui.

I finally tested your function and it seems to me that it should propagate
to the core R or at least to the stats package.

Although it is a bit overkill for my purpose, its use is straightforward and
simple. I checked it for several *test functions and did not find any
problem.

Thanks and best regards.

Petr

> -Original Message-
> From: Rui Barradas 
> Sent: Friday, September 17, 2021 9:56 PM
> To: PIKAL Petr ; r-help 
> Subject: Re: [R] adding results to plot
> 
> Hello,
> 
> *.test functions in base R return a list of class "htest", with its own
> print method.
> The method text.htest for objects of class "htest" below is a hack. I
> adapted the formatting part of the code of print.htest to plot with text().
> It may be too complicated, but it seems to work.
> 
> Warning: Not debugged at all.
> 
> 
> 
> text.htest <- function (ht, x, y = NULL, digits = getOption("digits"),
>  prefix = "", adj = NULL, ...) {
>out <- list()
>i_out <- 1L
>out[[i_out]] <- paste(strwrap(ht$method, prefix = prefix), sep = "\n")
>i_out <- i_out + 1L
>out[[i_out]] <- paste0("data:  ", ht$data.name)
> 
>stat_line <- NULL
>i_stat_line <- 0L
>if (!is.null(ht$statistic)) {
>  i_stat_line <- i_stat_line + 1L
>  stat_line[[i_stat_line]] <- paste(names(ht$statistic), "=",
>format(ht$statistic, digits =
> max(1L, digits - 2L)))
>}
>if (!is.null(ht$parameter)) {
>  i_stat_line <- i_stat_line + 1L
>  stat_line[[i_stat_line]] <- paste(names(ht$parameter), "=",
>format(ht$parameter, digits =
> max(1L, digits - 2L)))
>}
>if (!is.null(ht$p.value)) {
>  fp <- format.pval(ht$p.value, digits = max(1L, digits - 3L))
>  i_stat_line <- i_stat_line + 1L
>  stat_line[[i_stat_line]] <- paste("p-value",
>if (startsWith(fp, "<")) fp else
> paste("=", fp))
>}
>if(!is.null(stat_line)){
>  i_out <- i_out + 1L
>  #out[[i_out]] <- strwrap(paste(stat_line, collapse = ", "))
>  out[[i_out]] <- paste(stat_line, collapse = ", ")
>}
>if (!is.null(ht$alternative)) {
>  alt <- NULL
>  i_alt <- 1L
>  alt[[i_alt]] <- "alternative hypothesis: "
>  if (!is.null(ht$null.value)) {
>if (length(ht$null.value) == 1L) {
>  alt.char <- switch(ht$alternative, two.sided = "not equal to",
> less = "less than", greater = "greater than")
>  i_alt <- i_alt + 1L
>  alt[[i_alt]] <- paste0("true ", names(ht$null.value), " is ",
> alt.char,
> " ", ht$null.value)
>}
>else {
>  i_alt <- i_alt + 1L
>  alt[[i_alt]] <- paste0(ht$alternative, "\nnull values:\n")
>}
>  }
>  else {
>i_alt <- i_alt + 1L
>alt[[i_alt]] <- ht$alternative
>  }
>  i_out <- i_out + 1L
>  out[[i_out]] <- paste(alt, collapse = " ")
>}
>if (!is.null(ht$conf.int)) {
>  i_out <- i_out + 1L
>  out[[i_out]] <- paste0(format(100 * attr(ht$conf.int, "conf.level")),
> " percent confidence interval:\n", " ",
> paste(format(ht$conf.int[1:2], digits =
> digits), collapse = " "))
>}
>if (!is.null(ht$estimate)) {
>  i_out <- i_out + 1L
>  out[[i_out]] <- paste("sample estimates:", round(ht$estimate,
> digits = digits), sep = "\n")
>}
>i_out <- i_out + 1L
>out[[i_out]] <- "\n"
>names(out)[i_out] <- "sep"
>out <- do.call(paste, out)
>if(is.null(adj)) adj <- 0L
>text(x, y, labels = out, adj = adj, ...)
>invisible(out)
> }
> 
> 
> res <- shapiro.test(rnorm(100))
> plot(1,1, ylim = c(0, length(res) + 1L))
> text(res, 0.6, length(res) - 1)
> res
> 
> res2 <- t.test(rnorm(100))
> plot(1,1, ylim = c(0, length(res2) + 1L))
> text(res2, 0.6, length(res2) - 1L)
> res2
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> 
> 
> Às 15:12 de 16/09/21, PIKAL Petr escreveu:
> > Dear all
> >
> > I know I have seen the answer somewhere but I am not able to find it.
> Please
> > help
> >
> 

Re: [R] Strange behavior of 2-d array within function

2021-10-07 Thread PIKAL Petr
Hi

I would print/save the iteration number to see at what point this occurred, and
traceback() could probably give you some hint.
Alternatively, you could make a function from your code (see ?function) and use
debug() to trace the error.

Without a working example it is impossible to see where the problem is.

Cheers
Petr
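
A hypothetical sketch of that suggestion (all names below are illustrative, not taken from the original code): wrap the loop in a function, record the iteration at which things break, and return what has been computed so far; debug() or traceback() can then take over.

run_sim <- function(n_iter = 5000, n_col = 60) {
  my_simulated <- matrix(NA_real_, nrow = n_iter, ncol = n_col)
  for (it in seq_len(n_iter)) {
    step <- tryCatch(rnorm(n_col),          # stand-in for the real per-iteration calculation
                     error = function(e) e)
    if (inherits(step, "error")) {          # report where it broke and keep the partial result
      message("failure at iteration ", it, ": ", conditionMessage(step))
      return(list(iteration = it, partial = my_simulated))
    }
    my_simulated[it, ] <- step
  }
  my_simulated
}
# debug(run_sim)    # or debugonce(run_sim), then call it and step through
# traceback()       # right after an uncaught error, to see the call stack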

> -Original Message-
> From: R-help  On Behalf Of Gabriel Toro
> Sent: Wednesday, October 6, 2021 8:32 PM
> To: r-help@r-project.org
> Subject: [R] Strange behavior of 2-d array within function
> 
> Hi,
> 
> I have a function, which defines an array of dimensions 5000 by 60,
calculates
> the values within that array and then returns the array on exit.
> 
> I get an error: Error in my_simulated[ir, 1:it] : incorrect number of
dimensions
> 
> For some strange reason, the array is somehow being changed from
>mode "numeric" and attributes $dim=6000 by 50
>to
>mode "list" and attributes NULL
> 
> This change occurs at more or less random iterations within a loop (all
within
> the same function call). I am not explicitly manipulating the mode or
> attributes of the array after it is created.
> 
> I would appreciate any suggestions on what may be causing this problem. I
> have stared at the code for a long time, run the debugger, etc.
> 
> Thanks,
> 
> Gabriel
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need in formatting data for circos plot

2021-10-05 Thread PIKAL Petr
Hm,

Maybe if you change 
> C(-7, 7)
Error in C(-7, 7) : object not interpretable as a factor

to

> c(-7, 7)
[1] -7  7

Cheers
Petr
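
As a side note (illustrative, not part of the original reply): the error mentions factors because capital C() sets contrasts on a factor, while lower-case c() combines values, and R is case sensitive.

c(-7, 7)                                 # [1] -7  7
# C(factor(c("a", "b")), treatment)      # this is what capital C() is meant for,
                                         # hence "object not interpretable as a factor"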


Best Regards
RNDr. Petr PIKAL
Research Manager
PRECHEZA a.s.
nábř. Dr. Edvarda Beneše 1170/24 | 750 02 Přerov | Czech Republic
Tel: +420 581 252 256 | GSM: +420 724 008 364
petr.pi...@precheza.cz | www.precheza.cz
Personal data: Information about the processing and protection of business
partners' personal data at PRECHEZA a.s. is available at:
https://www.precheza.cz/en/personal-data-protection-principles/
Confidentiality: This email and any documents attached to it may be confidential
and are subject to the legally binding disclaimer:
https://www.precheza.cz/en/01-disclaimer/

> -Original Message-
> From: R-help  On Behalf Of pooja sinha
> Sent: Tuesday, October 5, 2021 2:20 PM
> To: r-help mailing list 
> Subject: [R] Need in formatting data for circos plot
> 
> Hi All,
> 
> I have gene expression data with differential fold change value and the
file
> looks like as below:
> Chrom start_pos end_pos value
> 14 20482867 20496901 2.713009346
> 4 123712710 123718202 -2.20797815
> 13 80883384 80896042 1.646405782
> 16 48842551 48844461 -1.636002557
> 17 28399094 28517527 1.033066311
> 9 31846044 31913462 -1.738549101
> 1 45311538 45349706 -1.360867536
> I wrote a code in R but it is giving error so I need help in trouble
> shooting:
> library(circlize)
> library(gtools)
> library(dplyr)
> circos.initializeWithIdeogram(species = "mm10")
> circos.par("track.height"=0.20)
> 
> circos.genomicTrackPlotRegion(data = db_tr,ylim = C(-7, 7), numeric.column
> = 4,
>   panel.fun = function(region,value,...) {
> cond <- value[,1] < 0.0
> circos.genomicPoints(region[cond,],
> value[cond,], pch = ".", cex = 0.1,
>  col = "khaki4")
> circos.genomicPoints(region[!cond,],
> value[!cond,], pch = ".", cex = 0.1,
>  col = "indianred")
>   })
> Error is : Error in C(-7, 7) : object not interpretable as a factor
> 
> Please help as I am new to circos plot and tried a lot. The file is also
attached
> here.
> 
> Thanks,
> Puja
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding results to plot

2021-09-17 Thread PIKAL Petr
Thanks Jim

This seems to be straightforward and quite simple. I considered addtable2plot 
but was not sure how to make a proper data frame from the result.

Regards
Petr

> -Original Message-
> From: Jim Lemon 
> Sent: Friday, September 17, 2021 2:31 AM
> To: PIKAL Petr ; r-help mailing list  project.org>
> Subject: Re: [R] adding results to plot
>
> Hi Petr,
> The hard part is the names for the data frame that addtable2plot requires:
>
> set.seed(753)
> res <- shapiro.test(rnorm(100))
> library(plotrix)
> plot(0,0,type="n",axes=FALSE)
> addtable2plot(0,0,data.frame(element=names(res)[1:2],
>   value=round(as.numeric(res[1:2]),3)),xjust=0.5,
>   title=res$method)
>
> There is probably a way to get blank names with data.frame(), but I gave up.
>
> Jim
>
> On Fri, Sep 17, 2021 at 12:22 AM PIKAL Petr  wrote:
> >
> > Dear all
> >
> > I know I have seen the answer somewhere but I am not able to find it.
> > Please help
> >
> > > plot(1,1)
> > > res <- shapiro.test(rnorm(100))
> > > res
> >
> > Shapiro-Wilk normality test
> >
> > data:  rnorm(100)
> > W = 0.98861, p-value = 0.5544
> >
> > I would like to add whole res object to the plot.
> >
> > I can do it one by one
> > > text(locator(1), res$method)
> > > text(locator(1), as.character(res$p.value))
> > ...
> > But it is quite inconvenient
> >
> > I could find some way in ggplot world but not in plain plot world.
> >
> > Best regards
> > Petr
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding results to plot

2021-09-16 Thread PIKAL Petr
Thanks, 
I will try to elaborate on it.

Best regards.
Petr

> -Original Message-
> From: R-help  On Behalf Of Kimmo Elo
> Sent: Thursday, September 16, 2021 4:45 PM
> To: r-help@r-project.org
> Subject: Re: [R] adding results to plot
> 
> Hi!
> 
> Maybe with this:
> 
> text(x=0.6, y=1.2, paste0(capture.output(res), collapse="\n"), adj=0)
> 
> HTH,
> 
> Kimmo
> 
> On 2021-09-16 at 14:12, PIKAL Petr wrote:
> > Dear all
> >
> > I know I have seen the answer somewhere but I am not able to find it.
> > Please
> > help
> >
> > > plot(1,1)
> > > res <- shapiro.test(rnorm(100))
> > > res
> >
> > Shapiro-Wilk normality test
> >
> > data:  rnorm(100)
> > W = 0.98861, p-value = 0.5544
> >
> > I would like to add whole res object to the plot.
> >
> > I can do it one by one
> > > text(locator(1), res$method)
> > > text(locator(1), as.character(res$p.value))
> > ...
> > But it is quite inconvenient
> >
> > I could find some way in ggplot world but not in plain plot world.
> >
> > Best regards
> > Petr
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding results to plot

2021-09-16 Thread PIKAL Petr
Hallo

Thanks, I will try out which option is better, yours or Kimmo's.

Best regards
Petr

> -Original Message-
> From: Bert Gunter 
> Sent: Thursday, September 16, 2021 5:00 PM
> To: PIKAL Petr 
> Cc: r-help 
> Subject: Re: [R] adding results to plot
> 
> I was wrong. text() will attempt to coerce to character. This may be
> informative:
> 
> > as.character(res)
> [1] "c(W = 0.992709285275917)""0.869917232073854"
> [3] "Shapiro-Wilk normality test" "rnorm(100)"
> 
> plot(0:1, 0:1); text(0,seq(.1,.9,.2), labels = res, pos = 4)
> 
> Bert
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> On Thu, Sep 16, 2021 at 7:44 AM Bert Gunter 
> wrote:
> >
> > res is a list of class "htest" . You can only add text strings  to a
> > plot via text(). I don't know what ggplot does.
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> > On Thu, Sep 16, 2021 at 7:22 AM PIKAL Petr 
> wrote:
> > >
> > > Dear all
> > >
> > > I know I have seen the answer somewhere but I am not able to find
> > > it. Please help
> > >
> > > > plot(1,1)
> > > > res <- shapiro.test(rnorm(100))
> > > > res
> > >
> > > Shapiro-Wilk normality test
> > >
> > > data:  rnorm(100)
> > > W = 0.98861, p-value = 0.5544
> > >
> > > I would like to add whole res object to the plot.
> > >
> > > I can do it one by one
> > > > text(locator(1), res$method)
> > > > text(locator(1), as.character(res$p.value))
> > > ...
> > > But it is quite inconvenient
> > >
> > > I could find some way in ggplot world but not in plain plot world.
> > >
> > > Best regards
> > > Petr
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] unable to remove NAs from a data frame

2021-09-16 Thread PIKAL Petr
Hi

You should consult the complete.cases function or, to remove only rows that
consist entirely of NAs, you could use something like (untested)

df[!(rowSums(is.na(df)) == ncol(df)), ]

Cheers
Petr
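
A small illustration of both suggestions on made-up data (the column names below are hypothetical):

dfx <- data.frame(SNP  = c("rs1", NA, "rs3"),
                  freq = c(0.1, NA, NA),
                  b    = c(0.5, NA, 1.2))
dfx[complete.cases(dfx), ]                 # keep only rows without any NA
dfx[rowSums(is.na(dfx)) < ncol(dfx), ]     # drop only rows that are entirely NA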


> -Original Message-
> From: R-help  On Behalf Of Ana Marija
> Sent: Thursday, September 16, 2021 4:12 PM
> To: r-help 
> Subject: [R] unable to remove NAs from a data frame
> 
> Hi All,
> 
> I have lines in file that look like this:
> 
> > df[14509227,]
> SNP   A1   A2 freq  b se  p  N
> 1:  NA NA NA NA NA
> 
> data looks like this:
> > head(df)
>SNP A1 A2  freq   b se  p  N
> 1:  rs74337086  G  A 0.0024460  0.1627 0.1231 0.1865 218792
> 2:  rs76388980  G  A 0.0034150  0.1451 0.1047 0.1660 218792 ...
> > sapply(df,class)
> SNP  A1  A2freq   b  se
> "character" "character" "character"   "numeric"   "numeric"   "numeric"
>   p   N
>   "numeric"   "integer"
> 
> > dim(df)
> [1] 145092258
> 
> Tried:
> > df=na.omit(df)
> > dim(df)
> [1] 145092258
> 
> and:
> > library(tidyr)
> > d=df %>% drop_na()
> > dim(d)
> [1] 145092258
> 
> 
> Please advise,
> 
> Thanks
> Ana
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] adding results to plot

2021-09-16 Thread PIKAL Petr
Dear all

I know I have seen the answer somewhere but I am not able to find it. Please
help

> plot(1,1)
> res <- shapiro.test(rnorm(100))
> res

Shapiro-Wilk normality test

data:  rnorm(100)
W = 0.98861, p-value = 0.5544

I would like to add whole res object to the plot.

I can do it one by one
> text(locator(1), res$method)
> text(locator(1), as.character(res$p.value))
...
But it is quite inconvenient

I could find some way in ggplot world but not in plain plot world.

Best regards
Petr
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-03 Thread PIKAL Petr
Hi Luigi.

Weird. But maybe it is the desired behaviour of summary when calculating the
mean of a numeric column full of NAs.

See example

dat <- data.frame(x=rep(NA, 110), y=rep(1, 110), z= rnorm(110))

# change all values in second column to NA
dat[,2] <- NA
# change some of them to NAN
dat[5:6, 2:3] <- 0/0

# see summary
summary(dat)
     x               y             z          
 Mode:logical   Min.   : NA    Min.   :-1.9798  
 NA's:110       1st Qu.: NA    1st Qu.:-0.4729  
                Median : NA    Median : 0.1745  
                Mean   :NaN    Mean   : 0.1856  
                3rd Qu.: NA    3rd Qu.: 0.8017  
                Max.   : NA    Max.   : 2.5075  
                NA's   :110    NA's   :2        

# change NAN values to NA
dat[sapply(dat, is.nan)] <- NA

#summary is same
summary(dat)
     x               y             z          
 Mode:logical   Min.   : NA    Min.   :-1.9798  
 NA's:110       1st Qu.: NA    1st Qu.:-0.4729  
                Median : NA    Median : 0.1745  
                Mean   :NaN    Mean   : 0.1856  
                3rd Qu.: NA    3rd Qu.: 0.8017  
                Max.   : NA    Max.   : 2.5075  
                NA's   :110    NA's   :2        

# but no NAN value in data
dat[1:10,]
x  y  z
1  NA NA -0.9148696
2  NA NA  0.7110570
3  NA NA -0.1901676
4  NA NA  0.5900650
5  NA NA NA
6  NA NA NA
7  NA NA  0.7987658
8  NA NA -0.5225229
9  NA NA  0.7673103
10 NA NA -0.5263897

So my "nice compact command"
dat[sapply(dat, is.nan)] <- NA

works as expected, but summary still reports the mean as NaN.

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Luigi Marongiu
> Sent: Thursday, September 2, 2021 3:46 PM
> To: Andrew Simmons 
> Cc: r-help 
> Subject: Re: [R] How to globally convert NaN to NA in dataframe?
> 
> `data[sapply(data, is.nan)] <- NA` is a nice compact command, but I still
get
> NaN when using the summary function, for instance one of the columns give:
> ```
> Min.   : NA
> 1st Qu.: NA
> Median : NA
> Mean   :NaN
> 3rd Qu.: NA
> Max.   : NA
> NA's   :110
> ```
> I tried to implement the second solution but:
> ```
> df <- lapply(x, function(xx) {
>   xx[is.nan(xx)] <- NA
> })
> > str(df)
> List of 1
>  $ sd_ef_rash_loc___palm: logi NA
> ```
> What am I getting wrong?
> Thanks
> 
> On Thu, Sep 2, 2021 at 3:30 PM Andrew Simmons 
> wrote:
> >
> > Hello,
> >
> >
> > I would use something like:
> >
> >
> > x <- c(1:5, NaN) |> sample(100, replace = TRUE) |> matrix(10, 10) |>
> >   as.data.frame()
> > x[] <- lapply(x, function(xx) {
> > xx[is.nan(xx)] <- NA_real_
> > xx
> > })
> >
> >
> > This prevents attributes from being changed in 'x', but accomplishes the
> same thing as you have above, I hope this helps!
> >
> > On Thu, Sep 2, 2021 at 9:19 AM Luigi Marongiu 
> wrote:
> >>
> >> Hello,
> >> I have some NaN values in some elements of a dataframe that I would
> >> like to convert to NA.
> >> The command `df1$col[is.nan(df1$col)]<-NA` allows to work column-wise.
> >> Is there an alternative for the global modification at once of all
> >> instances?
> >> I have seen from
> >> https://stackoverflow.com/questions/18142117/how-to-replace-nan-value-with-zero-in-a-huge-data-frame/18143097#18143097
> >> that once could use:
> >> ```
> >>
> >> is.nan.data.frame <- function(x)
> >> do.call(cbind, lapply(x, is.nan))
> >>
> >> data123[is.nan(data123)] <- 0
> >> ```
> >> replacing o with NA, but I got
> >> ```
> >> str(df)
> >> > logi NA
> >> ```
> >> when modifying my dataframe df.
> >> What would be the correct syntax?
> >> Thank you
> >>
> >>
> >>
> >> --
> >> Best regards,
> >> Luigi
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> --
> Best regards,
> Luigi
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Loop over columns of dataframe and change values condtionally

2021-09-02 Thread PIKAL Petr
Hi

you could operate with whole data frame (sometimes)
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1  5.1 3.5  1.4 0.2  setosa
2  4.9 3.0  1.4 0.2  setosa
3  4.7 3.2  1.3 0.2  setosa
4  4.6 3.1  1.5 0.2  setosa
5  5.0 3.6  1.4 0.2  setosa
6  5.4 3.9  1.7 0.4  setosa

change all

> head(iris[,1:4]+10) 
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1 15.113.5 11.410.2
2 14.913.0 11.410.2
3 14.713.2 11.310.2
4 14.613.1 11.510.2
5 15.013.6 11.410.2
6 15.413.9 11.710.4

change only some
> iris[,1:4][iris[,1:4]<2] <- iris[,1:4][iris[,1:4]<2]+10
> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1  5.1 3.5 11.410.2  setosa
2  4.9 3.0 11.410.2  setosa
3  4.7 3.2 11.310.2  setosa
4  4.6 3.1 11.510.2  setosa
5  5.0 3.6 11.410.2  setosa
6  5.4 3.9 11.710.4  setosa


Cheers
Petr
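
For the 0 -> NA part of the question, a sketch along the same lines (on a copy of iris, restricted to its numeric columns; the column range would have to be adapted to the real data):

df <- iris
df[1:4][df[1:4] == 0] <- NA     # all numeric columns at once
# or column by column in a loop
for (i in 1:4) {
  df[[i]][df[[i]] == 0] <- NA
}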


> -Original Message-
> From: R-help  On Behalf Of Luigi Marongiu
> Sent: Thursday, September 2, 2021 3:35 PM
> To: r-help 
> Subject: [R] Loop over columns of dataframe and change values condtionally
> 
> Hello,
> it is possible to select the columns of a dataframe in sequence with:
> ```
> for(i in 1:ncol(df)) {
>   df[ , i]
> }
> # or
> for(i in 1:ncol(df)) {
>   df[ i]
> }
> ```
> And change all values with, for instance:
> ```
> for(i in 1:ncol(df)) {
>   df[ , i] <- df[ , i] + 10
> }
> ```
> Is it possible to apply a condition? What would be the syntax?
> For instance, to change all 0s in a column to NA would `df[i][df[i == 0] =
NA`
> be right?
> Thank you
> 
> 
> --
> Best regards,
> Luigi
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread PIKAL Petr
Hi

what about

data[sapply(data, is.nan)] <- NA

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Luigi Marongiu
> Sent: Thursday, September 2, 2021 3:18 PM
> To: r-help 
> Subject: [R] How to globally convert NaN to NA in dataframe?
> 
> Hello,
> I have some NaN values in some elements of a dataframe that I would like
to
> convert to NA.
> The command `df1$col[is.nan(df1$col)]<-NA` allows to work column-wise.
> Is there an alternative for the global modification at once of all
instances?
> I have seen from
> https://stackoverflow.com/questions/18142117/how-to-replace-nan-value-
> with-zero-in-a-huge-data-frame/18143097#18143097
> that once could use:
> ```
> 
> is.nan.data.frame <- function(x)
> do.call(cbind, lapply(x, is.nan))
> 
> data123[is.nan(data123)] <- 0
> ```
> replacing o with NA, but I got
> ```
> str(df)
> > logi NA
> ```
> when modifying my dataframe df.
> What would be the correct syntax?
> Thank you
> 
> 
> 
> --
> Best regards,
> Luigi
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting characters back to Date and Time

2021-09-01 Thread PIKAL Petr
Hi

You can use as.POSIXct function
https://stackoverflow.com/questions/19172632/converting-excel-datetime-serial-number-to-r-datetime

But you should preferably try to read the date as character vector and then
convert it to date and time.

Cheers
Petr
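
If reading the column as character is not an option, one possible sketch of the as.POSIXct route (this assumes the numbers are Excel serial day counts, which start at 1899-12-30):

x <- c(43269.42, 43269.46, 43269.50)
as.POSIXct(x * 86400, origin = "1899-12-30", tz = "UTC")
# "2018-06-18 10:04:48 UTC" ...  (the few minutes of offset come from the rounded input)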

> -Original Message-
> From: R-help  On Behalf Of Eliza Botto
> Sent: Tuesday, August 31, 2021 10:26 PM
> To: r-help@r-project.org
> Subject: [R] Converting characters back to Date and Time
> 
> DeaR useR,
> 
> I read an excel column in R having Date and time (written in the same
cell) as
> follow,
> 
> 06/18/18 10:00
> 
> 06/18/18 11:00
> 
> 06/18/18 12:00
> 
> In R environment, they are read as
> 
> 43269.42
> 
> 43269.46
> 
> 43269.50
> 
> Is there a way to covert these characters back to the original format?
> 
> Thank-you very much in advance.
> 
> 
> Eliza Botto
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Selecting elements

2021-08-24 Thread PIKAL Petr
Hi.

Now it is understandable. However, the solution is not entirely clear to me.

table(Order$Var.1[1:10])
A B C D 
4 1 2 3 

should give you a hint which scheme could be acceptable, but how to do it 
programmatically I do not know.

Maybe start with a lower number of values in the table() call and gradually
increase it to check which scheme emerges as the chosen one.

> table(data.o$Var.1[1]) # scheme 2 is out
C 
1
...
> table(data.o$Var.1[1:5]) #scheme 3
A B C D 
1 1 2 1 

> table(data.o$Var.1[1:6]) #scheme 3

A B C D 
2 1 2 1 

> table(data.o$Var.1[1:7]) # scheme1
A B C D 
2 1 2 2 

> table(data.o$Var.1[1:8]) # no such scheme, so scheme 1 is chosen one
A B C D 
2 1 2 3 

#Now you need to select values based on scheme 1.
# 3A - 3B - 2C - 2D

sss <- split(Order, Order$Var.1)
selection <- c(3,3,2,2)
result <- vector("list", 4)

#I would use loop

for(i in 1:4) {
result[[i]] <- sss[[i]][1:selection[i],]
}

Maybe someone come with other ingenious solution.

Cheers
Petr
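
One rough way to narrow the schemes down programmatically might look like the sketch below (illustrative only; Order comes from the script quoted underneath): walk down the ordered letters and drop every scheme whose allowance for some letter has been exceeded.

schemes <- rbind(s1 = c(A = 3, B = 3, C = 2, D = 2),
                 s2 = c(A = 2, B = 5, C = 0, D = 3),
                 s3 = c(A = 3, B = 4, C = 2, D = 1))
counts <- c(A = 0, B = 0, C = 0, D = 0)
ok <- rep(TRUE, nrow(schemes))
for (v in Order$Var.1) {            # letters of the ordered data, highest value first
  counts[v] <- counts[v] + 1
  ok <- ok & schemes[, v] >= counts[v]
  if (sum(ok) == 1L) break          # a single candidate scheme remains
}
rownames(schemes)[ok]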

From: Silvano Cesar da Costa  
Sent: Monday, August 23, 2021 7:54 PM
To: PIKAL Petr 
Cc: r-help@r-project.org
Subject: Re: [R] Selecting elements

Hi,

I apologize for the confusion. I will try to be clearer in my explanation. I 
believe that with the R script it becomes clearer.

I have 4 variables with 10 repetitions and each one receives a value, randomly. 
I order the dataset from largest to smallest value. I have to select 10 
elements in 
descending order of values, according to one of three schemes:

# 3A - 3B - 2C - 2D
# 2A - 5B - 0C - 3D
# 3A - 4B - 2C - 1D

If the first 3 elements (out of the 10 to be selected) are of the letter D, the
adopted scheme will automatically be the second one. So, from then on, I have to
choose 2A, 5B and 0C.
How to make the selection automatically?

I created two selection examples, with different schemes:



set.seed(123)

Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)

data = data.frame(Var.1, Var.2)

(Order = data[order(data$Var.2, decreasing=TRUE), ])

# I must select the 10 highest values (), 
# but which follow a certain scheme:
#
#  3A - 3B - 2C - 2D or 
#  2A - 5B - 0C - 3D or
#  3A - 4B - 2C - 1D
#
# In this case, I started with the highest value that refers to the letter C. 
# Next comes only 1 of the letters B, A and D. All are selected once. 
# The fifth observation is the letter C, completing 2 C values. In this case, 
# following the 3 adopted schemes, note that the second scheme has 0C, 
# so this scheme is out.
# Therefore, it can be the first scheme (3A - 3B - 2C - 2D) or the 
# third scheme (3A - 4B - 2C - 1D).
# The next letter to be completed is the D (fourth and seventh elements), 
# among the 10 elements being selected. Therefore, the scheme adopted is the 
# first one (3A - 3B - 2C - 2D).
# Therefore, it is necessary to select 2 values with the letter B and 1 value 
# with the letter A.
#
# Manual Selection -
# The end result is:
(Selected.data = Order[c(1,2,3,4,5,6,7,9,13,16), ])

# Scheme: 3A - 3B - 2C - 2D
sort(Selected.data$Var.1)


#--
# Second example: -
#--
set.seed(4)

Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)

data = data.frame(Var.1, Var.2)
(Order = data[order(data$Var.2, decreasing=TRUE), ])

# The end result is:
(Selected.data.2 = Order[c(1,2,3,4,5,6,7,8,9,11), ])

# Scheme: 3A - 4B - 2C - 1D
sort(Selected.data.2$Var.1)

How to make the selection of the 10 elements automatically?

Thank you very much.

Prof. Dr. Silvano Cesar da Costa
Universidade Estadual de Londrina
Centro de Ciências Exatas
Departamento de Estatística

Fone: (43) 3371-4346


Em seg., 23 de ago. de 2021 às 05:05, PIKAL Petr 
<mailto:petr.pi...@precheza.cz> escreveu:
Hi

Only I got your HTML formated mail, rest of the world got complete mess. Do not 
use HTML formating.

As I got it right I wonder why in your second example you did not follow
3A - 3B - 2C - 2D

as D were positioned 1st and 4th.

I hope that you could use something like

sss <- split(data$Var.2, data$Var.1)
lapply(sss, cumsum)
$A
 [1]  38  73 105 136 166 188 199 207 209 210

$B
 [1]  39  67  92 115 131 146 153 159 164 168

$C
 [1]  40  76 105 131 152 171 189 203 213 222

$D
 [1]  37  71 104 131 155 175 192 205 217 220

Now you need to evaluate this result according to your sets. Here the highest 
value (76) is in C so the set with 2C is the one you should choose and select 
you value according to this set.

With
> set.seed(666)
> Var.1 = rep(LETTERS[1:4], 10)
> Var.2 = sample(1:40, replace=FALSE)
> data = data.frame(Var.1, Var.2)
> data <- data[order(data$Var.2, decreasing=TRUE), ]
> sss <- split(data$Var.2, data$Var.1)
> lapply(sss, cumsum)
$A
 [1]  36  70 102 133 163 182 200 207 212 213

$B
 [1]  35  57  78  95 108 120 131 140 148 150

$C
 [1]  40  73 102 130 156 180 196 211 221 225

$D
 [1]  39  77 114 141 166 189 209 223 229 232

Highest value is in D so either 3A - 3B - 2C - 2D  or 3A - 3B - 2C - 

Re: [R] Selecting elements

2021-08-23 Thread PIKAL Petr
Hi

Only I got your HTML-formatted mail; the rest of the world got a complete mess. Do not 
use HTML formatting.

If I got it right, I wonder why in your second example you did not follow
3A - 3B - 2C - 2D

as D were positioned 1st and 4th.

I hope that you could use something like

sss <- split(data$Var.2, data$Var.1)
lapply(sss, cumsum)
$A
 [1]  38  73 105 136 166 188 199 207 209 210

$B
 [1]  39  67  92 115 131 146 153 159 164 168

$C
 [1]  40  76 105 131 152 171 189 203 213 222

$D
 [1]  37  71 104 131 155 175 192 205 217 220

Now you need to evaluate this result according to your sets. Here the highest 
value (76) is in C so the set with 2C is the one you should choose and select 
you value according to this set.

With
> set.seed(666)
> Var.1 = rep(LETTERS[1:4], 10)
> Var.2 = sample(1:40, replace=FALSE)
> data = data.frame(Var.1, Var.2)
> data <- data[order(data$Var.2, decreasing=TRUE), ]
> sss <- split(data$Var.2, data$Var.1)
> lapply(sss, cumsum)
$A
 [1]  36  70 102 133 163 182 200 207 212 213

$B
 [1]  35  57  78  95 108 120 131 140 148 150

$C
 [1]  40  73 102 130 156 180 196 211 221 225

$D
 [1]  39  77 114 141 166 189 209 223 229 232

The highest value is in D, so either 3A - 3B - 2C - 2D or 3A - 3B - 2C - 2D should 
be appropriate. And here I am lost again, as both sets are the same. Maybe you need 
to reconsider your statements.

Cheers
Petr

From: Silvano Cesar da Costa  
Sent: Friday, August 20, 2021 9:28 PM
To: PIKAL Petr 
Cc: r-help@r-project.org
Subject: Re: [R] Selecting elements

Hi, thanks you for the answer. 
Sorry English is not my native language.

But you got it right. 
> As C is first and fourth biggest value, you follow third option and select 3 
> highest A, 3B 2C and 2D?

I must select the 10 (not 15) highest values, but which follow a certain order:
3A - 3B - 2C - 2D or 
2A - 5B - 0C - 3D or
3A - 3B - 2C - 2D
I'll put the example in Excel for a better understanding (with 20 elements 
only). 
I must select 10 elements (the highest values of variable Var.2), which fit one 
of the 3 options above. 

Full data set (ordered by Var.2):

  Number  Position  Var.1  Var.2
       1        27      C     40
       2        30      B     39
       3         5      A     38
       4        16      D     37
       5        23      C     36
       6        13      A     35
       7        20      D     34
       8        12      D     33
       9         9      A     32
      10         1      A     31
      11        21      A     30
      12        35      C     29
      13        14      B     28
      14         8      D     27
      15         7      C     26
      16         6      B     25
      17        40      D     24
      18        26      B     23
      19        29      A     22
      20        31      C     21

Selected:

  Number  Position  Var.1  Var.2
       1        27      C     40
       2        30      B     39
       3         5      A     38
       4        16      D     37
       5        23      C     36
       6        13      A     35
       7        20      D     34
      10         9      A     32
      13        14      B     28
      17         6      B     25

Schemes considered:
  3A - 3B - 2C - 2D
  3A - 3B - 1C - 3D
  2A - 5B - 0C - 3D

Second option (other data set):

Full data set (ordered by Var.2):

  Number  Position  Var.1  Var.2
       1        36      D     20
       2        11      B     19
       3        39      A     18
       4        24      D     17
       5        34      B     16
       6         2      B     15
       7         3      A     14
       8        32      D     13
       9        28      D     12
      10        25      A     11
      11        19      B     10
      12        15      B      9
      13        17      A      8
      14        18      C      7
      15        38      B      6
      16        10      B      5
      17        22      B      4
      18         4      D      3
      19        33      A      2
      20        37      A      1

Selected:

  Number  Position  Var.1  Var.2
       1        36      D     20
       2        11      B     19
       3        39      A     18
       4        24      D     17
       5        34      B     16
       6         2      B     15
       7         3      A     14
       8        32      D     13
       9        25      A     11
      10        18      C      7

Schemes considered:
  3A - 3B - 2C - 2D
  3A - 3B - 1C - 3D
  2A - 5B - 0C - 3D

How to make the selection of these 10 elements that fit one of the 3 options 
using R?

Thanks,

Prof. Dr. Silvano Cesar da Costa
Universidade Estadual de Londrina
Centro de Ciências Exatas
Departamento de Estatística

Fone: (43) 3371-4346


Em sex., 20 de ago. de 2021 às 03:28, PIKAL Petr 
<mailto:petr.pi...@precheza.cz> escreveu:
Hallo

I am confused; maybe others know what you want, but could you be more 
specific?

Let say you have such data
set.seed(123)
Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)
data = data.frame(Var.1, Var.2)

What should be the desired outcome?

You can sort
data <- data[order(data$Var.2, decreasing=TRUE), ]
and split the data
> split(data$Var.2, data$Var.1)
$A
 [1] 38 35 32 31 30 22 11  8  2  1

$B
 [1] 39 28 25 23 16 15  7  6  5  4

$C
 [1] 40 36 29 26 21 19 18 14 10  9

$D
 [1] 37 34 33 27 24 20 17 13 12  3

To inspect the highest values. But here I am lost. As C gives the first and fourth biggest 
value, do you follow the third option and select the 3 highest A, 3B, 2C and 2D?

Or I do not understand at all what you really want to achieve.

Cheers
Petr

> -Original Message-
> From: R-help <mailto:r-help-boun...@r-project.org> On Behalf Of Silvano Cesar 
> da
> Costa
> Sent: Thursday, August 19, 2021 10:40 PM
> To: mailto:r-help@r-project.org
> Subject: [R] Selecting elements
> 
> Hi,
> 
> I need to select 15 elements, always considering the highest values
> (descending order) but obeying the following configuration:
> 
> 3A - 4B - 0C - 3D or
> 2A - 5B - 0C - 3D or
> 3A - 3B - 2C - 2D
> 
> If I have, for example, 5 A elements as the highest values, I can only choose 3
> (first and third choice) or 2 (second choice) elements.
> 
>

Re: [R] Selecting elements

2021-08-20 Thread PIKAL Petr
Hallo

I am confused; maybe others know what you want, but could you be more 
specific?

Let say you have such data
set.seed(123)
Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)
data = data.frame(Var.1, Var.2)

What should be the desired outcome?

You can sort
data <- data[order(data$Var.2, decreasing=TRUE), ]
and split the data
> split(data$Var.2, data$Var.1)
$A
 [1] 38 35 32 31 30 22 11  8  2  1

$B
 [1] 39 28 25 23 16 15  7  6  5  4

$C
 [1] 40 36 29 26 21 19 18 14 10  9

$D
 [1] 37 34 33 27 24 20 17 13 12  3

To inspect the highest values. But here I am lost. As C gives the first and fourth biggest 
value, do you follow the third option and select the 3 highest A, 3B, 2C and 2D?

Or I do not understand at all what you really want to achieve.

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Silvano Cesar da
> Costa
> Sent: Thursday, August 19, 2021 10:40 PM
> To: r-help@r-project.org
> Subject: [R] Selecting elements
> 
> Hi,
> 
> I need to select 15 elements, always considering the highest values
> (descending order) but obeying the following configuration:
> 
> 3A - 4B - 0C - 3D or
> 2A - 5B - 0C - 3D or
> 3A - 3B - 2C - 2D
> 
> If I have, for example, 5 A elements as the highest values, I can only choose 3
> (first and third choice) or 2 (second choice) elements.
> 
> how to make this selection?
> 
> 
> library(dplyr)
> 
> Var.1 = rep(LETTERS[1:4], 10)
> Var.2 = sample(1:40, replace=FALSE)
> 
> data = data.frame(Var.1, Var.2)
> (data = data[order(data$Var.2, decreasing=TRUE), ])
> 
> Elements = data %>%
>   arrange(desc(Var.2))
> 
> Thanks,
> 
> Prof. Dr. Silvano Cesar da Costa
> Universidade Estadual de Londrina
> Centro de Ciências Exatas
> Departamento de Estatística
> 
> Fone: (43) 3371-4346
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting different results with set.seed()

2021-08-19 Thread PIKAL Petr
Hi

Did you try a different order?

Step 2: set.seed (123)

Step 1. Self-coded functions (these functions generate random numbers as well)

Step 3: Call those functions.

Step 4: model results.

Cheers
Petr.

And BTW, do not use HTML formatting; it can cause problems on a text-only list.
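
A tiny illustration of why the order matters (purely illustrative):

f <- function() rnorm(1)       # stand-in for a self-coded function that draws random numbers
set.seed(1234); f(); f()       # reproducible: the same two numbers on every run
f(); set.seed(1234); f()       # not reproducible as a whole: the first call runs before the seed is set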


From: Shah Alam 
Sent: Thursday, August 19, 2021 10:10 AM
To: PIKAL Petr 
Cc: r-help mailing list 
Subject: Re: [R] Getting different results with set.seed()

Dear Petr,

It is more than 2000 lines of code with a lot of functions and data inputs. I 
am not sure whether it would be useful to upload it. However, you are 
absolutely right. I used

Step 1. Self-coded functions (these functions generate random numbers as well)

Step 2: set.seed (123)

Step 3: Call those functions.

Step 4: model results.

I close the R session and run the code from step 1. I get different results 
for the same set of values for parameters.

Best regards,
Shah




On Thu, 19 Aug 2021 at 09:56, PIKAL Petr <mailto:petr.pi...@precheza.cz> 
wrote:
Hi

Please provide at least your code preferably with some data to reproduce
this behaviour. I wonder if anybody could help you without such information.

My wild guess is that you used

set.seed(1234)

some code

the code used again

in which case you have to expect different results.

Cheers
Petr

> -Original Message-
> From: R-help <mailto:r-help-boun...@r-project.org> On Behalf Of Shah Alam
> Sent: Thursday, August 19, 2021 9:46 AM
> To: r-help mailing list <mailto:r-help@r-project.org>
> Subject: [R] Getting different results with set.seed()
>
> Dear All,
>
> I was using set.seed to reproduce the same results for the discrete event
> simulation model. I have 12 unknown parameters for optimization (just a
> little background). I got a good fit of parameter combinations. However,
> when I use those parameters combinations again in the model. I am getting
> different results.
>
> Is there any problem with the set.seed. I assume the set.seed should
> produce the same results.
>
> I used set.seed(1234).
>
> Best regards,
> Shah
>
>   [[alternative HTML version deleted]]
>
> __
> mailto:R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting different results with set.seed()

2021-08-19 Thread PIKAL Petr
Hi

Please provide at least your code preferably with some data to reproduce
this behaviour. I wonder if anybody could help you without such information.

My wild guess is that you used 

set.seed(1234)

some code

the code used again

in which case you have to expect different results.

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Shah Alam
> Sent: Thursday, August 19, 2021 9:46 AM
> To: r-help mailing list 
> Subject: [R] Getting different results with set.seed()
> 
> Dear All,
> 
> I was using set.seed to reproduce the same results for the discrete event
> simulation model. I have 12 unknown parameters for optimization (just a
> little background). I got a good fit of parameter combinations. However,
> when I use those parameters combinations again in the model. I am getting
> different results.
> 
> Is there any problem with the set.seed. I assume the set.seed should
> produce the same results.
> 
> I used set.seed(1234).
> 
> Best regards,
> Shah
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rolling 7 day incidence

2021-08-17 Thread PIKAL Petr
Hi

You're welcome. You probably know

https://www.repidemicsconsortium.org/projects/

as a collection of tools for epidemic analysis.

Cheers
Petr

> -Original Message-
> From: R-help  On Behalf Of Dr Eberhard
> Lisse
> Sent: Tuesday, August 17, 2021 2:30 PM
> To: r-help@r-project.org
> Subject: Re: [R] Rolling 7 day incidence
> 
> Petr,
> 
> thank you very much, this pointed me in the right direction (to refine my
> Google search :-)-O):
> 
>library(tidyverse)
>library(coronavirus)
>library(zoo)
> 
>as_tibble(coronavirus) %>%
>filter(country=='Namibia' & type=="confirmed") %>%
>    mutate(rollsum = rollapplyr(cases, 7, sum, partial=TRUE)) %>%
>arrange(desc(date)) %>%
>mutate(R7=rollsum / 25.4 )  %>%
>select(date,R7)
> 
> gives me something like
> 
># A tibble: 573 × 2
>date  R7
> 
> 1 2021-08-16  52.8
> 2 2021-08-15  56.1
> 3 2021-08-14  55.6
> 4 2021-08-13  63.1
> 5 2021-08-12  62.8
> 6 2021-08-11  63.7
> 7 2021-08-10  67.3
> 8 2021-08-09  69.3
> 9 2021-08-08  69.2
>10 2021-08-07  74.5
># … with 563 more rows
> 
> which seems to be correct :-)-O so I can now play with ggplot2 over the
> weekend :-)-O
> 
> greetings, el
> 
> On 17/08/2021 12:46, PIKAL Petr wrote:
> > Hi.
> >
> > There are several ways how to do it.  You could find them easily using
> > Google.  e.g.
> >
> > https://stackoverflow.com/questions/19200841/consecutive-rolling-sums-
> > in-a-vector-in-r
> >
> > where you find several options.
> >
> > Cheers
> > Petr
> [...]
> 
> 
> --
> To email me replace 'nospam' with 'el'
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rolling 7 day incidence

2021-08-17 Thread PIKAL Petr
Hi.

There are several ways to do it. You can find them easily using Google, e.g.

https://stackoverflow.com/questions/19200841/consecutive-rolling-sums-in-a-vector-in-r

where you find several options.

Cheers
Petr
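
A quick sketch of two common approaches (illustrative; x stands for a vector of daily counts):

x <- c(5, 3, 8, 0, 2, 7, 4, 6, 1, 9)
cs <- cumsum(x)
cs[7:length(x)] - c(0, head(cs, -7))    # base R: right-aligned 7-day rolling sums
# library(zoo); rollsumr(x, 7)          # the same with zoo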



> -Original Message-
> From: R-help  On Behalf Of Dr Eberhard
> Lisse
> Sent: Tuesday, August 17, 2021 12:25 PM
> To: r-help@r-project.org
> Subject: [R] Rolling 7 day incidence
> 
> Hi,
> 
> I am loading the coronavirus dataset everyday which looks something like:
> 
> 
>as_tibble(coronavirus) %>%
>filter(country=="Namibia" & type=="confirmed") %>%
>arrange(desc(date)) %>%
>print(n=10)
> 
># A tibble: 573 × 7
>date   province country   lat  long type  cases
>   
> 1 2021-08-16 ""   Namibia -23.0  18.5 confirmed76
> 2 2021-08-15 ""   Namibia -23.0  18.5 confirmed   242
> 3 2021-08-14 ""   Namibia -23.0  18.5 confirmed   130
> 4 2021-08-13 ""   Namibia -23.0  18.5 confirmed   280
> 5 2021-08-12 ""   Namibia -23.0  18.5 confirmed   214
> 6 2021-08-11 ""   Namibia -23.0  18.5 confirmed96
> 7 2021-08-10 ""   Namibia -23.0  18.5 confirmed   304
> 8 2021-08-09 ""   Namibia -23.0  18.5 confirmed   160
> 9 2021-08-08 ""   Namibia -23.0  18.5 confirmed   229
>10 2021-08-07 ""   Namibia -23.0  18.5 confirmed   319
># … with 563 more rows
> 
> How do I do a rolling 7 day incidence (ie sum the cases over 7 days) but
> rolling, ie from the last day to 7 (or 6?)  days before the end of the 
> dataset, so
> I get pairs of date/7-Day-Incidence?
> 
> I know it's probably re-inventing the plot as it were but I can't find R code 
> to
> do that.
> 
> I want to plot it per 10 but that I can do.
> 
> greetings, el
> 
> 
> --
> To email me replace 'nospam' with 'el'
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sanity check in loading large dataframe

2021-08-09 Thread PIKAL Petr
Hi Bert

Yes, in this case which() is not necessary. But when NAs are involved, 
logical indexing is sometimes not the best choice, as NA propagates to the 
result, which may not be wanted.

x <- 1:10
x[c(2,5)] <- NA
y<- letters[1:10]
y[x<5]
[1] "a" NA  "c" "d" NA
y[which(x<5)]
[1] "a" "c" "d"
dat <- data.frame(x,y)
dat[x<5,]
      x    y
1     1    a
NA   NA <NA>
3     3    c
4     4    d
NA.1 NA <NA>

> dat[which(x<5),]
  x y
1 1 a
3 3 c
4 4 d

Both results are OK, but one has to consider this NA value propagation.

Cheers
Petr

From: Bert Gunter 
Sent: Friday, August 6, 2021 1:29 PM
To: PIKAL Petr 
Cc: Luigi Marongiu ; r-help 
Subject: Re: [R] Sanity check in loading large dataframe

... but remove the which() and use logical indexing ...  ;-)


Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Aug 6, 2021 at 12:57 AM PIKAL Petr <mailto:petr.pi...@precheza.cz> 
wrote:
Hi

You already got answer from Avi. I often use dim(data) to inspect how many
rows/columns I have.
After that I check if some columns contain all or many NA values.

colSums(is.na(data))
keep <- which(colSums(is.na(data)) < nrow(data))

> -Original Message-
> From: R-help <mailto:r-help-boun...@r-project.org> On Behalf Of Luigi 
> Marongiu
> Sent: Friday, August 6, 2021 7:34 AM
> To: Duncan Murdoch <mailto:murdoch.dun...@gmail.com>
> Cc: r-help <mailto:r-help@r-project.org>
> Subject: Re: [R] Sanity check in loading large dataframe
>
> Ok, so nothing to worry about. Yet, are there other checks I can
implement?
> Thank you
>
> On Thu, 5 Aug 2021, 15:40 Duncan Murdoch, <mailto:murdoch.dun...@gmail.com>
> wrote:
>
> > On 05/08/2021 9:16 a.m., Luigi Marongiu wrote:
> >  > Hello,
> >  > I am using a large spreadsheet (over 600 variables).
> >  > I tried `str` to check the dimensions of the spreadsheet and I got
> > > ```  >> (str(df))  > 'data.frame': 302 obs. of  626 variables:
> >  >   $ record_id : int  1 1 1 1 1 1 1 1 1 1 ...
> >  > 
> >  > $ v1_medicamento___aceta: int  1 NA NA NA NA NA NA NA NA NA ...
> >  >[list output truncated]
> >  > NULL
> >  > ```
> >  > I understand that `[list output truncated]` means that there are
> > more  > variables than those allowed by str to be displayed as rows.
> > Thus I  > increased the row's output with:
> >  > ```
> >  >
> >  >> (str(df, list.len=1000))
> >  > 'data.frame': 302 obs. of  626 variables:
> >  >   $ record_id : int  1 1 1 1 1 1 1 1 1 1 ...
> >  > ...
> >  > NULL
> >  > ```
> >  >
> >  > Does `NULL` mean that some of the variables are not closed?
> > (perhaps a  > missing comma somewhere)  > Is there a way to check the
> > sanity of the data and avoid that some  > separator is not in the
> > right place?
> >  > Thank you
> >
> > The NULL is the value returned by str().  Normally it is not printed,
> > but when you wrap str in parens as (str(df, list.len=1000)), that
> > forces the value to print.
> >
> > str() is unusual in R functions in that it prints to the console as it
> > runs and returns nothing.  Many other functions construct a value
> > which is only displayed if you print it, but something like
> >
> > x <- str(df, list.len=1000)
> >
> > will print the same as if there was no assignment, and then assign
> > NULL to x.
> >
> > Duncan Murdoch
> >
>
>   [[alternative HTML version deleted]]
>
> __
> mailto:R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
mailto:R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sanity check in loading large dataframe

2021-08-06 Thread PIKAL Petr
Hi 

You already got an answer from Avi. I often use dim(data) to inspect how many
rows/columns I have.
After that I check whether some columns contain all or many NA values.

colSums(is.na(data))
keep <- which(colSums(is.na(data)) < nrow(data))

> -Original Message-
> From: R-help  On Behalf Of Luigi Marongiu
> Sent: Friday, August 6, 2021 7:34 AM
> To: Duncan Murdoch 
> Cc: r-help 
> Subject: Re: [R] Sanity check in loading large dataframe
> 
> Ok, so nothing to worry about. Yet, are there other checks I can
implement?
> Thank you
> 
> On Thu, 5 Aug 2021, 15:40 Duncan Murdoch, 
> wrote:
> 
> > On 05/08/2021 9:16 a.m., Luigi Marongiu wrote:
> >  > Hello,
> >  > I am using a large spreadsheet (over 600 variables).
> >  > I tried `str` to check the dimensions of the spreadsheet and I got
> > > ```  >> (str(df))  > 'data.frame': 302 obs. of  626 variables:
> >  >   $ record_id : int  1 1 1 1 1 1 1 1 1 1 ...
> >  > 
> >  > $ v1_medicamento___aceta: int  1 NA NA NA NA NA NA NA NA NA ...
> >  >[list output truncated]
> >  > NULL
> >  > ```
> >  > I understand that `[list output truncated]` means that there are
> > more  > variables than those allowed by str to be displayed as rows.
> > Thus I  > increased the row's output with:
> >  > ```
> >  >
> >  >> (str(df, list.len=1000))
> >  > 'data.frame': 302 obs. of  626 variables:
> >  >   $ record_id : int  1 1 1 1 1 1 1 1 1 1 ...
> >  > ...
> >  > NULL
> >  > ```
> >  >
> >  > Does `NULL` mean that some of the variables are not closed?
> > (perhaps a  > missing comma somewhere)  > Is there a way to check the
> > sanity of the data and avoid that some  > separator is not in the
> > right place?
> >  > Thank you
> >
> > The NULL is the value returned by str().  Normally it is not printed,
> > but when you wrap str in parens as (str(df, list.len=1000)), that
> > forces the value to print.
> >
> > str() is unusual in R functions in that it prints to the console as it
> > runs and returns nothing.  Many other functions construct a value
> > which is only displayed if you print it, but something like
> >
> > x <- str(df, list.len=1000)
> >
> > will print the same as if there was no assignment, and then assign
> > NULL to x.
> >
> > Duncan Murdoch
> >
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cumulates of snowfall within a given interval

2021-07-30 Thread PIKAL Petr
Hi

I would use the embed function (see ?embed).

nr <- which(rowSums(embed(mydf$hn, 2))>=80)
mydf[nr,]

But it seems strange to me that the variant 40, 50 should be accepted while 0, 90 
should not: both result in more than 80 cm of cumulative snow over two consecutive 
days. And what about 1, 80: how does it differ from 0, 81? Basically, with your 
constraint a zero means NA, so if I change all zeroes to NA I get quite close.

> nr <- which(rowSums(embed(mydf$hn, 2))>=80)
> mydf[nr,]
   data_POSIX hn
6  2018-02-06 40
13 2018-02-13 50
> nr <- which(rowSums(embed(mydf$hn, 4))>=100)
> mydf[nr,]
   data_POSIX hn
12 2018-02-12 10
13 2018-02-13 50
> nr <- which(rowSums(embed(mydf$hn, 2))>=100)
> mydf[nr,]
  data_POSIX hn
6 2018-02-06 40
>

Anyway, you need to polish the result a bit, as over four consecutive days both rows 12
and 13 result in more than 100 cm.

Cheers
Petr
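
A small generalisation of the embed() idea, as an untested sketch (mydf as in the quoted message below): put the window length and the threshold into arguments; every hit is labelled by the first day of its window, and windows containing NA drop out because their sum is NA.

over_threshold <- function(df, k, thr) {
  nr <- which(rowSums(embed(df$hn, k)) >= thr)   # sums over k consecutive days
  df[nr, ]
}
over_threshold(mydf, k = 2, thr = 80)
over_threshold(mydf, k = 4, thr = 100)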

> -Original Message-
> From: R-help  On Behalf Of Stefano Sofia
> Sent: Friday, July 30, 2021 9:24 AM
> To: r-help mailing list 
> Subject: [R] Cumulates of snowfall within a given interval
> 
> Dear R users,
> I have a data frame with daily snow cumulates (these quantities are known
> as "hn" and are expressed in cm), from the 1st of December to the 30th of
> April, for more than twenty years.
> 
> I would need to find days when the sum of a given short interval (I might
> choose two consecutive days, three consecutive days or something like that)
> is higher than a threshold (it might be 80 cm, or 100 cm).
> 
> I am trying with rle, but I really struggle to find an efficient algorithm.
> Could somebody help me with some hints?
> 
> Thank you for your attention and your help
> Stefano
> 
> 
> init_day <- as.POSIXct("2018-02-01", format="%Y-%m-%d", tz="Etc/GMT-1")
> fin_day <- as.POSIXct("2018-02-20", format="%Y-%m-%d", tz="Etc/GMT-1")
> mydf <- data.frame(data_POSIX=seq(init_day, fin_day, by="1 day"))
> mydf$hn <- c(30, 0, 10, 50, NA, 40, 70, 0, 0, 0 , NA, 10, 50, 30, 30, 10, 0, 
> 0, 90, 0)
> 
> - if I choose a threshold of 100 cm in two days, I should get the 6th of
> February;
> - if I choose a threshold of 80 cm in two days I should get the 6th and the 
> 13th
> of February, but not the 19th of February because this is a single day;
> - f I choose a threshold of 100 cm in four days, I should get the 12th of
> February.
> 
>  (oo)
> --oOO--( )--OOo--
> Stefano Sofia PhD
> Civil Protection - Marche Region - Italy
> Meteo Section
> Snow Section
> Via del Colle Ameno 5
> 60126 Torrette di Ancona, Ancona (AN)
> Uff: +39 071 806 7743
> E-mail: stefano.so...@regione.marche.it
> ---Oo-oO
> 
> 
> 
> IMPORTANT NOTICE: This e-mail message is intended to be received only by
> persons entitled to receive the confidential information it may contain. 
> E-mail
> messages to clients of Regione Marche may contain information that is
> confidential and legally privileged. Please do not read, copy, forward, or 
> store
> this message unless you are an intended recipient of it. If you have received
> this message in error, please forward it to the sender and delete it
> completely from your computer system.
> 
> --
> This message was scanned by Libraesva ESG and is believed to be clean.
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

