Dear R Users,

I have started to compile some useful hacks for generating nice descriptive statistics. I hope that these functions & hacks are useful to the wider R community, and that package developers also get some inspiration from the code or from the ideas.


I have also started to review various packages focused on descriptive statistics, although I am still at the very beginning.


### Hacks / Code
- split table headers in 2 rows;
- split results over 2 rows: view.gtsummary(...);
- add abbreviations as footnotes: add.abbrev(...);
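
The idea behind splitting results over 2 rows can be sketched in base R. The helper split_stat() below is hypothetical and is not the code behind view.gtsummary(); it assumes that the statistics look like "median (Q1, Q3)" and that len is a maximum number of characters, which may differ from the meaning of len in the actual function.

```r
# Hypothetical helper (assumption, not the author's view.gtsummary()):
# break "median (Q1, Q3)" style strings before the opening parenthesis
# when they are longer than `len` characters.
split_stat = function(x, len = 8, sep = "\n") {
    # NA entries are left untouched
    isLong = ! is.na(x) & nchar(x) > len
    # only the first " (" is replaced by the separator + "("
    x[isLong] = sub(" \\(", paste0(sep, "("), x[isLong])
    x
}

stats = split_stat(c("22 (19, 27)", "4 (3, 5)"))
cat(stats, sep = "\n---\n")
```

Using sep = "<br>" instead of a newline would be the natural choice when the result is written into an html table cell.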

The results are exported as a web page (using shiny) and can be printed as a pdf document. See the following pdf example:

https://github.com/discoleo/R/blob/master/Stat/Tools.DescriptiveStatistics.Example_1.pdf


### Example
# currently focused on package gtsummary
library(gtsummary)
library(xml2)

mtcars %>%
    # rename2():
    # - see file Tools.Data.R;
    # - behaves in most cases the same as dplyr::rename();
    rename2("HP" = "hp", "Displ" = disp, "Wt (klb)" = "wt", "Rar" = drat) %>%
    # as.factor.df():
    # - see file Tools.Data.R;
    # - encode as (ordered) factor;
    as.factor.df("cyl", "Cyl ") %>%
    # the Descriptive Statistics:
    # (header & header0 are passed to modify_header()
    #  and are not defined in this snippet)
    tbl_summary(by = cyl) %>%
    modify_header(update = header) %>%
    add_p() %>%
    add_overall() %>%
    modify_header(update = header0) %>%
    # Hack: split long statistics !!!
    view.gtsummary(view=FALSE, len=8) %>%
    add.abbrev(
        c("Displ", "HP", "Rar", "Wt (klb)" = "Wt"),
        c("Displacement (in^3)", "Gross horsepower", "Rear axle ratio",
        "Weight (1000 lbs)"));


The required functions are on GitHub:
https://github.com/discoleo/R/blob/master/Stat/Tools.DescriptiveStatistics.R


The functions rename2() & as.factor.df() are only data-helpers and can also be found on GitHub:
https://github.com/discoleo/R/blob/master/Stat/Tools.Data.R
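
A data-helper of this kind can be approximated in base R. The function as_factor_col() below is a hypothetical stand-in and not the actual as.factor.df(); the meaning of its arguments (column name, label prefix) is an assumption based on the call as.factor.df("cyl", "Cyl ") in the example.

```r
# Hypothetical stand-in (assumption, not the code from Tools.Data.R):
# convert one column to a factor and return the whole data frame,
# so that the call can sit inside a pipe.
as_factor_col = function(x, name, prefix = "", ordered = FALSE) {
    vals = sort(unique(x[[name]]))
    x[[name]] = factor(x[[name]], levels = vals,
        labels = paste0(prefix, vals), ordered = ordered)
    x
}

cars2 = as_factor_col(mtcars, "cyl", "Cyl ")
levels(cars2$cyl)
# "Cyl 4" "Cyl 6" "Cyl 8"
```

The original mtcars is not modified, so the numeric values of cyl remain available for further analysis.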


Note:

1.) The function add.abbrev() operates on the generated html-code:

- the functionality is therefore more generic and could easily be used with other packages that export web pages as well;

2.) Split statistics: this is an ugly hack. I plan to redesign the functionality using xml-technologies, but I already have too many side-projects.

3.) as.factor.df(): traditionally, one would create derived data-sets or add a new column with the variable as factor (as the user may need the numeric values for further analysis). But it looked nicer as a single block of code.
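
Regarding note 1, operating on generated html-code can be sketched with package xml2. The function add_abbrev_note() below is hypothetical and is not the implementation of add.abbrev(); it assumes that the html contains a single table and simply appends the abbreviations as a paragraph after it.

```r
library(xml2)

# Hypothetical sketch (assumption, not the author's add.abbrev()):
# append a list of abbreviations as a paragraph after the first table.
add_abbrev_note = function(html, abbrev, label) {
    doc = read_html(html)
    tbl = xml_find_first(doc, "//table")
    note = paste0(abbrev, " = ", label, collapse = "; ")
    # xml2 nodes are modified in place
    xml_add_sibling(tbl, "p", paste0("Abbreviations: ", note), .where = "after")
    doc
}

html = "<table><tr><th>HP</th><th>Wt</th></tr><tr><td>110</td><td>2.6</td></tr></table>"
doc = add_abbrev_note(html,
    c("HP", "Wt"), c("Gross horsepower", "Weight (1000 lbs)"))
cat(as.character(doc))
```

Because the function only needs an html string, the same approach would work with the output of any package that exports tables as web pages.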


Sincerely,


Leonard
