[R] splitting a dataframe in R based on multiple gene names in a specific column

Bogdan Tanasa Tue, 22 Aug 2017 16:57:48 -0700

I would appreciate a suggestion on how to do the following:

I'm working with a data frame in R in which one column can hold
multiple gene names, e.g.:


> df.sample.gene[15:20,2:8]
     Chr     Start       End Ref Alt Func.refGene           Gene.refGene
284 chr2  16080996  16080996   C   T ncRNA_exonic                 GACAT3
448 chr2 113979920 113979920   C   T ncRNA_exonic LINC01191,LOC100499194
465 chr2 131279347 131279347   C   G ncRNA_exonic              LOC440910
525 chr2 223777758 223777758   T   A       exonic                  AP1S3
626 chr3  99794575  99794575   G   A       exonic                 COL8A1
643 chr3 132601066 132601066   A   G       exonic                  ACKR4
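
In case it helps to reproduce the problem, the six rows printed above can be
rebuilt roughly as follows. This is only a sketch: the object name df.example
is hypothetical, the values are copied from the printout, and the column
types (numeric positions, character strings rather than factors) are an
assumption about the real df.sample.gene.

```r
# Hypothetical reconstruction of rows 15:20, columns 2:8 of df.sample.gene.
# Column types are assumed; the real data may use integers or factors.
df.example <- data.frame(
  Chr          = c("chr2", "chr2", "chr2", "chr2", "chr3", "chr3"),
  Start        = c(16080996, 113979920, 131279347, 223777758, 99794575, 132601066),
  End          = c(16080996, 113979920, 131279347, 223777758, 99794575, 132601066),
  Ref          = c("C", "C", "C", "T", "G", "A"),
  Alt          = c("T", "T", "G", "A", "A", "G"),
  Func.refGene = c("ncRNA_exonic", "ncRNA_exonic", "ncRNA_exonic",
                   "exonic", "exonic", "exonic"),
  Gene.refGene = c("GACAT3", "LINC01191,LOC100499194", "LOC440910",
                   "AP1S3", "COL8A1", "ACKR4"),
  row.names    = c("284", "448", "465", "525", "626", "643"),
  stringsAsFactors = FALSE
)

# Rows whose Gene.refGene field holds more than one comma-separated name
multi <- grepl(",", df.example$Gene.refGene, fixed = TRUE)
```

Here only row 448 is flagged in multi, since it is the only one with a comma
in Gene.refGene.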

How could I obtain a data frame in which each row that has multiple gene
names (in the field Gene.refGene) is replicated, once per gene name, so that
every row carries exactly one gene name? I.e.,

for the second row:

  448 chr2 113979920 113979920   C   T ncRNA_exonic LINC01191,LOC100499194

we would get, in the final output (which contains all the rows):

  448 chr2 113979920 113979920   C   T ncRNA_exonic LINC01191
  448 chr2 113979920 113979920   C   T ncRNA_exonic LOC100499194
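
To make the desired result concrete, here is a minimal base-R sketch of one
way it might be done, assuming the gene names are separated by plain commas
with no surrounding spaces and that Gene.refGene is a character column. The
object names df and df.split are hypothetical, and only two of the rows are
used to keep the example short.

```r
# Two example rows copied from the printout (hypothetical object name df)
df <- data.frame(
  Chr          = c("chr2", "chr2"),
  Start        = c(16080996, 113979920),
  End          = c(16080996, 113979920),
  Ref          = c("C", "C"),
  Alt          = c("T", "T"),
  Func.refGene = c("ncRNA_exonic", "ncRNA_exonic"),
  Gene.refGene = c("GACAT3", "LINC01191,LOC100499194"),
  row.names    = c("284", "448"),
  stringsAsFactors = FALSE
)

# Split each Gene.refGene entry on commas; the result is a list with one
# character vector per original row
genes <- strsplit(df$Gene.refGene, ",", fixed = TRUE)

# Number of gene names found in each row
n <- lengths(genes)

# Repeat each row once per gene name, then overwrite the gene column with
# the individual names (unlist() keeps them in row order)
df.split <- df[rep(seq_len(nrow(df)), times = n), ]
df.split$Gene.refGene <- unlist(genes)

print(df.split)
```

Note that R makes the row names of the replicated rows unique, so the two
rows derived from row 448 will not both be labelled 448, and that a row with
an empty Gene.refGene string would be dropped by this approach. The function
separate_rows() in the tidyr package appears to cover the same task, if
loading a package is acceptable.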

Thanks a lot!

-- bogdan


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
