> I'm having difficulties with a dataset containing gene names and the
> positions of those genes.
> Not such a big problem, but each gene has multiple exons, so it's hard to
> say where the gene starts and where it ends. I want the starting and ending
> position of each gene in my dataset.
> Attached is the dataset:
> http://www.nabble.com/file/p21312449/genlistchrompos.csv
> Column 'B' is the gene name, 'G' is the starting position and 'H' is the
> stop position.
> You can load the dataset by using:
> data<-read.csv("genlistchrompos.csv", sep=";")
> I hope someone can help me, it's been giving me headaches for a week now :-((.

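which(diff(as.integer(factor(data$Gene))) != 0)

will give you a vector of the last row of each gene except the final one,
whose last row is nrow(data). (Wrapping in factor() keeps this working when
Gene is read in as character rather than as a factor.) This assumes the rows
for each gene are contiguous. The first row of each gene is then the row
after the previous gene's last row, and the first gene starts at row 1.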
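
As a rough sketch of how those row indices could be turned into one start and
one stop per gene: the toy data frame below stands in for the attached file,
and the column names Gene, Start and Stop are my assumptions rather than the
names in the CSV. It also assumes the exons of each gene are contiguous and
sorted by position.

```r
# Toy data standing in for the poster's file. The column names
# Gene, Start and Stop are assumptions, not taken from the CSV.
genes <- data.frame(
  Gene  = c("A", "A", "A", "B", "B", "C"),
  Start = c(100, 250, 400, 900, 1200, 50),
  Stop  = c(180, 300, 480, 1000, 1300, 90),
  stringsAsFactors = FALSE
)

# Last row of every gene: rows where the gene changes, plus the final row.
last_row <- c(which(diff(as.integer(factor(genes$Gene))) != 0), nrow(genes))

# First row of every gene: row 1, then the row after each previous last row.
first_row <- c(1, head(last_row, -1) + 1)

# Start comes from the first exon of each gene, Stop from the last exon.
gene_span <- data.frame(
  Gene  = genes$Gene[first_row],
  Start = genes$Start[first_row],
  Stop  = genes$Stop[last_row]
)
print(gene_span)
```

If the exons are not sorted within a gene, taking the first and last row is
not enough and a per-gene min() and max() is the safer choice.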

Also take a look at

split(data, data$Gene)

which returns a list with one data frame per gene.
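
A minimal sketch of the split() route, again on made-up data with assumed
column names (Gene, Start, Stop): taking the smallest start and the largest
stop within each piece does not depend on the exons being sorted or even
contiguous in the file.

```r
# Toy data with the exons of gene A deliberately out of order.
# The column names Gene, Start and Stop are assumptions.
genes <- data.frame(
  Gene  = c("A", "A", "A", "B", "B", "C"),
  Start = c(400, 100, 250, 900, 1200, 50),
  Stop  = c(480, 180, 300, 1000, 1300, 90),
  stringsAsFactors = FALSE
)

# One data frame per gene.
by_gene <- split(genes, genes$Gene)

# Collapse each piece to a single row: smallest start, largest stop.
gene_span <- do.call(rbind, lapply(by_gene, function(g) {
  data.frame(Gene = g$Gene[1], Start = min(g$Start), Stop = max(g$Stop))
}))
print(gene_span)
```

The result has one row per gene, in the sorted order of the gene names that
split() uses.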

Regards,
Richie.

Mathematical Sciences Unit
HSL


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.