Hello Davide,
You've probably already seen the command line usage statement, which you
get whenever you run the program with no command line arguments or with
incorrect command line arguments. Just for discussion though, here
it is:
    wigCorrelate - Produce a table that correlates all pairs of wigs.
    usage:
       wigCorrelate one.wig two.wig ... n.wig
    This works on bigWig as well as wig files.
    The output is to stdout
    options:
       -clampMax=N - values larger than this are clipped to this value
It works by finding items that overlap in the different wig files.
Within the overlap it considers each base a separate
observation and calculates Pearson's R based on that. Here are two
relevant snippets of the code from lib/correlate.c:
void correlateNext(struct correlate *c, double x, double y)
/* Add next sample to correlation. */
{
    c->sumX += x;
    c->sumXX += x*x;
    c->sumXY += x*y;
    c->sumY += y;
    c->sumYY += y*y;
    c->n += 1;
}
double correlateResult(struct correlate *c)
/* Returns correlation (aka R) */
{
    double r = 0;
    if (c->n > 0) {
        double sp = c->sumXY - c->sumX*c->sumY/c->n;
        double ssx = c->sumXX - c->sumX*c->sumX/c->n;
        double ssy = c->sumYY - c->sumY*c->sumY/c->n;
        double q = ssx*ssy;
        if (q != 0)
            r = sp/sqrt(q);
    }
    return r;
}
It also has a small optimization: when it knows it has a multiple-base
window where x and y are both constant, it handles that window faster
than the per-base code above. This keeps the base-by-base approach from
getting expensive when it's not really needed.
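One way such a shortcut can work is sketched below: when x and y are constant over a run of bases, each running sum can be advanced by the per-base contribution multiplied by the run length. This is an illustration only. The struct runSums and the helper addRun are hypothetical names, and the actual code may do it differently.

```c
/* Running sums; field names follow the snippets above, layout assumed. */
struct runSums
{
    double sumX, sumY, sumXX, sumYY, sumXY;
    long n;
};

/* Hypothetical helper: add `count` identical (x, y) observations at once.
 * The resulting sums match calling a per-base update `count` times. */
static void addRun(struct runSums *s, double x, double y, long count)
{
    s->sumX += x * count;
    s->sumY += y * count;
    s->sumXX += x * x * count;
    s->sumYY += y * y * count;
    s->sumXY += x * y * count;
    s->n += count;
}
```

With this, a 10,000-base window where both tracks are flat costs one update instead of 10,000, and the final R is unchanged because only the sums and n enter the formula.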
Unfortunately it doesn't do anything particularly sensible with the
regions where there is data in one wig but not another. Such a region is
treated the same as one that wasn't covered by either wig.
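A small sketch of what that means in practice, assuming each wig item is a half-open base range with a single value. The struct sketchItem and both helpers are hypothetical and only illustrate the counting; they are not the real wig data structures.

```c
/* Hypothetical stand-in for one wig item: a half-open base range
 * [start, end) carrying a single value. Not the real wig structures. */
struct sketchItem
{
    int start, end;
    double val;
};

/* Number of bases covered by both items; 0 when they do not overlap. */
static int sharedBases(const struct sketchItem *a, const struct sketchItem *b)
{
    int start = (a->start > b->start) ? a->start : b->start;
    int end = (a->end < b->end) ? a->end : b->end;
    return (end > start) ? end - start : 0;
}

/* Number of bases covered by exactly one of the two items. These add
 * nothing to the sums, just like bases covered by neither item. */
static int ignoredBases(const struct sketchItem *a, const struct sketchItem *b)
{
    int both = sharedBases(a, b);
    return (a->end - a->start - both) + (b->end - b->start - both);
}
```

For an item covering bases 0-10 in one wig and an item covering bases 5-15 in the other, only the 5 shared bases become observations; the other 10 bases are left out rather than being counted as, say, zero in the wig that lacks data.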
Best regards,
Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
On 04/13/11 00:59, Davide Cittaro wrote:
> Hi all,
> I've just realized wigCorrelate tool exists... Would it be possible to have
> more information about it? How correlation is calculated?
>
> d
>
> /*
> Davide Cittaro, PhD
> [email protected]
>
> Center for Genomic Science of IIT@SEMM
> Via Adamello 16, 20139 Milan
> t: +39 02 574303007
> f: +39 02 57489937
>
> http://genomics.iit.it/
> */
>
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome