Hi Vanessa,

Thanks for your explanation. I'm also compiling a TFBS dataset using
ChIP-seq data from the ENCODE project. I have tried to combine the data
from the five major groups, but it's not an easy task for me.
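
To illustrate what "combining" could mean here, below is a minimal sketch 
that pools peak intervals from several groups and merges the ones that 
overlap. It is only an illustration, not a recommended ENCODE procedure: the 
function name merge_peaks and the example coordinates are made up, and 
peaks are assumed to be (chrom, start, end) tuples with BED-style half-open 
coordinates.

```python
from collections import defaultdict


def merge_peaks(peak_sets):
    """Pool peaks from several lists and merge overlapping intervals.

    peak_sets: iterable of lists of (chrom, start, end) tuples.
    Intervals are treated as half-open [start, end), as in BED files,
    so intervals that merely touch (end == next start) are NOT merged.
    """
    by_chrom = defaultdict(list)
    for peaks in peak_sets:
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        current = None  # the interval currently being extended
        for start, end in sorted(by_chrom[chrom]):
            if current is None:
                current = [start, end]
            elif start < current[1]:
                # Overlaps the current interval: extend it if needed.
                current[1] = max(current[1], end)
            else:
                merged.append((chrom, current[0], current[1]))
                current = [start, end]
        if current is not None:
            merged.append((chrom, current[0], current[1]))
    return merged


# Hypothetical peaks from two groups (coordinates are invented).
group_a = [("chr1", 100, 200), ("chr1", 500, 600)]
group_b = [("chr1", 150, 250), ("chr2", 10, 20)]
print(merge_peaks([group_a, group_b]))
```

This treats every peak equally and ignores scores, cell types and the 
transcription factor itself; in practice the peaks would probably need to be 
merged separately for each factor.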

I have found a set of clustered TFBSs from Kent's lab on the website ( 
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeRegTfbsClustered/).

This seems to be exactly what I need. Could you please tell me where I can 
find a detailed description of this dataset, such as how the TFBSs were 
clustered, what the scores in the BED file mean, etc.?
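
For context, in the standard BED format the first three columns are the 
chromosome, start and end, followed by an optional name and an optional 
score (an integer from 0 to 1000). The sketch below reads those columns 
under the assumption that the clustered file follows this standard layout; 
how the score was computed for this dataset is exactly what I am asking 
about. The function name parse_bed_line and the example line are made up 
for illustration.

```python
def parse_bed_line(line):
    """Parse the first five standard BED columns from a tab-separated line.

    Assumes the standard BED layout: chrom, start, end, then optional
    name and score. Any further columns are ignored.
    """
    fields = line.rstrip("\n").split("\t")
    record = {
        "chrom": fields[0],
        "start": int(fields[1]),  # 0-based start
        "end": int(fields[2]),    # end is exclusive
    }
    if len(fields) > 3:
        record["name"] = fields[3]
    if len(fields) > 4:
        record["score"] = int(fields[4])
    return record


# Invented example line, not taken from the actual file.
example = "chr1\t100\t200\tCTCF\t1000\n"
print(parse_bed_line(example))
```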

Thanks,
Shuli

On 04/10/2012 10:30 AM, Vanessa Kirkup Swing wrote:
> Hi Anyuan,
>
> The additional tracks that you are seeing in the table browser are in the
> genome browser and are grouped under the ENC TF Binding Super-track (
> http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&c=chr21&g=wgEncodeTfBindingSuper
> )
>
> To see all the available tracks on the genome browser there is a tool
> called Track Search. You can get to track search from the gateway page (
> http://genome.ucsc.edu/cgi-bin/hgGateway).  Select the assembly you are
> interested in and then click on "track search".
>
> With regard to intersecting data between experiments, we are unable to
> give you advice on that.
>
> From the track description page for the hg19 TFBS Conserved Track (
> http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&c=chr21&g=tfbsConsSite)
> here is what was done that is different from hg18:
>
> "These data were obtained by running the program tfloc (Transcription
> Factor binding site LOCater) on multiz46way alignments, restricting only to
> the July 2007 (mm9) mouse genome assembly, the November 2004 rat assembly
> (rn4), and the February 2009 human genome assembly (hg19). Transcription
> factor information was culled from the Transfac Factor database, version
> 7.0."
>
> Here is what is different for the hg18 TFBS Conserved Track (
> http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr21&g=tfbsConsSites):
>
> "These data were obtained by running the program tfloc (Transcription
> Factor binding site LOCater) on multiz alignments of the February 2006
> (mm8) mouse genome assembly and the November 2004 rat assembly (rn4) to the
> March 2006 human genome assembly (hg18.) Transcription factor information
> was culled from the Transfac Factor database, version 7.0."
>
> These differences would explain why there are larger amounts of data for
> hg19.
>
> I hope that answers your questions. If you have further questions, please
> email the list: [email protected].
>
> Vanessa Kirkup Swing
> UCSC Genome Bioinformatics Group
>
>
> ---------- Forwarded message ----------
> From: 郭安源<[email protected]>
> Date: 2012/4/10
> Subject: [Genome] ask about the CHIP-seq and tfbs data for human hg19
> To: [email protected]
>
>
> Dear Sir/Madam,
>        I am trying to use the human ChIP-seq data on UCSC and I have
> several questions about it.
>        On the hg19 browser page, the only ChIP-seq data in the
> regulation tracks is the "ENCODE regulation tracks", which includes the
> "Txn Factor ChIP" track. However, on the Table browser download page, we
> can find several other ChIP TFBS tracks, such as HAIB TFBS and UTA TFBS.
>       Is this because the Txn Factor ChIP track includes all the data from
> the others? If I need the most comprehensive ChIP data, should I download
> only the Txn Factor ChIP track data, or also the other TFBS data on the
> table browser page?
>       I noticed that one track can have many experiments for the same TF. For
> example, the Nrsf TF in the HAIB TFBS track has the following
> experiments. For these, should I use the intersection of the data to reduce
> false positives for the Nrsf TFBSs?
> wgEncodeHaibTfbsGm12878NrsfPcr2xPkRep1.broadPeak
> wgEncodeHaibTfbsGm12878NrsfPcr2xPkRep2.broadPeak
> wgEncodeHaibTfbsH1hescNrsfV0416102PkRep1.broadPeak
> wgEncodeHaibTfbsH1hescNrsfV0416102PkRep2.broadPeak
> wgEncodeHaibTfbsHelas3NrsfPcr1xPkRep1.broadPeak
> wgEncodeHaibTfbsHelas3NrsfPcr1xPkRep2.broadPeak
> wgEncodeHaibTfbsHepg2NrsfPcr2xPkRep1.broadPeak
> wgEncodeHaibTfbsHepg2NrsfPcr2xPkRep2.broadPeak
> wgEncodeHaibTfbsK562NrsfV0416102PkRep1.broadPeak
> wgEncodeHaibTfbsK562NrsfV0416102PkRep2.broadPeak
> wgEncodeHaibTfbsPanc1NrsfPcr2xPkRep1.broadPeak
> wgEncodeHaibTfbsPanc1NrsfPcr2xPkRep2.broadPeak
> wgEncodeHaibTfbsPfsk1NrsfPcr2xPkRep1.broadPeak
> wgEncodeHaibTfbsPfsk1NrsfPcr2xPkRep2.broadPeak
> wgEncodeHaibTfbsSknshNrsfPcr2xPkRep1.broadPeak
> wgEncodeHaibTfbsSknshNrsfPcr2xPkRep2.broadPeak
> wgEncodeHaibTfbsU87NrsfPcr2xPkRep1.broadPeak
> wgEncodeHaibTfbsU87NrsfPcr2xPkRep2.broadPeak
>
> For the conserved TFBS predictions in the "TFBS Conserved" track, I noticed
> that there is much more data than what I previously downloaded for hg18.
> However, the description page of this track (
> http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=255151677&c=chr21&g=tfbsConsSites)
> seems no different from the hg18 page, which says it uses the TransFac 7.0
> matrices and the same program. If so, why were many more TFBSs predicted? I
> guess you used a new version of the TransFac matrices but didn't update the
> description page, right?
> These data are very important for us. I am looking forward to your reply.
> Thanks very much.
> Best,
> Anyuan Guo
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome