Hi,

I have been trying to map deep-sequencing data to the mouse genome. There is a
~15 nt sequence (about 15,000 copies) that maps in reverse orientation to
the intron region of a gene. I downloaded the intron sequences from the Table
Browser in the UCSC Genome Browser without the repeat-masking option. But when I
select the repeat-masking option, the regions these sequences map to are masked,
suggesting that they might be repeat sequences. However, when I map them to the
mouse repeat database obtained from www.girinst.org and also to
www.repeatmasker.org, that particular sequence is not reported as a repeat. It
is, however, reported as a repeat in the Hydra genome. Can someone suggest what
I am missing here? Why does the UCSC browser consider that sequence a repeat
when the others do not?

Thanks in advance.
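
To make the comparison concrete, here is a minimal Python sketch of how one
might check which bases of a downloaded intron sequence are soft-masked. It
assumes the sequence was fetched with repeats masked to lower case (UCSC
describes its soft-masking as coming from RepeatMasker and Tandem Repeats
Finder output). The function names are illustrative and are not part of any
UCSC tool.

```python
from itertools import groupby


def lowercase_fraction(seq):
    """Return the fraction of letters in seq that are lowercase (soft-masked)."""
    letters = [c for c in seq if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.islower() for c in letters) / len(letters)


def masked_runs(seq):
    """Return 0-based, half-open (start, end) runs of lowercase bases."""
    runs = []
    pos = 0
    for is_lower, group in groupby(seq, key=str.islower):
        length = sum(1 for _ in group)
        if is_lower:
            runs.append((pos, pos + length))
        pos += length
    return runs


if __name__ == "__main__":
    # Hypothetical sequence, for illustration only.
    example = "ACGTacgtACgt"
    print(lowercase_fraction(example))
    print(masked_runs(example))
```

Comparing the reported runs with the coordinates where the ~15 nt reads map
would show whether the reads fall entirely inside a masked stretch.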
________________________________________
From: [email protected] [[email protected]] On 
Behalf Of [email protected] [[email protected]]
Sent: Wednesday, June 02, 2010 12:16 PM
To: [email protected]
Subject: Genome Digest, Vol 89, Issue 3

Send Genome mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        https://lists.soe.ucsc.edu/mailman/listinfo/genome
or, via email, send a message with subject or body 'help' to
        [email protected]

You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Genome digest..."


Today's Topics:

   1. Re: mask coding region (Jennifer Jackson)
   2. Re: custom track (Jennifer Jackson)
   3. Re: [Genome-mirror] Masked Genome Strand (Jennifer Jackson)
   4. Re: [Genome-mirror] PDF file to Power Point (Maximilian Haussler)
   5. cpg island locations across the genome (Carlo Colantuoni)
   6. Kent source tree (quinn)
   7. ensGene and ucscToEnsembl (Oliver Lui)
   8. MySql error (kunchaparty,Shanti)
   9. Custom Track Question (Shashikant Pujar)


----------------------------------------------------------------------

Message: 1
Date: Tue, 01 Jun 2010 12:50:47 -0700
From: Jennifer Jackson <[email protected]>
Subject: Re: [Genome] mask coding region
To: Vera Pendino <[email protected]>
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Update:

One of our scientific engineers reminded me that we have tools in the
source tree to help with masking:
http://genomewiki.cse.ucsc.edu/index.php/Kent_source_utilities

* maskOutFa (takes .bed file w/coords)
* twoBitMask

Download the source here:
http://hgdownload.cse.ucsc.edu/downloads.html
     scroll down to "Source Downloads"  ->
     "UCSC Genome Browser source download"

These utilities are not in the set of pre-compiled utilities on the
downloads server, so you will need to follow the instructions in the
READMEs and linked help documents.

Perhaps this will help you avoid having to create your own tool(s)
for the masking step, if you decide to try this method.
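
For readers who do end up scripting the masking step, the idea can be
sketched in a few lines of Python. This is only an illustration of masking
BED intervals in a sequence held in memory, not how maskOutFa or twoBitMask
are implemented. The function names are hypothetical, and it relies on BED
coordinates being 0-based and half-open.

```python
def parse_bed(lines, chrom):
    """Collect (start, end) intervals for one chromosome from BED lines."""
    intervals = []
    for line in lines:
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if len(fields) < 3 or line.startswith("#"):
            continue
        if fields[0] in ("track", "browser"):
            continue
        if fields[0] == chrom:
            intervals.append((int(fields[1]), int(fields[2])))
    return intervals


def mask_intervals(seq, intervals, hard=False):
    """Mask 0-based, half-open intervals: lowercase (soft) or N (hard)."""
    out = list(seq)
    for start, end in intervals:
        # Clip each interval to the sequence bounds.
        for i in range(max(start, 0), min(end, len(out))):
            out[i] = "N" if hard else out[i].lower()
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical CDS coordinates and sequence, for illustration only.
    bed = ["chr1\t2\t5\tcds1", "chr2\t0\t3\tcds2"]
    print(mask_intervals("ACGTACGT", parse_bed(bed, "chr1"), hard=True))
```

Hard masking (N) is probably the relevant mode when the goal is to keep BLAT
from aligning to coding regions; soft masking only changes the letter case.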

Best wishes,
Jen

On 6/1/10 9:57 AM, Jennifer Jackson wrote:
> Hello Vera,
>
> Yes, this is possible, but you will need to obtain the reference genome
> sequence, coordinates that you want to mask, do the masking, then run
> BLAT on your own server against the newly created file.
>
> (using hg19 as an example in links, if using hg18, swap in that database
> for the links).
>
> FTP sequence:
> http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/
>
> Obtain CDS coordinates:
> Use Table browser or Downloads server, a Gene Prediction track (UCSC
> Genes, RefSeq Genes, CCDS, etc.), and output or ftp the CDS coordinates.
>
>       Table browser (good even if using FTP to learn table
>       names/fields. See track descriptions to review methods
>       and select proper dataset for your purposes).
>       http://genome.ucsc.edu/cgi-bin/hgTables
>       http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
>
>       Ftp complete files (representing mySQL tables):
>       http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/
>
>
> BLAT:
> http://genome.ucsc.edu/FAQ/FAQblat.html
>
> Hopefully this will help you to get started, please let us know if you
> need more help,
>
> Jennifer
>
> ---------------------------------
> Jennifer Jackson
> UCSC Genome Informatics Group
> http://genome.ucsc.edu/
>
> On 6/1/10 8:58 AM, Vera Pendino wrote:
>> Hi,
>> I would like to run BLAT with a short sequence on the regions that are
>> annotated as intronic, intergenic, and UTRs in the human genome (hg19).
>> In other words, I'd like to know if it is possible to mask the coding 
>> regions of the genome.
>> Could you help me?
>> thank you
>>
>> Vera
>>
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome


------------------------------

Message: 2
Date: Tue, 01 Jun 2010 13:30:34 -0700
From: Jennifer Jackson <[email protected]>
Subject: Re: [Genome] custom track
To: Dorit Shweiki <[email protected]>
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hello Dorit,

We are sorry, but bedGraph format does not allow a per-item identifier.
It is positional data only.

Some help:
http://genome.ucsc.edu/goldenPath/help/bedgraph.html
http://genomewiki.ucsc.edu/index.php/Selecting_a_graphing_track_data_format
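
Since the bedGraph lines themselves cannot carry a name, one workaround
outside the browser is to look up the gene for each bedGraph interval from
the BED track by coordinate overlap. The following Python sketch shows the
idea using coordinates from the quoted example below; the function name is
hypothetical and this is not a UCSC utility.

```python
def names_for_interval(chrom, start, end, genes):
    """Return names of BED items that overlap a 0-based, half-open interval.

    genes is a list of (chrom, start, end, name) tuples from the BED track.
    """
    return [
        name
        for g_chrom, g_start, g_end, name in genes
        if g_chrom == chrom and g_start < end and start < g_end
    ]


if __name__ == "__main__":
    # BED items taken from the quoted custom track.
    genes = [
        ("chr2", 1511924, 2110200, "LOC719197"),
        ("chr2", 3039794, 3574082, "LOC722879"),
        ("chr2", 10025405, 10345631, "LOC719312"),
        ("chr2", 11199466, 11252352, "LOC719328"),
    ]
    # First bedGraph interval from the quoted custom track.
    print(names_for_interval("chr2", 11199466, 11252352, genes))
```

The linear scan is adequate for a handful of items; a larger track would
call for sorting or an interval index.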

Thanks,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 6/1/10 3:27 AM, Dorit Shweiki wrote:
>
>
> Hello,
>
>
>
> I created a file which contains 2 tracks
>
> The first BED shows genes and their position
>
> The second BEDGraph shows expression level.
>
> How can I add the gene name or geneid to the second track - where do I
> put it?
>
>
>
>
>
> browser position chr2:1-189,746,636
>
> browser hide all
>
> track name="Grade_A_position" description="rhesus grade A probes
> position by gene" visibility=1
>
> chr2       1511924                2110200                LOC719197
> 500         +
>
> chr2       3039794                3574082                LOC722879
> 500         -
>
> chr2       10025405             10345631             LOC719312
> 500         +
>
> chr2       11199466             11252352             LOC719328
> 500         -
>
> track type=bedGraph name="Dev_express" description="up and down gene
> expressed in ESCs" visibility=full color=200,100,0 altColor=0,100,200
> priority=20
>
> chr2       11199466             11252352             -0.694189
>
> chr2       12369959             12379310             -0.179144
>
> chr2       12384400             12394747             0.055239
>
>
>
>
>
>
>
> Thank you in advance
>
> Best regards
>
> Dorit
>
>
>
>
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome


------------------------------

Message: 3
Date: Tue, 01 Jun 2010 13:53:05 -0700
From: Jennifer Jackson <[email protected]>
Subject: Re: [Genome] [Genome-mirror] Masked Genome Strand
To: [email protected]
Cc: [email protected], [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hello,

Yes, the reference genome fasta sequence represents the forward (+) strand.

For "all fasta sequences" this is not true. Those that represent
annotation (such as transcripts, example: RefSeq) can be from either
strand. This type of fasta sequence represents a transcript in the
direction of transcription (5'->3'). In most cases, the primary table of
the source track related to the transcript fasta sequence has the
reference genome alignment coordinates (including strand).
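
As a small illustration of the strand convention, the Python sketch below
converts a forward-strand genomic sequence into transcript orientation given
the strand from the alignment table. The function names are hypothetical;
letter case is preserved so that soft-masking survives the conversion.

```python
# Translation table for complementing bases while preserving letter case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def transcript_oriented(genomic_seq, strand):
    """Return forward-strand sequence in 5'->3' transcript orientation."""
    if strand == "+":
        return genomic_seq
    if strand == "-":
        return reverse_complement(genomic_seq)
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    # Hypothetical sequence, for illustration only.
    print(transcript_oriented("AACGt", "-"))
```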

Hopefully this helps,
Thanks,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 5/31/10 2:19 PM, [email protected] wrote:
>
> Hi,
> I have downloaded the masked genome as a ref genome to align our reads. I am
> just wondering about the sequences in the masked.fa fasta files: which strand
> is that? Are all the fasta sequences in the forward strand? Please let me know.
>
> thanks
> -dafil
>
> _______________________________________________
> Genome-mirror mailing list
> [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome-mirror


------------------------------

Message: 4
Date: Wed, 2 Jun 2010 10:22:12 +0100
From: Maximilian Haussler <[email protected]>
Subject: Re: [Genome] [Genome-mirror] PDF file to Power Point
To: Mariaestela Ortiz <[email protected]>
Cc: genome <[email protected]>
Message-ID:
        <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1

Hi Maria,

When I prepared figures for an article, I used Adobe Illustrator (or
Inkscape, which is free software) and imported the PDF as line (vector)
graphics, then played around with the image until I was happy with it. The
advantage is that you can remove parts of the line drawing, move them around,
increase font sizes, etc. At the end you can copy and paste it into PowerPoint
as vector graphics, so the resolution should be very good.

hope that helps
Max


On Sun, May 30, 2010 at 9:03 PM, Mariaestela Ortiz <[email protected]> wrote:

> Hello There, I would like to copy and paste various linear gene maps that
> depict the exons (blue color) into a Power Point Slide. I am preparing a
> talk for a conference. Please send me any hints or help on how to do this.
> I tried via PDF but the resolution is very low.
>
> Any help would be much appreciated.
>
> All the best,
>
> Maria
> _______________________________________________
> Genome-mirror mailing list
> [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome-mirror
>


------------------------------

Message: 5
Date: Tue, 1 Jun 2010 20:58:28 -0400
From: "Carlo Colantuoni" <[email protected]>
Subject: [Genome] cpg island locations across the genome
To: <[email protected]>
Cc: 'Carlo Colantuoni' <[email protected]>
Message-ID: <006301cb01ee$b733a350$259ae9...@com>
Content-Type: text/plain;       charset="us-ascii"

hi there,



i am wondering if ucsc genome browser has a database of mapped cpg islands
across the genome (such as a track in the browser that I could download)? i
am most interested in rat, but would like to look at human and mouse too. i
am interested in downloading all the locations, not just searching one gene
for cpg islands.



thanks,

carlo



------------------------------

Message: 6
Date: Tue, 1 Jun 2010 23:31:17 -0400
From: quinn <[email protected]>
Subject: [Genome] Kent source tree
To: [email protected]
Message-ID:
        <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1

Hi UCSC help group,

I am trying to convert axt files to MAF files with the axtToMaf program, but I
can't find where I can download axtToMaf. There is a link from the mailing
list, but it's too old and doesn't work anymore. Could you tell me where I
can download axtToMaf and other kent sources? Any help will be highly
appreciated!

Best,
Quinn


------------------------------

Message: 7
Date: Wed, 2 Jun 2010 10:10:24 +0000
From: Oliver Lui <[email protected]>
Subject: [Genome] ensGene and ucscToEnsembl
To: <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"


Hi there

I tried to download the ensGene table in the GRCh37/ hg19 database in GTF 
format, but the gene ids and the transcript ids are always the same, i.e. start 
with "ENST". I think the gene ids should start with "ENSG"?

Also, I've downloaded the latest human GTF file (Homo_sapiens.GRCh37.58.gtf) 
from the Ensembl website, but I couldn't find the corresponding ucsc ids for 
some of the Ensembl ids (at the bottom of the file), e.g. LRG_15, from the 
ucscToEnsembl table. Any suggestion about what I could do?

Thanks!

Regards
Oliver


------------------------------

Message: 8
Date: Tue, 1 Jun 2010 16:45:43 -0500
From: "kunchaparty,Shanti" <[email protected]>
Subject: [Genome] MySql error
To: "[email protected]" <[email protected]>
Message-ID:
        
<2ff752fb5881994783fae68e5ff64700236109a...@dcpwvmbxc1vs2.mdanderson.edu>

Content-Type: text/plain; charset="us-ascii"

We are unable to connect to the mysql server using the following command:
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A

Is the server down? Thanks

Regards
Shanti
------------------------------------------
Shanti Kunchaparty, PhD
Scientific Application Specialist
Research IS and Technology
Phone: 713-792-1863
MD Anderson Cancer Center
* Please consider the environment before printing this e-mail





------------------------------

Message: 9
Date: Wed, 2 Jun 2010 10:03:34 -0400
From: Shashikant Pujar <[email protected]>
Subject: [Genome] Custom Track Question
To: "[email protected]" <[email protected]>
Message-ID:
        <29456825ba12254791bfe870d2b28b501d7c646...@mbxc.exchange.cornell.edu>
Content-Type: text/plain; charset="us-ascii"

Hi

I have loaded a custom track (Illumina NGS reads in BAM format of a 2Mb region) 
on the UCSC Dog Genome Browser.  Is there a way I can extract all SNPs and 
Indels between the custom track and canFam2?

Thanks

Shashi Pujar


------------------------------

_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome


End of Genome Digest, Vol 89, Issue 3
*************************************
