Hi Fabricia,

To create a merged interval that spans the 7k upstream flank interval, the original interval, and the 7k downstream flank interval, do the following:

Starting with the two files you already have:

1 - original intervals (extracted from blat hits)
2 - flank results from the query:
"Get Flanks"
"Region:" Whole feature
"Location of the flanking region/s:" Both
"Offset" 0
"Length of the flanking region(s):" 7000

Combine both datasets into a single dataset using the tool "Operate on Genomic Intervals -> Concatenate", with "Both datasets are same filetype?" checked.

On that result file, merge the intervals using the tool "Operate on Genomic Intervals -> Merge".

If your original blat hits have any overlap, or the flanks you are generating have any overlap with any of your other intervals (original or other flanks), then this is probably not going to give you the results you want, because the merge would also join those neighboring intervals into one.
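As a rough illustration of the Concatenate and Merge steps above, here is a minimal Python sketch. It is not the Galaxy tool's code; the function name `merge_intervals` is made up for this example, and it assumes that intervals which touch end-to-start are merged, which is what the steps above rely on. The coordinates are the ones from the example in this thread.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (chrom, start, end) intervals.

    Hypothetical helper for illustration only, not the Galaxy Merge tool.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        # Same chromosome and starting at or before the previous end: join them.
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


original = [("chr14", 64969084, 64969603)]
flanks = [("chr14", 64962084, 64969084), ("chr14", 64969603, 64976603)]

combined = original + flanks          # the "Concatenate" step
result = merge_intervals(combined)    # the "Merge" step
print(result)
```

The same logic also shows the caveat: if a second hit (or its flank) touches or overlaps any of these intervals, it is absorbed into the same merged interval.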

In that case, it may be simpler to modify the coordinates directly using "Text manipulation" tools. Specifically, run "Compute an expression on every row" twice, once with the expression "c2 - 7000" and once with "c3 + 7000" (this subtracts 7000 from the start and adds 7000 to the end). Then use "Cut" to recreate the interval file using the new values as start and end.
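The coordinate arithmetic in this alternative can be sketched in Python as follows. The function name `extend_interval` is hypothetical, and clamping the new start at 0 is an extra assumption added here (the "c2 - 7000" expression on its own could go negative for hits near the start of a chromosome). The end is not clamped because chromosome lengths are not part of this example.

```python
def extend_interval(chrom, start, end, pad=7000):
    """Return the interval widened by `pad` bases on each side.

    Hypothetical helper: mirrors "c2 - 7000" and "c3 + 7000",
    with the start clamped at 0 (an assumption, not in the original steps).
    """
    new_start = max(0, start - pad)   # c2 - 7000
    new_end = end + pad               # c3 + 7000
    return chrom, new_start, new_end


hit = ("chr14", 64969084, 64969603)
extended = extend_interval(*hit)
print("\t".join(str(field) for field in extended))
```

Because each row is handled independently, overlapping hits stay as separate rows instead of being merged.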

Hopefully one of these will work for you.

Jen
Galaxy team


On 5/3/12 6:38 AM, Fabricia Nascimento wrote:
Hi Jen,

Thanks a lot for your reply, but I think you misunderstood my question. I
will reformulate it with examples.

Initially (because I am only doing a preliminary analysis) I have 70
blat hits corresponding to different coordinates in the pig genome.
What I would like to have is the flanking region in both directions
around each of these blat hits. I am not working with genes (or introns and exons).

For example:

Imagine that this symbol ########## corresponds to my blat hit
and this symbol -------------------- corresponds to flanking regions

I have initially ##########
and I would like to obtain -------------------- ########## --------------------

In numbers:

I have: chr14   64969084        64969603

I would like to have: chr14   64962084        64976603
(This corresponds to the first coordinate minus 7000 and the last
coordinate plus 7000)

What I got using "Get Flanks" and using the parameters
"Region:" Whole feature
"Location of the flanking region/s:" Both
"Offset" 0
"Length of the flanking region(s):" 7000

chr14   64962084        64969084
chr14   64969603        64976603


Is there a way of merging the above coordinates to come up with what I need?

Thanks a lot for your help,

All the best,
Fabricia.


------------------------------------------------------------------------
*From:* Jennifer Jackson <j...@bx.psu.edu>
*To:* Fabricia Nascimento <nasciment...@yahoo.com.br>
*Cc:* "galaxy-user@lists.bx.psu.edu" <galaxy-user@lists.bx.psu.edu>
*Sent:* Thursday, May 3, 2012 4:25
*Subject:* Re: [galaxy-user] Get flanks (version 1.0.0)

Hello Fabricia,

You are probably running the tool like this, correct? This lumps the
upstream flank and downstream flank ends to create one interval:

"Region:" Whole feature
"Location of the flanking region/s:" Both
"Offset" 0
"Length of the flanking region(s):" 7000

Instead, run the tool twice to extract the upstream and downstream
regions into distinct intervals:

Run 1
"Region:" Whole feature
"Location of the flanking region/s:" Upstream
"Offset" 0
"Length of the flanking region(s):" 7000

Run 2
"Region:" Whole feature
"Location of the flanking region/s:" Downstream
"Offset" 0
"Length of the flanking region(s):" 7000
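For reference, the coordinates these two runs would be expected to produce can be sketched in Python. This is only an approximation of the arithmetic, not the Get Flanks implementation: the function name `get_flanks` is hypothetical, and it assumes a plus-strand feature, "Whole feature", an offset of 0, and a start clamped at 0.

```python
def get_flanks(chrom, start, end, length=7000):
    """Return (upstream, downstream) flank intervals for one feature.

    Hypothetical sketch assuming a plus-strand feature and offset 0.
    """
    upstream = (chrom, max(0, start - length), start)   # Run 1
    downstream = (chrom, end, end + length)             # Run 2
    return upstream, downstream


upstream, downstream = get_flanks("chr14", 64969084, 64969603)
print(upstream)
print(downstream)
```

For a minus-strand feature the two roles would swap, so strand information matters if the input has a strand column.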

If your question has been misunderstood, please let us know,

Best,

Jen
Galaxy team


On 5/2/12 5:51 PM, Fabricia Nascimento wrote:
 > Hi,
 >
 > I am very new to genomic data analysis and I need to get some upstream
 > and downstream of some chromosome regions of the pig genome. I have
 > about 70 blat hits of a query of ca 100aa. I need to get 7000
 > nucleotides both upstream and downstream of this 100aa region.
 > I have tried to use Get flanks to get the "new" coordinates... but
 > instead of generating coordinates which would correspond to about 14000
 > nucleotides, it generates one coordinate for the upstream region and
 > then another one for the downstream region.
 > Is there a way of doing what I need using Galaxy?
 >
 > I would appreciate any help!
 >
 > Thanks a lot!
 >
 > All the best,
 > Fabricia.
 >
 >
 > ___________________________________________________________
 > The Galaxy User list should be used for the discussion of
 > Galaxy analysis and other features on the public server
 > at usegalaxy.org. Please keep all replies on the list by
 > using "reply all" in your mail client. For discussion of
 > local Galaxy instances and the Galaxy source code, please
 > use the Galaxy Development list:
 >
 > http://lists.bx.psu.edu/listinfo/galaxy-dev
 >
 > To manage your subscriptions to this and other Galaxy lists,
 > please use the interface at:
 >
 > http://lists.bx.psu.edu/

--
Jennifer Jackson
http://galaxyproject.org


