Re: [galaxy-user] hg19 and hg19patch2

2011-06-10 Thread Church, Deanna (NIH/NLM/NCBI) [E]
The patches are just representing alternate paths (not all are truly
haplotypic). Some of these represent corrections to the underlying
chromosome assembly: basically, regions where the chromosome tiling path
is wrong. We release the fixes ahead of the next build to make them
accessible to folks.

Deanna


On 6/10/11 8:43 AM, "Will McLaren" wrote:

>Hi David,
>
>You can find information about the assemblies here:
>
>http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/index.shtml
>
>The patches so far have just included extra regions representing
>alternative haplotype regions (e.g. MHC).
>
>Ensembl 62 was released on patch 3:
>
>http://www.ensembl.org/Homo_sapiens/Info/Index
>
>If your data uses only the reference chromosomes then you should
>have no issues using hg19 or any of the patches released so far.
>
>Cheers
>
>Will McLaren
>Ensembl Variation
>
>On 10 June 2011 12:21, David Matthews wrote:
>> Dear Galaxy-users,
>> Does anyone know what the differences are between hg19 and hg19patch2
>> and can anyone tell me if the latest ensembl gtf file (v62) is
>> definitely compatible with both hg19 and hg19patch2?
>>
>>
>> Best Wishes,
>> David.
>> __
>> Dr David A. Matthews
>> Senior Lecturer in Virology
>> Room E49
>> Department of Cellular and Molecular Medicine,
>> School of Medical Sciences
>> University Walk,
>> University of Bristol
>> Bristol.
>> BS8 1TD
>> U.K.
>> Tel. +44 117 3312058
>> Fax. +44 117 3312091
>> d.a.matth...@bristol.ac.uk
>>
>>
>>
>>
>>
>> On 10 Jun 2011, at 11:39, Michal Stuglik wrote:
>>
>>
>> Hi Jen,
>>
>> It works, thanks!
>>
>> I am wondering why, when using the Text Manipulation/Compute function,
>> galaxy changes brackets '[' to '__ob__' and ']' to '__cb__', so for
>> this: str(c1)[1:2] --> str(c1)__ob__1:2__cb__
>>
>> thanks a lot,
>> michal
>>
>> Hi Michal,
>>
>> The tool "Fetch Sequences -> Extract Genomic DNA" can be used to extract
>> fasta sequences. The coordinates can be BED, GTF, etc. and the "genome"
>> doesn't necessarily have to be an actual genome, just a fasta file in
>> your history.
>>
>> To subset a data string, the tool "Text Manipulation -> Trim" might be
>> helpful. This would only work if you want to use the same rules for an
>> entire file (or split your file up and run the tool on those subfiles
>> using different rules). Practical for some cases, but not all.
>>
>> And the final option is for coordinate data - tools in "Operate on
>> Genomic Intervals". Once you have the final coordinate set, going back
>> and using the "Fetch Sequences" tool can capture the associated result
>> fasta sequence, from a native genome or a fasta file in your history,
>> as described above.
>>
>> Hopefully this gives you an option that will work for your project,
>>
>> Best,
>>
>> Jen
>> Galaxy team
>>
>> On 6/5/11 7:14 AM, Michal Stuglik wrote:
>>
>> Hi all,
>>
>> I am wondering if galaxy has tool to substring/extract sequence/text
>> from another sequence/text based on coordinates in columns (start, end
>> column) or how to do it in Text Manipulation/Compute?
>>
>> all the best,
>> michal
>>
>>
>> ___
>> The Galaxy User list should be used for the discussion of
>> Galaxy analysis and other features on the public server
>> at usegalaxy.org.  Please keep all replies on the list by
>> using "reply all" in your mail client.  For discussion of
>> local Galaxy instances and the Galaxy source code, please
>> use the Galaxy Development list:
>>
>>  http://lists.bx.psu.edu/listinfo/galaxy-dev
>>
>> To manage your subscriptions to this and other Galaxy lists,
>> please use the interface at:
>>
>>  http://lists.bx.psu.edu/
>>

Re: [galaxy-user] hg19 and hg19patch2

2011-06-10 Thread Francis Ouellette
Not sure about the compatibility of the Ensembl GTF files, but the
differences between the various patches are represented here:

http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/index.shtml

The human genome reference is now at Patch 4 (which was news to me until
I went to this page, thx!).

f.

--
B.F. Francis Ouellette http://oicr.on.ca/research/ouellette/



On 2011-06-10, at 8:21 AM, David Matthews wrote:

Dear Galaxy-users,

Does anyone know what the differences are between hg19 and hg19patch2 and can 
anyone tell me if the latest ensembl gtf file (v62) is definitely compatible 
with both hg19 and hg19patch2?


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 10 Jun 2011, at 11:39, Michal Stuglik wrote:



Hi Jen,

It works, thanks!

I am wondering why, when using the Text Manipulation/Compute function, galaxy
changes brackets '[' to '__ob__' and ']' to '__cb__', so for this:
str(c1)[1:2] --> str(c1)__ob__1:2__cb__

thanks a lot,
michal
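
For reference, the substitution described above looks like the expression text being sanitized before it is evaluated. A minimal sketch of just the two replacements reported in this message, and how they would be reversed, is given here. The function names are hypothetical and this is not Galaxy's own code; only the '[' and ']' tokens come from the message.

```python
# Only the two character/token pairs reported in the message above.
# Any other tokens Galaxy may use are deliberately not assumed here.
TOKENS = {"[": "__ob__", "]": "__cb__"}


def encode_brackets(text):
    """Replace each bracket with its token (hypothetical helper)."""
    for char, token in TOKENS.items():
        text = text.replace(char, token)
    return text


def decode_brackets(text):
    """Turn the tokens back into the original brackets (hypothetical helper)."""
    for char, token in TOKENS.items():
        text = text.replace(token, char)
    return text


expr = "str(c1)[1:2]"
encoded = encode_brackets(expr)
print(encoded)                   # str(c1)__ob__1:2__cb__
print(decode_brackets(encoded))  # str(c1)[1:2]
```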

Hi Michal,

The tool "Fetch Sequences -> Extract Genomic DNA" can be used to extract fasta 
sequences. The coordinates can be BED, GTF, etc. and the "genome" doesn't 
necessarily have to be an actual genome, just a fasta file in your history.

To subset a data string, the tool "Text Manipulation -> Trim" might be helpful. 
This would only work if you want to use the same rules for an entire file (or 
split your file up and run the tool on those subfiles using different rules). 
Practical for some cases, but not all.

And the final option is for coordinate data - tools in "Operate on Genomic 
Intervals". Once you have the final coordinate set, going back and using the 
"Fetch Sequences" tool can capture the associated result fasta sequence, from a 
native genome or a fasta file in your history, as described above.
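
Outside Galaxy, the same per-row substring extraction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is tab-separated with the text in the first column and start/end in the next two, the coordinates are 0-based with an exclusive end (as in BED), and the function name and example rows are made up.

```python
def extract_by_coords(rows, seq_col=0, start_col=1, end_col=2):
    """Yield text[start:end] for each tab-separated row.

    Assumes 0-based start and exclusive end, the BED convention.
    Column positions are parameters because the layout is an assumption.
    """
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        start = int(fields[start_col])
        end = int(fields[end_col])
        yield fields[seq_col][start:end]


# Made-up example rows: sequence, start, end
rows = ["ACGTACGT\t2\t6", "TTGACC\t0\t3"]
print(list(extract_by_coords(rows)))  # ['GTAC', 'TTG']
```

If the coordinates in the file are 1-based and inclusive instead, subtract one from the start before slicing.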

Hopefully this gives you an option that will work for your project,

Best,

Jen
Galaxy team

On 6/5/11 7:14 AM, Michal Stuglik wrote:

Hi all,

I am wondering if galaxy has tool to substring/extract sequence/text
from another sequence/text based on coordinates in columns (start, end
column) or how to do it in Text Manipulation/Compute?

all the best,
michal



Re: [galaxy-user] hg19 and hg19patch2

2011-06-10 Thread Will McLaren
Hi David,

You can find information about the assemblies here:

http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/index.shtml

The patches so far have just included extra regions representing
alternative haplotype regions (e.g. MHC).

Ensembl 62 was released on patch 3:

http://www.ensembl.org/Homo_sapiens/Info/Index

If your data uses only the reference chromosomes then you should
have no issues using hg19 or any of the patches released so far.
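
One rough way to check whether a GTF file touches anything beyond the reference chromosomes is to list the sequence names in its first column. The sketch below assumes Ensembl-style names ("1" to "22", "X", "Y", "MT") for the primary set; the function name and the two example records are made up for illustration.

```python
import io

# Assumption: Ensembl-style names for the primary human chromosomes.
# UCSC-style files use a "chr" prefix, so adjust this set for those.
PRIMARY = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}


def non_primary_seqnames(gtf_lines, primary=PRIMARY):
    """Return the set of sequence names that are not primary chromosomes."""
    extra = set()
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        seqname = line.split("\t", 1)[0]
        if seqname not in primary:
            extra.add(seqname)
    return extra


# Two made-up records: one on chromosome 1, one on a non-primary region.
example = io.StringIO(
    '1\tprotein_coding\texon\t11869\t12227\t.\t+\t.\tgene_id "G1";\n'
    'HSCHR6_MHC_COX\tprotein_coding\texon\t100\t200\t.\t+\t.\tgene_id "G2";\n'
)
print(sorted(non_primary_seqnames(example)))  # ['HSCHR6_MHC_COX']
```

An empty result would suggest the annotation only refers to the reference chromosomes; anything else listed is worth a closer look before mixing builds.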

Cheers

Will McLaren
Ensembl Variation

On 10 June 2011 12:21, David Matthews wrote:
> Dear Galaxy-users,
> Does anyone know what the differences are between hg19 and hg19patch2 and
> can anyone tell me if the latest ensembl gtf file (v62) is definitely
> compatible with both hg19 and hg19patch2?
>
>
> Best Wishes,
> David.
> __
> Dr David A. Matthews
> Senior Lecturer in Virology
> Room E49
> Department of Cellular and Molecular Medicine,
> School of Medical Sciences
> University Walk,
> University of Bristol
> Bristol.
> BS8 1TD
> U.K.
> Tel. +44 117 3312058
> Fax. +44 117 3312091
> d.a.matth...@bristol.ac.uk
>
>
>
>
>
> On 10 Jun 2011, at 11:39, Michal Stuglik wrote:
>
>
> Hi Jen,
>
> It works, thanks!
>
> I am wondering why, when using the Text Manipulation/Compute function, galaxy
> changes brackets '[' to '__ob__' and ']' to '__cb__', so for this:
> str(c1)[1:2] --> str(c1)__ob__1:2__cb__
>
> thanks a lot,
> michal
>
> Hi Michal,
>
> The tool "Fetch Sequences -> Extract Genomic DNA" can be used to extract
> fasta sequences. The coordinates can be BED, GTF, etc. and the "genome"
> doesn't necessarily have to be an actual genome, just a fasta file in your
> history.
>
> To subset a data string, the tool "Text Manipulation -> Trim" might be
> helpful. This would only work if you want to use the same rules for an
> entire file (or split your file up and run the tool on those subfiles using
> different rules). Practical for some cases, but not all.
>
> And the final option is for coordinate data - tools in "Operate on Genomic
> Intervals". Once you have the final coordinate set, going back and using the
> "Fetch Sequences" tool can capture the associated result fasta sequence,
> from a native genome or a fasta file in your history, as described above.
>
> Hopefully this gives you an option that will work for your project,
>
> Best,
>
> Jen
> Galaxy team
>
> On 6/5/11 7:14 AM, Michal Stuglik wrote:
>
> Hi all,
>
> I am wondering if galaxy has tool to substring/extract sequence/text
> from another sequence/text based on coordinates in columns (start, end
> column) or how to do it in Text Manipulation/Compute?
>
> all the best,
> michal
>
>