I ran this query (through both martview and martservice):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query virtualSchemaName = "default" Header = "1" count = "" softwareVersion = "0.5" >
<Dataset name = "hsapiens_gene_vega" interface = "default" >
<Attribute name = "transcript_stable_id" />
<Attribute name = "gene_stable_id" />
<Filter name = "chr_name" value = "Y"/>
<Attribute name = "str_chrom_name" />
<Attribute name = "exon_stable_id" />
<Attribute name = "exon_chrom_start" />
<Attribute name = "exon_chrom_end" />
<Attribute name = "exon_chrom_strand" />
<Attribute name = "gene_exon" />
<Filter name = "downstream_flank" value = "300"/>
<Filter name = "upstream_flank" value = "300"/>
</Dataset>
</Query>
When I export TSV, this all works just fine. When I use
formatter="FASTA", though, I get sequences from chromosome Y AND X!
Using MT as the chromosome I get no results, which I think is correct.
It also appears that if I put chrom_start=1300000 and chrom_end=1500000
into the file, I get only Y's (even though there are genes in those
positions on the X).
Any thoughts?
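
For anyone trying to reproduce this, here is a minimal sketch that builds the TSV and FASTA variants of the query above from one set of filters and attributes, so the two differ only in the formatter. It assumes that formatter is an attribute of the Query element and that chrom_start and chrom_end are ordinary Filter names; neither is confirmed in this thread. The build_query helper is hypothetical, and sending the result to martservice is left out.

```python
import xml.etree.ElementTree as ET

# Attribute names copied from the query in this thread.
ATTRIBUTES = [
    "transcript_stable_id", "gene_stable_id", "str_chrom_name",
    "exon_stable_id", "exon_chrom_start", "exon_chrom_end",
    "exon_chrom_strand", "gene_exon",
]


def build_query(filters, attributes, formatter=None):
    """Return a query document as a string (hypothetical helper).

    Assumption: 'formatter' is set on the Query element.
    """
    query_attrs = {
        "virtualSchemaName": "default",
        "Header": "1",
        "count": "",
        "softwareVersion": "0.5",
    }
    if formatter is not None:
        query_attrs["formatter"] = formatter
    query = ET.Element("Query", query_attrs)
    dataset = ET.SubElement(
        query, "Dataset", {"name": "hsapiens_gene_vega", "interface": "default"}
    )
    for name, value in filters.items():
        ET.SubElement(dataset, "Filter", {"name": name, "value": str(value)})
    for name in attributes:
        ET.SubElement(dataset, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n' + body


# Filters from the original query.
base_filters = {"chr_name": "Y", "downstream_flank": 300, "upstream_flank": 300}
# Variant with the region restriction (filter names are an assumption).
region_filters = dict(base_filters, chrom_start=1300000, chrom_end=1500000)

tsv_query = build_query(base_filters, ATTRIBUTES)
fasta_query = build_query(base_filters, ATTRIBUTES, formatter="FASTA")
fasta_region_query = build_query(region_filters, ATTRIBUTES, formatter="FASTA")
```

Generating both variants from the same inputs rules out a typo in one hand-edited file as the cause of the X sequences.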
The additional sequences from X are likely to come from a
pseudoautosomal region (PAR), which is duplicated between X and Y, so
the data should actually be correct. It is a little odd, though, that
they show up with FASTA but not with TSV. We'll look into this and get
back to you.
a.
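
To make the PAR explanation concrete, here is a rough sketch of an overlap check between a queried range and a PAR interval. The PAR1 boundary used here is only an approximation (PAR1 covers roughly the first 2.6 Mb of the short arms of X and Y); exact coordinates depend on the assembly and should be looked up rather than taken from this example. The function and constant names are hypothetical.

```python
# Approximate, illustrative PAR1 interval; NOT exact assembly coordinates.
PAR1_APPROX = (1, 2_600_000)


def overlaps(start, end, region_start, region_end):
    """True if the closed intervals [start, end] and [region_start, region_end] overlap."""
    return start <= region_end and region_start <= end


def in_par(chrom, start, end, par_regions):
    """True if a feature on X or Y overlaps any of the given PAR intervals."""
    if chrom not in ("X", "Y"):
        return False
    return any(overlaps(start, end, s, e) for s, e in par_regions)


# The range from the question (1300000-1500000) falls inside the approximate PAR1.
query_in_par = in_par("Y", 1_300_000, 1_500_000, [PAR1_APPROX])
```

Under this approximation the 1300000-1500000 range lies within PAR1, which fits the idea that genes there are shared between X and Y.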
-Amir Karger
------------------------------------------------------------------------
Arek Kasprzyk
EMBL-European Bioinformatics Institute.
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
Tel: +44-(0)1223-494606
Fax: +44-(0)1223-494468
------------------------------------------------------------------------