Hi

I am not concerned whether I use BioMart 0.7 or 0.8 - whichever is easiest for what I would like to do. I haven't done anything yet and I'm starting from scratch.

All I want to do is have a go at recreating the Ensembl mart from the Ensembl core databases. I want to do this because Ensembl is an example of a database whose schema I am familiar with and whose mart I have used. I have three reasons:
a) to get some practice
b) to get an intuition for what type of mart I can create from my own database schema, what types of query I can run, and what the filters/attributes will be
c) to get an idea of how I could integrate my database with Ensembl, as I believe they only need to share IDs or an underlying assembly to be integrated

Will I be able to recreate the Ensembl mart in BioMart 0.8? I presume the Ensembl XML files are available for 0.7 and I won't be able to read them in 0.8? Without these files, how will I know the exact steps Ensembl used to specify their mart structure? How will I know what main tables they chose or how, for example, they created the PRINTS dimension table mentioned in my original query?

Thanks a lot


On 12/03/2011 18:59, Arek Kasprzyk wrote:
Putting this back on the list to keep everyone else in the loop

a



On 2011-03-12, at 13:56, "Arek Kasprzyk"<[email protected]>
wrote:

If you are starting from scratch it would be much better to start with
0.8 rc5. Creating a new mart is as simple as choosing one or more main
tables in the source schema. You can choose different tables and
create different datasets. There is some documentation about it in
rc5. If you want to know how the transformation algorithm works, I can
describe that to you too.


a



On 2011-03-12, at 12:53, "Andrea Edwards"<[email protected]>
wrote:

OK - thanks

I don't know much about BioMart, as you can probably tell, but I was
told there are quite significant differences between 0.7 and 0.8.
If I am interested in understanding how the schema transformations
take place so that I can design my own mart and integrate it with
existing marts, would I be better off dropping back to 0.7? I'm keen to
get a mart up and running very soon.

On 12/03/2011 17:41, Arek Kasprzyk wrote:
0.8 rc5 still has only rudimentary support for the MBuilder
component. You will not be able to read 0.7 MBuilder XML with it.
(CCing Junjun, who has just taken over the coordination of BioMart
development, to let him know that such discussions are taking place.)

a



On 2011-03-12, at 12:28, "Andrea Edwards"<[email protected]>
wrote:

Brilliant - thanks for such a prompt reply.

I note that you say MBuilder (0.7), whereas I have checked out the
code for BioMart 0.8 rc4.


On 12/03/2011 16:39, Arek Kasprzyk wrote:
Hi Andrea
All the transformation information is stored in the XML file that
MBuilder (0.7) uses to compile its DDL for the Ensembl core databases.
I am sure the Ensembl mart team will be happy to provide you with the
latest version.

a



On 2011-03-12, at 11:15, "Andrea Edwards"<[email protected]>
wrote:

Hello

I was wondering if there are any documents showing how the Ensembl
marts were created from the main Ensembl databases. Specifically, I was
hoping there were documents describing which tables were selected as
main tables for the marts and how the dimension tables were mapped to
the main tables.

As an example, ensembl_mart_61 contains a main table for human named
translation_main (this is an abbreviation of the name, but it's obvious
which one I mean), and this has a field called
protein_feature_prints_bool, which is essentially a boolean field
indicating whether a protein translation is associated with a row in
the PRINTS dimension table protein_feature_prints_dm. If the
translation does have a row in this dimension table, then I am guessing
it has a PRINTS domain in it!

The core database itself, however, has a table called translation, which
represents, well, a translation. Translations are linked to rows in a
table called 'protein_feature', which in turn has a foreign key called
analysis_id that links to an 'analysis' table with fields 'database'
and 'program'. So in this schema, a translation is associated with a
PRINTS annotation if it is linked to a 'protein_feature' record which is
in turn linked to an 'analysis' record with the text 'PRINTS' somewhere
in the database and/or program fields.
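
To make the relationship above concrete, here is a rough sketch in Python
with an in-memory SQLite database. It is only an illustration of the join
as I understand it, not how MartBuilder actually derives the column. The
translation_id column on protein_feature, the toy rows, and the function
names demo_connection and prints_flags are all assumptions made up for
the example.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory stand-in for the three core tables (toy data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript('''
        CREATE TABLE translation (translation_id INTEGER PRIMARY KEY);
        CREATE TABLE analysis (
            analysis_id INTEGER PRIMARY KEY,
            "database"  TEXT,   -- field name as described in the text
            program     TEXT
        );
        CREATE TABLE protein_feature (
            protein_feature_id INTEGER PRIMARY KEY,
            translation_id     INTEGER,  -- assumed link to translation
            analysis_id        INTEGER   -- foreign key to analysis
        );
        INSERT INTO translation VALUES (1), (2), (3), (4);
        INSERT INTO analysis VALUES
            (10, 'PRINTS', 'FingerPRINTScan'),
            (11, 'Pfam',   'hmmpfam'),
            (12, '',       'FingerPRINTScan');
        INSERT INTO protein_feature VALUES
            (100, 1, 10),
            (101, 2, 11),
            (102, 3, 12);
    ''')
    return conn


def prints_flags(conn):
    """Map each translation_id to True if it has any PRINTS-related feature."""
    rows = conn.execute('''
        SELECT t.translation_id,
               EXISTS (
                   SELECT 1
                   FROM protein_feature pf
                   JOIN analysis a ON a.analysis_id = pf.analysis_id
                   WHERE pf.translation_id = t.translation_id
                     AND (UPPER(a."database") LIKE '%PRINTS%'
                          OR UPPER(a.program) LIKE '%PRINTS%')
               ) AS protein_feature_prints_bool
        FROM translation t
        ORDER BY t.translation_id
    ''').fetchall()
    return {tid: bool(flag) for tid, flag in rows}


if __name__ == "__main__":
    print(prints_flags(demo_connection()))
```

The EXISTS subquery plays the role of the boolean column: one row per
translation, with a flag saying whether any matching feature row exists.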

I am interested in how the BioMart software is configured with 'rules'
to create the mart schema from the database schema. Is there a
configuration file containing these rules that I could look at? Is
there a worked example? As an academic exercise I'd like to recreate
the Ensembl marts. I have the BioMart user manual, but even with that
document I do not know how to recreate the Ensembl marts.

I am NOT specifically interested in protein domains. I used the PRINTS
example purely for illustrative purposes as I thought it was a
straightforward example. I am interested in how you specify the 'rules'
to get from a schema to a mart.

thanks a lot

_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users
