Hi Andrea,

BioMart has a built-in algorithm for transforming a 3NF schema into a mart schema (reverse star); we'll be publishing this algorithm in our paper, to be submitted shortly. The system requires user input to tell it which tables should be used as main tables, but the rest is done automatically. The system will correctly transform any schema complying with 3NF into a reverse star.
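As a rough illustration of what "choosing a main table" could mean, here is a toy sketch. It is not the BioMart algorithm, which is not described in this thread. It assumes a simple rule: tables the main table points to are folded into it, and tables that point to the main table become dimension tables. The `ForeignKey` class, the `plan_reverse_star` function and all table names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKey:
    child: str   # table that holds the foreign-key column
    parent: str  # table that the key points to


def plan_reverse_star(tables, foreign_keys, main):
    """Toy planner (an assumption, not BioMart's algorithm).

    Tables the main table points to are folded into it; tables that
    point to the main table become dimension tables keyed on it.
    """
    if main not in tables:
        raise ValueError(f"unknown main table: {main}")
    merged = sorted(fk.parent for fk in foreign_keys if fk.child == main)
    dimensions = sorted(fk.child for fk in foreign_keys if fk.parent == main)
    return {"main": main, "merged_into_main": merged, "dimensions": dimensions}


# Hypothetical 3NF schema: every name below is made up for illustration.
tables = {"gene", "chromosome", "transcript", "xref"}
foreign_keys = [
    ForeignKey(child="gene", parent="chromosome"),
    ForeignKey(child="transcript", parent="gene"),
    ForeignKey(child="xref", parent="gene"),
]

plan = plan_reverse_star(tables, foreign_keys, main="gene")
print(plan)
```

The sketch only looks at tables directly linked to the main table; a real transformation would also have to follow longer chains of foreign keys and handle more than one main table.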
The virtual mart is definitely quicker to set up, but for large datasets it will be slower to query than its materialized counterpart. The virtual mart option is recommended for quick prototyping or for small datasets. The materialized mart, just like a materialized view in a relational database, offers the benefits of query optimization. You can create a virtual mart from a remote server, but in order to materialize it, both the source schema and the materialized mart have to be on the same server (the materialization process relies on DDL statements that involve both the source and the materialized schema and need to be executed on the same server).

Hope this helps,
a

Arek Kasprzyk
Director, Bioinformatics Operations and Principal Investigator
Ontario Institute for Cancer Research
MaRS Centre, South Tower
101 College Street, Suite 800
Toronto, Ontario, Canada M5G 0A3
Tel: 416-673-8559
Toll-free: 1-866-678-6427
www.oicr.on.ca

From: Andrea Edwards <[email protected]>
Date: Fri, 4 Mar 2011 14:34:02 -0500
To: Joachim Baran <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: [BioMart Users] installing biomart and memory requirements

Thanks for your reply. I have a few more questions, which will probably be obvious once I've made my own mart but for now are puzzling me.

How does the system 'know' how to rewrite the schema?
For example, how does it know which tables to use as the central 'fact' tables (I have seen them called focus tables in an old EnsMart paper)? I'm wondering how it is possible for any database schema to be compatible. This might be obvious once you have seen it done. I'd like to be in a position where, if a new database is published, I can be sure I can add it to my existing mart regardless of its schema. I appreciate this might not have a simple answer, but a concrete example would be really useful if possible.

Is a physical mart quicker than a virtual mart? I presume that is the benefit of materializing over not materializing.

Do I have to have a virtual mart if I use a machine on a remote server, or can materialization get the data from the remote database?

If it does do all of these things then I'll definitely be using it!

On 04/03/2011 16:52, Joachim Baran wrote:

Hey!

I CC the mailing list here, so other people can benefit from the conversation too. Hope that is alright.

On 11-03-04 11:15 AM, "Andrea Edwards" <[email protected]> wrote:

> I believe biomart is capable of producing one query-optimized system from this data. Is this correct?

Yes. The query-optimised system is generated when you select a source, right-click, and then select 'Materialize'. This will take your local data-sources and rewrite them for query optimisation. You do not have to do that though -- you can run the system with your databases as they are. This is what we call a 'Virtual Mart'.

> Will there be one database that incorporates all this data on my machine at the end of it?

If you materialise your databases, all the local data-sources will be put in one database.

> Do all the databases that I wish to incorporate have to be on the same machine? I'm guessing not if it's a federated data model.

You can use pointed attributes to incorporate data from other machines. For example, you could mash up your data with Ensembl's marts if you like.
> What happens if the schema changes for one of the databases? Do I have to rebuild the whole lot?

Depends. You need to run 'Update' on the data-source that has changed (again, right-click on the data-source). By doing so, MartConfigurator will pick up on changes in the schema, such as added/deleted columns in your database. If you run a virtual mart, just hit save and re-deploy the mart. If you run a materialised mart, materialise it again and you are ready to go.

> I also believe biomart has tools for automatically generating a web interface too.

Yes. In fact, when you hit deploy in MartConfigurator, a Jetty server will launch in the background, and after a few seconds a browser window will pop up pointing to your deployed mart. The web interface comes "for free". No need to configure anything, really.

> I believe queries on this interface will automatically generate perl and java code to query the resource directly

You got Perl + Java in BioMart 0.7. In BioMart 0.8 you can generate Java from queries, or XML that can be used to run automated queries via our RESTful interface, and there will be a SPARQL interface soon.

Perhaps you are more familiar with the old BioMart 0.7 marts. You can have a look at the new BioMart 0.8 here: http://dcc.icgc.org/

Joachim

_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users
