Re: [BioMart Users] Problem building 3NF virtual mart

Thomas TRIPLET Wed, 05 Oct 2011 13:05:53 -0700

Hi Arek,

Thanks for your message, that's definitely very helpful. I still have a
question about the target database as I am trying to think in the longer
term. At some point, as more data comes in, I will probably have to
materialize the virtual mart as recommended in the documentation, so that
queries are executed in a reasonable amount of time. I'm not there yet, but
at that point, it will be the target database that will be used (since it is
the one that is materialized). So my initial question regarding the target
schema still holds for longer term perspectives. If MartConfigurator can't
control how the target schema is generated, is it possible to manually edit
the underlying registry file so that the source and target schema match?
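
For reference, here is a minimal sketch of the distinction discussed in this thread: querying the normalized source directly (the "virtual" case) versus querying a materialized, flattened copy. It is not BioMart code and does not reproduce what MartConfigurator generates. The tables (article, person, authorship), the flat table name literature_flat, and the use of SQLite instead of PostgreSQL are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical join over a 3NF source: two entity tables and one link table.
JOIN_SQL = """
    SELECT a.article_id, a.title, p.person_id, p.name
    FROM article a
    JOIN authorship ap ON ap.article_id = a.article_id
    JOIN person p ON p.person_id = ap.person_id
"""


def build_source(conn):
    """Create and populate a tiny normalized (3NF) source schema."""
    conn.executescript("""
        CREATE TABLE article (article_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE person  (person_id  INTEGER PRIMARY KEY, name  TEXT NOT NULL);
        -- Link table for the M:N relation; it carries two M:1 foreign keys.
        CREATE TABLE authorship (
            article_id INTEGER NOT NULL REFERENCES article(article_id),
            person_id  INTEGER NOT NULL REFERENCES person(person_id),
            PRIMARY KEY (article_id, person_id)
        );
        INSERT INTO article VALUES (1, 'Paper A'), (2, 'Paper B');
        INSERT INTO person  VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO authorship VALUES (1, 1), (1, 2), (2, 1);
    """)


def query_virtual(conn):
    """Virtual case: run the join against the source tables on every query."""
    return conn.execute(
        JOIN_SQL + " ORDER BY a.article_id, p.person_id"
    ).fetchall()


def materialize(conn):
    """Materialized case: store the join result once as a flat table."""
    conn.execute("DROP TABLE IF EXISTS literature_flat")
    conn.execute("CREATE TABLE literature_flat AS " + JOIN_SQL)


def query_materialized(conn):
    """Read the precomputed flat table; no joins are needed at query time."""
    return conn.execute(
        "SELECT article_id, title, person_id, name "
        "FROM literature_flat ORDER BY article_id, person_id"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_source(conn)
    virtual_rows = query_virtual(conn)
    materialize(conn)
    print(virtual_rows == query_materialized(conn))
```

Both paths return the same rows; the materialized table trades storage and a rebuild step for cheaper reads, which is the performance argument for materializing once the data grows.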


Also, this might sound a bit stupid, but how do I activate the logs for the
web UI? I've set DEBUG to ON in *dist/biomart.properties*, but the
*dist/logs* folder doesn't get populated (I've restarted the servers).

Thanks!
Thomas

--
Thomas Triplet, Ph.D.
http://www.thomastriplet.net

Centre for Structural and Functional Genomics
Concordia University
7141 West Sherbrooke St
Montreal QC H4B 1R6





On Wed, Oct 5, 2011 at 1:45 PM, Arek Kasprzyk <[email protected]> wrote:

> Hi Thomas,
> sorry, I did not make myself clear. The GUI gets configured as if it were
> talking to the target schema. However, the QueryCompiler queries the source,
> not the target. The target simply does not physically exist at this point
> unless you materialize it. If you want to make sure that the source is being
> queried, you can switch the logging on to see the actual SQL that is being
> compiled. As far as the GUI is concerned, you can use MartConfigurator to give
> it any shape or form that you require. It does not have to look like an image
> of the target schema. This is just a default behaviour to get you started.
>
> a
>
>
> On Wed, Oct 5, 2011 at 1:25 PM, Thomas TRIPLET <[email protected]> wrote:
>
>> Hi Arek,
>>
>> I tried to add only the tables from my previous example and started the
>> servers without materializing them.
>> It seems that the QueryCompiler does *not* ignore the target schema (cf
>> screenshot: http://ttriplet.fg.concordia.ca/literatureTrial.png).
>>
>> As far as I can tell with such a simple example, querying works in the
>> sense that data is retrieved, but the forms and filters that must be
>> populated for querying are neither intuitive nor practical.
>>
>> Also, possibly related: when I import the 4 tables
>> together, 2 data sources are created (one called *literature*, one
>> *person*). I noticed this usually happens when the FKs in the source
>> schema are not properly defined, but this is not the case here.
>>
>> Also, the log gave me the following when I imported the tables:
>>
>> org.biomart.common.exceptions.ValidationException: You can have at most one each of 1:M and M:1 subclass relations from the same table, but no more.
>>   at org.biomart.objects.objects.Relation.setSubclassRelation(Relation.java:447)
>>   at org.biomart.configurator.controller.MartController.continueSubclassing(MartController.java:1562)
>>   at org.biomart.configurator.controller.MartController.suggestMarts(MartController.java:1367)
>>   at org.biomart.configurator.controller.MartController.requestCreateMartsFromSource(MartController.java:1599)
>>   at org.biomart.configurator.controller.ObjectController.initMarts(ObjectController.java:98)
>>   at org.biomart.configurator.view.component.container.SourceGroupPanel$1.construct(SourceGroupPanel.java:85)
>>   at org.biomart.common.view.gui.SwingWorker$2.run(SwingWorker.java:128)
>>   at java.lang.Thread.run(Thread.java:679)
>>
>> So it looks like the problem is that one table has two 1:M relations
>> (probably the table used to define the M:N relation between articles and
>> authors).
>> Thanks a lot for your help.
>>
>> On Wed, Oct 5, 2011 at 12:31 PM, Arek Kasprzyk <[email protected]> wrote:
>>
>>> Hi Thomas,
>>> What happens if you launch the server and query it? If you do not
>>> materialize, the QueryCompiler will ignore the target schema and translate
>>> SQL against the source instead.
>>>
>>> Could you try querying it to see if you are getting the results back?
>>>
>>> a
>>>
>>>
>>>
>>> On Wed, Oct 5, 2011 at 11:51 AM, Thomas TRIPLET <[email protected]> wrote:
>>>
>>>> Hi Arek,
>>>>
>>>> This is because my source schema is normalized. When added as an RDBMS
>>>> data source in BioMart, it is processed to build the target schema. The
>>>> problem is that the target schema that is generated is denormalized and
>>>> unusable. Here are 2 screenshots to illustrate the issue:
>>>>
>>>>    - http://ttriplet.fg.concordia.ca/sourceDB.png
>>>>    - http://ttriplet.fg.concordia.ca/targetDB.png
>>>>
>>>> I hope this clarifies my problem.
>>>>
>>>> Thanks
>>>> Thomas
>>>>
>>>> On Wed, Oct 5, 2011 at 11:39 AM, Arek Kasprzyk <[email protected]> wrote:
>>>>
>>>>> Hi Thomas,
>>>>> you lost me here. Why would you want to use the same schema for source
>>>>> and target? Could you elaborate on this please?
>>>>>
>>>>> a
>>>>>
>>>>> On Wed, Oct 5, 2011 at 11:24 AM, Thomas TRIPLET <[email protected]> wrote:
>>>>>
>>>>>> Syed,
>>>>>> Thanks for your reply. Sorry I wasn't clear enough: I am trying to use
>>>>>> the data source as it is, not materializing it as a mart. But when I add
>>>>>> the data source as a Postgres RDBMS, MartConfigurator *automatically* uses
>>>>>> the source schema to build the target schema. Is it possible to disable
>>>>>> this feature and force MartConfigurator to use the same schema for the
>>>>>> target database as for the source database?
>>>>>> Thanks
>>>>>> Thomas
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 5, 2011 at 9:07 AM, Syed Haider <[email protected]> wrote:
>>>>>>
>>>>>>>
>>>>>>> Thomas,
>>>>>>>
>>>>>>> Converting your source database into a mart is not mandatory; it's
>>>>>>> only recommended for performance reasons. Try using the source schema and
>>>>>>> create a dataset straight out of your source database (without
>>>>>>> materialising to a mart), and the subsequent querying and web interfaces
>>>>>>> should work just fine (at least in theory).
>>>>>>>
>>>>>>> W.r.t. your second question, MartConfigurator is not able to read GFF
>>>>>>> or other text file formats. The input to MartConfigurator should be either
>>>>>>> an existing BioMart web server endpoint or a database.
>>>>>>>
>>>>>>> HTH,
>>>>>>> Syed
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 04/10/2011 22:03, Thomas TRIPLET wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I'm trying to build a mart (v0.8rc6) from a normalized (3NF)
>>>>>>>> PostgreSQL database. So far, it looks really great, except for one
>>>>>>>> issue. When I look at the source database schema (using
>>>>>>>> MartConfigurator), it is just fine, so the database seems properly
>>>>>>>> imported. However, the corresponding target database really messes
>>>>>>>> things up as it tries to merge everything together. Here is a simple
>>>>>>>> example to illustrate the problem:
>>>>>>>> [two inline images, not preserved in the archive]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The target database is clearly problematic, and not usable as such.
>>>>>>>> Is there a way to force BioMart to keep the source schema untouched,
>>>>>>>> or at least to control how it is processed? All relations in the source
>>>>>>>> database are defined as [1:M], so I'm not sure what's wrong here.
>>>>>>>>
>>>>>>>> Also, it is my understanding that a data source must exist in the form
>>>>>>>> of a database in order to be imported (whether it is actually accessed
>>>>>>>> via RDBMS/URL/registry). Or can it also be defined as raw data files
>>>>>>>> (e.g. GFF files) along with a parser? Could you please confirm this?
>>>>>>>>
>>>>>>>> Thanks a lot for your help.
>>>>>>>>
>>>>>>>> Thomas
>>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> [email protected]
>>>>>> https://lists.biomart.org/mailman/listinfo/users
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
