just replied to it :) On Tue, Sep 6, 2011 at 5:50 PM, Junjun Zhang <[email protected]>wrote:
> Hi Arek, > > Can you please elaborate more on the follow point mentioned in your > previous response? > > "There is one more situation involving inlist that has not > been solved either ei annotating external identifiers that effectively is > an inner join as well and has been brought to our attention before." > > Thanks, > Junjun > > > From: Arek Kasprzyk <[email protected]> > Date: Tue, 6 Sep 2011 13:29:01 -0400 > To: jzhang <[email protected]> > Cc: "[email protected]" <[email protected]>, "[email protected]" < > [email protected]> > > Subject: Re: [BioMart Users] OUTER JOIN between tables > > Junjun: > my three cents ;) > > 1. I would definitely NOT expose end users to understand the querying logic > for inter mart queries. > 2. left join with inter mart querying needs to be solved with in list > annotation at the same time > 3. The 'on the fly' left join was tested two years ago when i asked for the > query compiler to be extended so we know well how it performs :) > > a. > > > > > > > On Tue, Sep 6, 2011 at 12:53 PM, Junjun Zhang <[email protected]>wrote: > >> Hi Nuria, >> >> Your use case is well understood. As left-join is a more typical scenario >> for inter-mart queries, we are considering to make left-join the default >> behaviour in the next release. Technically this is not hard to implement, >> however, we'd like to be very careful about making this type of decision as >> it has global impact to the whole querying system. In this case, we will >> need to provide means to enable deployers (maybe end users as well) to alter >> the default join behaviour from the web GUI and the query XML. >> >> For join with a dataset between main table and dimension table, we can >> continue to support precomputed 'left join' as Arek mentioned below. We can >> also make it an optional for performing 'left join' on the fly. The only >> concern is that under certain situations the latter may have some >> performance disadvantages. >> >> We will keep you informed on this. >> >> Hope this helps! >> >> Best regards, >> Junjun >> >> >> From: Arek Kasprzyk <[email protected]> >> Date: Mon, 5 Sep 2011 09:55:42 -0400 >> To: "[email protected]" <[email protected]> >> Cc: "[email protected]" <[email protected]> >> Subject: Re: [BioMart Users] OUTER JOIN between tables >> >> Hi Nuria, >> this has been a well know problem (or a feature depends how you look at it >> :)) of BioMart from it's beginnings. BioMart 0.7 performs an inner join only >> between tables within single dataset. The join between marts depends on data >> overlap between them and can behave either as an outer join when there is >> 100% overlap or as inner join if there is no full overlap (more typical >> scenario). This is of course not an ideal situation because it looks >> somewhat non-deterministic for the administrators. >> >> The mechanism of both joins (dataset and mart) is different. While the >> former is a simple SQL inner join, the latter is an 'inlist' that is an >> equivalent to multiple 'or' statements. The effect of this implementation >> was that all dimension tables in a dataset had to be a precomputed 'left >> join' tables ei tables with multiple NULLs which is not ideal for large >> tables with sparse records wasting a lot of space. >> >> BioMart 0.8 has inherited most solutions from 0.7. However it is supposed >> to be more flexible and perform an inner or outer join within the dataset >> depending on the administrator's settings that should available through >> MartConfigurator and also automatically set based on the initial MBuilder >> components settings. I do remember personally instructing one of the >> developers to include this feature in the QueryCompiler and it was there as >> a simple switch at some point but i doubth it that it ever made it to >> MConfigurator or MBuilder. >> >> The situation with the join between marts in 0.8 is identical to that of >> 0.7 (ei there are no provisions for the outer join). However, your email >> alerted me to this fact and this is definitely something that we need to >> think about. There is one more situation involving inlist that has not been >> solved either ei annotating external identifiers that effectively is an >> inner join as well and has been brought to our attention before. Technically >> the solution for both should be the same. This is definitely doable and >> something worth scheduling in the near future. >> >> >> Junjun: please check where we are with the DATASET outer join >> implementation. >> >> >> cheers, >> a >> >> >> >> On Mon, Sep 5, 2011 at 7:29 AM, <[email protected]> wrote: >> >>> Dear BioMart users,**** >>> >>> ** ** >>> >>> We are setting up a local database powered by BioMart inorder to annotate >>> our local data with the data of other marts. The problem is that the type of >>> join that BioMart is performing between tables seems to be an inner join >>> since BioMart solely outputs all the records that match in the two marts >>> linked. My question is, if this is true, are you planning to add outer joins >>> in order to get in addition all the unmatched results? In our case, if we >>> want to annotate our database with another mart data, we would perform a >>> left join to get all the matched records + all the unmatched records from >>> our database. And in case you are planning to implement this, when do you >>> predict that this implementation will be accessible for us, the BioMart >>> users?**** >>> >>> ** ** >>> >>> This type of outputting the query results, i.e. to output both the >>> matched and unmatched records from our database, is of highly interest in >>> order to annotate BUT without losing information from the query because of >>> the annotation process. Many thanks in advance.**** >>> >>> ** ** >>> >>> Please, do not hesitate to ask me for more information. **** >>> >>> ** ** >>> >>> Yours faithfully,**** >>> >>> ** ** >>> >>> NĂºria**** >>> >>> ** ** >>> >>> ** ** >>> >>> _______________________________________________ >>> Users mailing list >>> [email protected] >>> https://lists.biomart.org/mailman/listinfo/users >>> >>> >> >
_______________________________________________ Users mailing list [email protected] https://lists.biomart.org/mailman/listinfo/users
