just replied to it :)

On Tue, Sep 6, 2011 at 5:50 PM, Junjun Zhang <[email protected]>wrote:

> Hi Arek,
>
> Can you please elaborate more on the follow point mentioned in your
> previous response?
>
> "There is one more situation involving inlist that has not
> been solved either ei annotating external identifiers that effectively is
> an inner join as well and has been brought to our attention before."
>
> Thanks,
> Junjun
>
>
> From: Arek Kasprzyk <[email protected]>
> Date: Tue, 6 Sep 2011 13:29:01 -0400
> To: jzhang <[email protected]>
> Cc: "[email protected]" <[email protected]>, "[email protected]" <
> [email protected]>
>
> Subject: Re: [BioMart Users] OUTER JOIN between tables
>
> Junjun:
> my three cents ;)
>
> 1. I would definitely NOT expose end users to understand the querying logic
> for inter mart queries.
> 2. left join with inter mart querying needs to be solved with in list
> annotation at the same time
> 3. The 'on the fly' left join was tested two years ago when i asked for the
> query compiler to be extended so we know well how it performs :)
>
> a.
>
>
>
>
>
>
> On Tue, Sep 6, 2011 at 12:53 PM, Junjun Zhang <[email protected]>wrote:
>
>> Hi Nuria,
>>
>> Your use case is well understood. As left-join is a more typical scenario
>> for inter-mart queries, we are considering to make left-join the default
>> behaviour in the next release. Technically this is not hard to implement,
>> however, we'd like to be very careful about making this type of decision as
>> it has global impact to the whole querying system. In this case, we will
>> need to provide means to enable deployers (maybe end users as well) to alter
>> the default join behaviour from the web GUI and the query XML.
>>
>> For join with a dataset between main table and dimension table, we can
>> continue to support precomputed 'left join' as Arek mentioned below. We can
>> also make it an optional for performing 'left join' on the fly. The only
>> concern is that under certain situations the latter may have some
>> performance disadvantages.
>>
>> We will keep you informed on this.
>>
>> Hope this helps!
>>
>> Best regards,
>> Junjun
>>
>>
>> From: Arek Kasprzyk <[email protected]>
>> Date: Mon, 5 Sep 2011 09:55:42 -0400
>> To: "[email protected]" <[email protected]>
>> Cc: "[email protected]" <[email protected]>
>> Subject: Re: [BioMart Users] OUTER JOIN between tables
>>
>> Hi Nuria,
>> this has been a well know problem (or a feature depends how you look at it
>> :)) of BioMart from it's beginnings. BioMart 0.7 performs an inner join only
>> between tables within single dataset. The join between marts depends on data
>> overlap between them and can behave either as an outer join when there is
>> 100% overlap or as inner join if there is no full overlap (more typical
>> scenario). This is of course not an ideal situation because it looks
>> somewhat non-deterministic for the administrators.
>>
>> The mechanism of both joins (dataset and mart) is different. While the
>> former is a simple SQL inner join, the latter is an 'inlist' that is an
>> equivalent to multiple 'or' statements. The effect of this implementation
>> was that all dimension tables in a dataset had to be a precomputed 'left
>> join' tables ei tables with multiple NULLs which is not ideal for large
>> tables with sparse records wasting a lot of space.
>>
>> BioMart 0.8 has inherited most solutions from 0.7. However it is supposed
>> to be more flexible and perform an inner or outer join within the dataset
>> depending on the administrator's settings that should available through
>> MartConfigurator and also automatically set based on the initial MBuilder
>> components settings. I do remember personally instructing one of the
>> developers to include this feature in the QueryCompiler and it was there as
>> a simple switch at some point but i doubth it that it ever made it to
>> MConfigurator or MBuilder.
>>
>> The situation with the join between marts in 0.8 is identical to that of
>> 0.7 (ei there are no provisions for the outer join). However, your email
>> alerted me to this fact and this is definitely something that we need to
>> think about. There is one more situation involving inlist that has not been
>> solved either ei annotating external identifiers that effectively is an
>> inner join as well and has been brought to our attention before. Technically
>> the solution for both should be the same. This is definitely doable and
>> something worth scheduling in the near future.
>>
>>
>> Junjun: please check where we are with the DATASET outer join
>> implementation.
>>
>>
>> cheers,
>> a
>>
>>
>>
>> On Mon, Sep 5, 2011 at 7:29 AM, <[email protected]> wrote:
>>
>>> Dear BioMart users,****
>>>
>>> ** **
>>>
>>> We are setting up a local database powered by BioMart inorder to annotate
>>> our local data with the data of other marts. The problem is that the type of
>>> join that BioMart is performing between tables seems to be an inner join
>>> since BioMart solely outputs all the records that match in the two marts
>>> linked. My question is, if this is true, are you planning to add outer joins
>>> in order to get in addition all the unmatched results? In our case, if we
>>> want to annotate our database with another mart data, we would perform a
>>> left join to get all the matched records + all the unmatched records from
>>> our database. And in case you are planning to implement this, when do you
>>> predict that this implementation will be accessible for us, the BioMart
>>> users?****
>>>
>>> ** **
>>>
>>> This type of outputting the query results, i.e. to output both the
>>> matched and unmatched records from our database, is of highly interest in
>>> order to annotate BUT without losing information from the query because of
>>> the annotation process. Many thanks in advance.****
>>>
>>> ** **
>>>
>>> Please, do not hesitate to ask me for more information. ****
>>>
>>> ** **
>>>
>>> Yours faithfully,****
>>>
>>> ** **
>>>
>>> NĂºria****
>>>
>>> ** **
>>>
>>> ** **
>>>
>>> _______________________________________________
>>> Users mailing list
>>> [email protected]
>>> https://lists.biomart.org/mailman/listinfo/users
>>>
>>>
>>
>
_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users

Reply via email to