Re: [nhibernate-development] implementing FetchMode.SubSelect per query, and improving it

Fabio Maulo Wed, 01 Sep 2010 10:11:54 -0700

ah...
take care with "using IDs is more efficient" because "subselect" does not
suffer the problem of max-parameter (IIRC 2100 in msSQL)


On Wed, Sep 1, 2010 at 2:03 PM, nadav s <[email protected]> wrote:

> great. thanks
>
>
> On Wed, Sep 1, 2010 at 7:56 PM, nadav s <[email protected]> wrote:
>
>> you know the internal of nhibernate much much much better than me, and i
>> won't get into an implementation argue with you, but it is possible to
>> implement.
>>
>> with subselect (again i'm talking about subselect because i didn't do any
>> research on the batch size, but i guess the idea is similar because it works
>> the same, only batch size issues a good query and subselect issues an evil
>> one), as i've noticed, there is a special one-to-many collection persister,
>> that knows once the collection is accessed, use a sub select batcher that
>> loads the collections of all the owners that were returned by the initial
>> query.
>>
>> if the persister could have been set, or modified, for a specific instance
>> of a collection, it would have been possible - you could have set the batch
>> size\subselect for a specific query, which in turn would have set a
>> different persister for the collections that their persisters needs
>> modification, and then when a collection would have been accessed, the
>> persister would have done its thing.
>>
>> of course, i'm not sure thats the proper way of implementing it, but as an
>> idea - tell the specific collections that are created for the entities of a
>> specific query to do something else than the default, it is possible
>>
>>
>> On Wed, Sep 1, 2010 at 7:49 PM, John Davidson <[email protected]>wrote:
>>
>>> I think nadav is saying that subselect from NHibernate is an issue, but
>>> the implementation he is proposing will fix that problem
>>>
>>> John Davidson
>>>
>>>
>>> On Wed, Sep 1, 2010 at 12:46 PM, Fabio Maulo <[email protected]>wrote:
>>>
>>>> LOL!!
>>>> Your first assertion : "btw, i don't really get what is the problem
>>>> with subselect"
>>>> Your second assertion : "the sub select is always inefficient"
>>>>
>>>> On Wed, Sep 1, 2010 at 1:42 PM, nadav s <[email protected]> wrote:
>>>>
>>>>> the sub select is always inefficient, especially when there is an
>>>>> initial complex query (with sub queries in it), and its a killer when its 
>>>>> a
>>>>> two level tree (when fetching the grandchildren). fixing it was really
>>>>> really easy, and i can't see any downside to it.
>>>>>
>>>>> different use cases in a web app:
>>>>>
>>>>> use case 1: sub select\batch size is NOT desired
>>>>>
>>>>>    the user searches for car companies by some criteria. the user will
>>>>> then choose (double click on a grid's row or something) one of the
>>>>>    companies to see it in full details. each company has one-to-many
>>>>> car types (mazda -> mazda 3, mazda 5, mazda 6...) and each
>>>>>    car type will be displayed in its own tab, when at first, the newest
>>>>> car type or the most expensive one, doesn't matter is selected.
>>>>>    each car type has its models, mazda3 2008 isn't the same as 2010 (i
>>>>> don't that much about cars and not sure the years are correct,
>>>>>    but there are differences between the models).
>>>>>
>>>>>    the result: if carType.Models is mapped with some batch size, say
>>>>> 10, the models of 10 of the car types are now fetched, although
>>>>>    the user only watches the models of one of the car types, if there
>>>>> could be lots of models for each car type, it slowed the first tab,
>>>>>    and made the other tabs faster, because their car types are now
>>>>> loaded, but its not what is desired, because the user is expected to
>>>>>    click on only one of other tabs or something.
>>>>>
>>>>>  use case 2: desired:
>>>>>
>>>>>     the user wanna see some custom developed report (ui that can be
>>>>> implemented with MRS/Cognus or any other reporting framework,
>>>>>     and we have all kinds of reports that live up to this definition,
>>>>> and for some good reasons also). for the report the user searches for
>>>>>     car companies by some criteria (some search form) and then expects
>>>>> to see the returned companies, paged of course, but with all
>>>>>     of their car types, and for each of the car type - all of its
>>>>> models. here, a sub select or batch fetching is a must or else we'll get a
>>>>> CP
>>>>>     with join fetching, or N^2 + 1 if we do regular lazy loading (like
>>>>> we wanted to do in the first situation).
>>>>>
>>>>> of course we can work around that, and thats exactly what we do, using
>>>>> a generic mechanizm that for reports, eager fetches with sub selects and 
>>>>> not
>>>>> joins, the association it was asked to fetch. for the regular queries, it
>>>>> just use the default which is regular lazy.
>>>>>
>>>>> it would have been really really nice, if i could have set, for the
>>>>> report query, query.SetFetchMode("CarTypes", FetchMode.SubSelect)
>>>>> or if you will, query.SetBatchSize("CarTypes", 20)
>>>>> and same for models
>>>>> query.SetFetchMode("CarTypes.Models", FetchMode.SubSelect) or
>>>>> query.SetBatchSize("CarTypes.Models", int.MaxValue).
>>>>>
>>>>> it must be max value because i want all the models, and can't possibly
>>>>> know how many car types are going to be there. of course it won't be alot,
>>>>> because the "query" is going to use paging, but i don't really know if its
>>>>> 20, 40, or something else.
>>>>>
>>>>> batch size, currently makes me choose between the use cases, slowing
>>>>> down one of them, or makes me query and connect the associations my self.
>>>>> same goes for sub select, which also issues an inefficient query for
>>>>> CarTypes and a killer query for the Models
>>>>> before my fix it would have been:
>>>>> select ...
>>>>> from Models m
>>>>> where m.CarTypeId in
>>>>>    (select c.Id
>>>>>     from CarTypes c
>>>>>     where c.CompanyId in
>>>>>             (select company.Id
>>>>>              from Companies company
>>>>>              where <could be some crazy crteria - this is the same
>>>>> where clause of the very original query>))
>>>>>
>>>>>
>>>>>
>>>>> (i was able to make itthe inefficiency of the query
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Sep 1, 2010 at 6:58 PM, Fabio Maulo <[email protected]>wrote:
>>>>>
>>>>>> I don't know which is the problem... you said that there is a problem
>>>>>> and you want change it using the same tech used by batch-size (using
>>>>>> uploaded ids) because subselect seems inefficient in some cases.
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 1, 2010 at 12:48 PM, nadav s <[email protected]> wrote:
>>>>>>
>>>>>>> btw, i don't really get what is the problem with subselect, as it
>>>>>>> lets you efficiently fetch a whole object graph for the N fathers that 
>>>>>>> were
>>>>>>> fetched in some query, in the most efficient way possible
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Sep 1, 2010 at 6:46 PM, nadav s <[email protected]> wrote:
>>>>>>>
>>>>>>>> i don't think its thats low priority, because it is actually a thing
>>>>>>>> people expect to happen when they set a fetch mode to Eager, at least 
>>>>>>>> i've
>>>>>>>> seen alot of situations when people really thought that thats whats 
>>>>>>>> going to
>>>>>>>> happen  (later finding out it killed their query with CP)
>>>>>>>>
>>>>>>>> about when it is helpful - exactly in the situations diego
>>>>>>>> described. two use cases,
>>>>>>>> in one of them you query the fathers and gonna need only one of the
>>>>>>>> father's collection, and for the other
>>>>>>>> you're gonna need all of their collections.
>>>>>>>> it gets more complicated when there are grandchildren involved, and
>>>>>>>> in one of the situations you want the grand children of one of the 
>>>>>>>> childs,
>>>>>>>> and in the other situation, because you load an object graph, you're 
>>>>>>>> gonna
>>>>>>>> need all of them.
>>>>>>>>
>>>>>>>> now, either you implement (similar to what diego said) the loading
>>>>>>>> of the collections yourself, or you gonna have to live with the batch 
>>>>>>>> size
>>>>>>>> slowing down the first situation, where you would have prefered lazy 
>>>>>>>> loading
>>>>>>>> without batching
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Sep 1, 2010 at 5:22 PM, Diego Mijelshon <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> I have entities where batch loading helps in some use cases but it
>>>>>>>>> loads lots of unneeded entities/collections in other complex use 
>>>>>>>>> cases,
>>>>>>>>> where I have many proxies but only use a few.
>>>>>>>>> My current workaround is doing "manual batch loading" (i.e. dummy
>>>>>>>>> query) in the cases where I need it.
>>>>>>>>>
>>>>>>>>> It would be definitely a low-priority but nice-to-have feature.
>>>>>>>>>
>>>>>>>>>     Diego
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Sep 1, 2010 at 10:12, Fabio Maulo <[email protected]>wrote:
>>>>>>>>>
>>>>>>>>>> It is possible for batcher (INSERT, UPDATE,DELETE).
>>>>>>>>>> I don't understand where it is useful for collection/relations
>>>>>>>>>> batch-size.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 1, 2010 at 9:37 AM, Diego Mijelshon <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Being able to override batch-size would be useful. Implementing
>>>>>>>>>>> it requires messing with more than one part of the infrastructure, 
>>>>>>>>>>> though.
>>>>>>>>>>>
>>>>>>>>>>>     Diego
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Fabio Maulo
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Fabio Maulo
>>>>
>>>>
>>>
>>
>


-- 
Fabio Maulo

Re: [nhibernate-development] implementing FetchMode.SubSelect per query, and improving it

Reply via email to