Re: [nhibernate-development] A new scheme for eager loading

Niels Kühnel Thu, 17 Feb 2011 03:52:17 -0800

All right. I underestimated the penalty from the increased size of the
result set.
When not paging subselects are faster than my idea unless you have some
really complicated where clauses
When paging batching is faster unless you have big pages.
This means that it is not worth the effort.


Thanks for your input.

Now, subselects would be faster in all cases if the where clause could be
reused.

DECLARE @IDs TABLE (ID bigint primary key);
INSERT @IDs select TOP 100 ID FROM [Person] WHERE {tricky stuff}

SELECT {all fields} FROM [Person] WHERE ID IN (SELECT ID FROM @IDs)
SELECT {all fields} FROM [Tag] tags0_ WHERE tags0_.Person_id in (select ID
FROM @IDs)
SELECT {all fields} FROM [Phonenumber] phonenumbe0_ WHERE
phonenumbe0_.Person_id in (select ID FROM @IDs)

What do you think is the best way to make queries like this?

Thanks,
Niels



On 16 February 2011 21:01, Niels Kühnel <[email protected]> wrote:

> I'm not sure why i used INNER JOIN .. 1=1. Cross join is more clear,
> thanks.
>
> Anyway, I like the power in making general expressions as:
> "I want to show these companies in my list, and for all of them I need
> these relations for what I'm going to show. Please give me them as fast as
> possible, but I don't really care how"
>
> _session.QueryOver<Company>().Skip(20).Take(20).OrderBy(x=>x.Name).Asc)
>                 .EagerFetchMany(x => x.PushReportInfos)
>                 .EagerFetch(x => x.AgreementStatus)
>                 .EagerFetchMany(x => x.SecurityTokens)
>                 .EagerFetchMany(x => x.NaceEntries)
>                 .EagerFetchMany(x => x.NaceEntries.First().NaceEntry)
>                 .Where "some really complicated stuff with EXISTS and
> stuff"
>
> You can get that with subselects too, but in that case "some really
> complicated stuff" will be repeated for each sub select.
> In my approach filtering is only done once, and the database engine simply
> joins the tables on the resulting root entities
>
> If you follow an approach whith a query for only the root entities and then
> get the rest by using some kind of " WHERE IN (root ids)" approach
> you're essentially doing the same as me, except that your need more code
> and you are sending a lot of ids to and fro.
> You are especially likely to get into trouble if you need nested relations
> or if you for some reason need to consider thousands of entities.
>
> I still need to do some performance comparisons though, as everything else
> is just hand waving :)
>
> On 16 February 2011 20:36, nadav s <[email protected]> wrote:
>
>> fetch=subselect does a subselect for the original query that isn't really
>> neccessary because you have the ids in the session, and that subselect might
>> be a big one sometimes.
>> I think batch size is the best approach.
>> The thing is that currently you have to set the value in the mappings...
>> there is a patch i've uploaded to jira, i know fabio doesn't like it,
>> still don't really understand why exactly, but it enables you to override
>> the batch size that was set in the mappings per session, and i think it
>> could be useful in plenty of use cases.
>>
>>
>> On Wed, Feb 16, 2011 at 7:41 PM, Fabio Maulo <[email protected]>wrote:
>>
>>> You can try even fetch="subselect" to upload each collection in just one
>>> roundtrip
>>> For instance given a Company with a collection of Contacts when you get
>>> the first page of Companies NH will fetch all contacts of the first page in
>>> one roundtrip.
>>> fetch="subselect" may fail in some circumstances under MSSQL thank to its
>>> nice way to get pagination (perhaps in 2012 the will realize that
>>> result-sets are paged in web and that developers are humans).
>>>
>>>
>>> On Wed, Feb 16, 2011 at 2:21 PM, Diego Mijelshon <[email protected]
>>> > wrote:
>>>
>>>> Niels,
>>>>
>>>> Your concepts are interesting.
>>>> A simple question: why are you using INNER JOIN (...) ON 1=1 instead of
>>>> CROSS JOIN?
>>>>
>>>> Anyway: while the whole idea might work, it would be interesting to
>>>> measure it against existing approaches using real use cases. I suspect
>>>> (read: guess) the gains do not offset the loses.
>>>> Also, my general approach is to NEVER fetch collections. I prefer to use
>>>> load batching, or a couple roundtrips.
>>>> As applied to paging, this will get even easier when we introduce HQL
>>>> syntax for paging.
>>>>
>>>> But as said: all research is useful, even if we don't end up using the
>>>> exact original idea.
>>>>
>>>>     Diego
>>>>
>>>>
>>>>
>>>> On Wed, Feb 16, 2011 at 10:15, Niels Kühnel <[email protected]>wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm new on this list, so I'm very sorry if I break all your rules...
>>>>>
>>>>> I have devised a new way for eager loading collections, collections'
>>>>> collections etc. at once without making the Cartesian product. I've
>>>>> done a proof of concept implementation but I think it's solid enough
>>>>> for further investigation
>>>>>
>>>>> The basic idea is to add a "fork" to the query to add the missing
>>>>> structure when two unrelated tables are joined as:
>>>>>
>>>>> SELECT A.ID AS A, Forks.Fork, B.ID AS B, C.ID AS C FROM A
>>>>>    INNER JOIN (SELECT 1 AS Fork UNION ALL SELECT 2) AS Forks ON 1=1
>>>>>    LEFT JOIN B ON B.A = A.ID AND Forks.Fork = 1
>>>>>    LEFT JOIN C ON C.A = A.ID AND Forks.Fork = 2
>>>>>
>>>>> While spending some time with eager loading I've also improved paging
>>>>> in Sql server 2005+/CE 4.0 so that you can get root entities, say, 10
>>>>> to 20 even though the query has eagerly fetched collections. I haven't
>>>>> yet looked into if the same method can be used with other databases
>>>>>
>>>>> If you have the time, you can read more about my ideas at
>>>>>
>>>>> http://kuhnel.wordpress.com/2010/10/15/the-eager-fork-%E2%80%93-taming-cartesian-explosions-in-databases/
>>>>> and
>>>>>
>>>>> http://kuhnel.wordpress.com/2011/02/13/paging-eager-fetched-nhibernate-collections-efficiently/
>>>>>
>>>>> and some proof of concept code at
>>>>> https://bitbucket.org/nielskuhnel/eager-nh-extensions
>>>>>
>>>>> I still have to do a proper performance comparison against lazy
>>>>> loading and futures but my ad hoc results are encouraging. I've also
>>>>> studied execution plans in SQL Server and the "fork" tables doesn't
>>>>> seem to upset it in any way.
>>>>> It's implemented as a custom dialect so no changes to the core are
>>>>> needed per se.
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Niels Kühnel
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Fabio Maulo
>>>
>>>
>>
>

Re: [nhibernate-development] A new scheme for eager loading

Reply via email to