We've been having a lot of issues with N + 1 fetch problems with our 
NHibernate implementation.

We have a very complex data schema / relational graph.

After much research, it appears that there are currently four main 
approaches to resolving relationships in NH:

1) Eager fetching. Not an option for complex object graphs because of the 
magnitude of the required request.

2) Lazy fetching out of the box. Results in N + 1 queries. If many child 
objects are referenced, gets prohibitive very quickly.

3) Use `Fetch()` to "eagerly" resolve only the fields you expect to need. 
This works well until you need to use multiple `FetchMany()` calls, which 
is very inefficient and results in duplicate objects in collections because 
of the Cartesian product that results from the query.

4) Use `ToFuture()` to cache some queries for NHibernate. I have had a lot 
of difficulty linking cached queries to the nested properties / references 
of a final query. Perhaps I'm just doing it wrong.

After much thought, I decided that if I extended lazy fetching with a group 
load similar to `Fetch()` I would have a very efficient solution without 
the Cartesian product. Here's some sample code demonstrating how something 
like this might be consumed (using FluentNHibernate with NHibernate.Linq).

var recentOrders = session.Query<Order>()
    .Where(order => order.PlacedDate > 5.Days().Ago());

recentOrders.Load(session, order => order.Customer);
recentOrders.LoadAll(session, order => order.Items, items =>
    {
        items.Load(session, item => item.ProductCategory);
    });

This would result in an initial query to get the orders that were placed in 
the last five days.

The first load call would cause another query that would get all customers 
filtered by the recent order foreign keys. These customers would then be 
mapped into the recent order objects (LazyLoading would be fulfilled so 
that an object reference would not trigger a query).

The second load call would send a query getting all items with a foreign 
key matching against the recent orders, these would also be mapped into the 
recent orders objects. 

Finally, a query would be performed against the Product Category table, 
filtering down to only categories matching any of the items in any of the 
recent orders.

This solution does not result in any Cartesian products because each query 
only joins two tables. Each load / resolve step only requires 1 query, 
however, instead of N, where N is the number of child objects.

I attempted to actually implement this solution and immediately ran into 
difficulty getting information (mapping information particularly) from 
NHibernate. Here's an initial attempt to implement `Load()` that seems like 
it works but it is hard to tell initially:

public static void Load<TRootEntity, TChildEntity>(this IEnumerable<
TRootEntity> entities, ISession session, Expression<Func<TRootEntity, 
TChildEntity>> getChild, Action<IEnumerable<TChildEntity>> loadChildren = 
null)
    where TChildEntity : DomainEntity
{
    var childProperty = getProperty(getChild);
    var childIds = entities.Select(parent => getChildId(parent, 
childProperty)).Distinct().ToArray();

    var childDictionary = session.Query<TChildEntity>()
        .Where(child => childIds.Contains(child.Id))
        .ToDictionary(child => child.Id);

    entities.Each(parent => parent.setChild(childDictionary, childProperty
));

    if (loadChildren != null)
    {
        loadChildren(childDictionary.Values);
    }
}

public static int getChildId<TRootEntity>(TRootEntity parent, PropertyInfo 
childProperty)
{
    var proxy = childProperty.GetValue(parent) as INHibernateProxy;
    return (int) proxy.HibernateLazyInitializer.Identifier;
}

private static void setChild<TRootEntity, TChildEntity>(this TRootEntity 
parent, IDictionary<int, TChildEntity> childDictionary, PropertyInfo 
childProperty)
    where TChildEntity : DomainEntity
{
    var childId = getChildId(parent, childProperty);
    var child = childDictionary[childId];
    var childProxy = child as INHibernateProxy;

    var proxy = childProperty.GetValue(parent) as INHibernateProxy;
    proxy.HibernateLazyInitializer.SetImplementation(childProxy.
HibernateLazyInitializer.GetImplementation());
}

private static PropertyInfo getProperty<TPropertyOwner, TPropertyReturn>(
Expression<Func<TPropertyOwner, TPropertyReturn>> getterExpression)
{
    var propertyName = UiNameHelper.BuildNameFrom(getterExpression);
    return typeof(TPropertyOwner).GetProperty(propertyName, typeof(
TPropertyReturn));
}

This solution does not attempt to update any child collection properties 
that point back to the parent object, which could be a big problem.

I attempted to implement `LoadAll()` and quickly got stuck because there 
was no easy way to get the parent ID from the child objects. One solution 
to this is to add an expression for getting the parent from the child, 
similar to the collection expression, but this seems like it should not be 
necessary since the NH map already has the key column for the has many 
relationship. Furthermore, I have idea how to modify the 
PersistentGenericBag<TRootEntity> on the parent without triggering it to 
fetch via a query to the database (thereby defeating my purpose).

Does this seem like potentially a good solution? Is there a good way to 
implement this currently? What if support for something like this was built 
into NHibernate?

~Josh Graber

-- 
You received this message because you are subscribed to the Google Groups 
"nhusers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/nhusers.
For more options, visit https://groups.google.com/d/optout.

Reply via email to