Hi Pablo,
I've been thinking to this whole stuff, and have a few additional thoughts
or remarks.
A. Lazy loading (the easiest option for a start).
I didn't get the point for the need of an EntitySetColumnExpression... Sorry
:( I'll dig this.
But regarding lazy loading without subexpression, the implementation should
be easy to do.
1. We need our own EntitySet<> in all cases (not only Mono).
2. The EntitySet<> is internally an IQueryable object (I know... This is not
the case in standard implementation, we need to find a workaround, but I
will consider this problem as minor), or wraps an IQueryable
3. The EntitySet<> also gets an internal list (to be filled on demand).
4.So the EntitySet<> would internally its list when filled or fill it when
empty.
5. The good point if we can get an IQueryable is that additional criteria
provided on the raw EntitySet<> should generate SQL clauses, hence we get a
really optimized subset.
B. Eagerly loading - this is where we begin to get scared (me, at least).
1. It is probably simple to add request columns, as you said we must there
consider the linked entities as a TableExpression.
2. Then we can find the related parent for a child using its FK target
(since this target is always a property declared as [Column]).
3. Reading the whole dataset would just require to add the idea of "current
entity whose children are being read".
4. BUT we can not handle anymore the paging: no more Skip() or Take()... At
least in SQL. We would then get very bad performance, since Skip() and
Take() would be done in CLR (I don't know how NHibernate handles this, for
example, but NH model is very different from ours anyway).
My suggestion is that we start with lazy loading.
Pablo, do we have a test with subexpressions in the tests set? I'd like to
see how the engine behaves, and then I may find answers to the
EntitySetColumnExpression idea.
Thanks,
Pascal.
On Tue, Sep 16, 2008 at 01:26, Pablo Iñigo Blasco <[EMAIL PROTECTED]> wrote:
> In this post I would like to explain some ideas about the 'complex
> projections' problem. Maybe the problem is controlled but I don't think so.
> Nonetheless if it were controlled, a discussion with comments would be
> interesting to contrast ideas and clarify concepts.
>
> 1. Introduction
> 2. DbLinq implementation state
> 3. Where and how to begin
>
> == Introduction ==
>
> I understand as "complex projections" those projections which contains
> associations accesses (AKA EntityRefs & EntitySets) or contains not-scalar
> nested queries.
>
> example:
> db.Orders.Select(o=>new {o.Employee, o.OrderDetails,
> o.OrderDetails.Select(od=> od.ProductID});
>
> In the above example you can see the three cases:
> · An association many to one (EntityRef or parent association)
> · An association one to many (EntitiSet or children association)
> · A not-scalar nested query.
>
> IMO this kind of projections are one of the most important features that
> Linq2Sql (and others O/R mappers) offers. This kind of queries allow us to
> retrieve hierarchied information (a tree) vs a sql queries, which return
> plain information (a table)
>
> What I mean with a tree structure?
>
> For example db.Employees.Select(e=>e.Orders) expression returns a
> IEnumerable<EntitySet<Order>> and this type has got a tree structure.
>
> AFAIK DbLinq currently doesn't support it yet, or at least it crashes in
> such cases. To implement it, probably we will need to make modifications
> into the CutOutOperands* stage and QueryRunner class, specially for
> EntitySet & not-scalar nested queries. Why? because the information of each
> tuple (of type EntitySet<Order>) is in more than one row.
>
> [* Have we got a better name for this stage? someone with a better English
> that mine?]
> Let's to see how MS Linq2Sql translates the above expression:
>
> SELECT [t1].[OrderID], [t1].[CustomerID], [t1].[EmployeeID],
> [t1].[OrderDate], [t1].[RequiredDate], [t1].[ShippedDate], [t1].[ShipVia],
> [t1].[Freight], [t1].[ShipName], [t1].[ShipAddress], [t1].[ShipCity],
> [t1].[ShipRegion], [t1].[ShipPostalCode], [t1].[ShipCountry], (
> SELECT COUNT(*)
> FROM [dbo].[Orders] AS [t2]
> WHERE [t2].[EmployeeID] = [t0].[EmployeeID]
> ) AS [value]
> FROM [dbo].[Employees] AS [t0]
> LEFT OUTER JOIN [dbo].[Orders] AS [t1] ON [t1].[EmployeeID] =
> [t0].[EmployeeID]
> ORDER BY [t0].[EmployeeID], [t1].[OrderID]
>
> the results are:
>
> 1 AIRBU 1 2008-09-13 21:55:23.703 NULL NULL NULL 21 NULL NULL NULL NULL
> NULL NULL 5
> 2 BT___ 1 2008-09-13 21:55:23.703 NULL NULL NULL 11 NULL NULL NULL NULL
> NULL NULL 5
> 3 BT___ 1 2008-09-13 21:55:23.703 NULL NULL NULL 12 NULL NULL NULL NULL
> NULL NULL 5
> 4 UKMOD 1 2008-09-13 21:55:23.703 NULL NULL NULL 33 NULL NULL NULL NULL
> NULL NULL 5
> 5 BONAP 1 1996-10-16 00:00:00.000 1996-11-27 00:00:00.000 1996-10-21
> 00:00:00.000 NULL 10 Bon app' 12, rue des Bouchers Marseille NULL NULL
> France 5
> NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 0
> NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 0
>
> Interesting points:
> - The last column (called 'value') informs to the number of 'children
> orders' given an employee.
> - Last two rows represent two employees without any child order
> As we have seen linq2sql build a tree structure from the table results
> using the auxiliary column 'value'.
> - Employee info is not retreived since it is not used.
>
> == DbLinq implementation state ==
>
> We have several problems.
>
> * Let's to analyze the above example: db.Employees.Select(e=>e.Orders)
>
> DbLinq generates the following sql query:
>
> SELECT [OrderID], [CustomerID], [EmployeeID], [OrderDate], [RequiredDate],
> [ShippedDate], [ShipVia], [Freight], [ShipName], [ShipAddress], [ShipCity],
> [ShipRegion], [ShipPostalCode], [ShipCountry]
> FROM [dbo].[Orders]
> WHERE e$.[EmployeeID] = [EmployeeID]
>
> Besides, the ExpressionDispatcher.Analyze don't work properly for this
> expression given that it returns a TableExpression of Order (at
> QueryBuilder.BuildExpressionQuery line:225).
>
> Therefore CutOutOperands creates a reader of type:
> Func<IDataRecord,MappingContext,Order> when it should be
> Func<IDataRecord,MappingContext,EntitySet<Order>> so a casting fails in
> QueryRunner stage.
>
> Three conclusions here:
> - EntitySet member access should not be transformed to TableExpression.
> Other way would be to add to the TableExpression class more information for
> detect this case after, at the CutOutOperands stage.
> - CutOutOperands method isn't fully implemented.
> - Sql query doesn't provides enough information for rebuilding a
> IEnumerable<EntitySet<Order>> structure.
>
> * Let's to analyze another example: db.Employees.Select(e=>new {e.Orders})
>
> This example crashes in NewExpressionMutator.Mutate(IList<Expression>). The
> problem is that the "e.Orders expression" is translated to a TableExpression
> (like the above case). Therefore an exception is thrown because there is no
> constructor with a TableExpression-parameter in the anonymous type (a
> similar problem of the "left-outer-join problem" and the Equal expression):
>
> Expression.New(NewExpression.Constructor, operands, NewExpression.Members);
>
> Conclusions here:
> * EntitySet member access expression currently work inside of a "new
> expression", (Will work EntityRefs inside of a "new expression"? I don't
> know, I would like check it)
> * We currently support to project scalar expressions and ColumnExpressions
> inside of a "new expression"
> * Likely non-scalar subqueries inside of a "new expression" currently don't
> work properly.
>
> There are another samples with other different problems, but I think that
> both examples are enough for now.
>
> == Where and how to begin ==
>
> I have two proposals about how to start fixing such problems:
>
> A) An option is to use lazy loading to make the a generic and simple
> implementation (with the cost of a worse performance).
>
> A.1- At ExpressionDispatcher.Analyze.MemberAccess. if the access is of type
> EntitySet<Z> it should be replaced as EntitySetColumnExpression. (like
> simpler member access expressions are replaced as ColumnExpressions.)
>
> example: db.Employees.Select (e=>e.Orders.Select(o=>new{o.OrderDetails})
> ---> db.Employees.Select (e=>new {EntitySetColumnExpression})
>
> A.2-
> · EntitySetColumnExpression should contains a reference to the original
> subexpression generated by the c# compiler, I'll call to such subexpression
> "G".
> · EntitySetColumnExpression should also contains a ColumnExpression of the
> PK column of the parent entity.
>
> example:
> esColumn.SourceExpression = Call(Select, (o=>new {o.OrderDetails}))) //
> this expression is called G
> esColumn.PseudoColumn = new ColumnExpression(TableX, name,...) (where
> name="EmployeeID")
>
> A.3- The sql query is generated. Such EntitySetColumnExpressions will be
> generated to sql using the PseudoColumn (ColumnExpressions)
>
> example:
> it should be something like: SELECT $t1.EmployeID FROM Employee $t1
>
> A.4- The rowCreator should consider that all EntitySeColumnExpressions
> should be converted to EntitySet<Z> type.
>
> var currentPK = row.GetIntegerAt(index);
> var value = new EntitySet<Z>();
> value.SetSource( G(db.ZTable.Where(o=>o.FK=currentPk)));
>
> Theoretically this should work even in the most complex nested expressions.
> However this implementation would be worse in performance, example:
>
> var orders=db.Employees.Select(e=>e.Orders); // Database contains 100
> employees
> foreach(var o in orders)
> {
> Console.WriteLine(o.Count());
> }
>
> The above piece of code make 101 database accesses.
> Nonetheless I think that such implementation would be a very good first
> approximation.
>
> B) Not to use lazy loading. I mean, retrieve all the information in a only
> sql query. Like Linq2Sql does.
>
> B.1 - We could include into the TableExpression some mark wich told us the
> TableExpression was originally an EntitySet member access expression.
>
> B.2 - We also need to change the generated sql. We need an auxiliar column
> that allows us build the object-tree.
> Two choices here:
> * Using a count expression column with the number of children
> of the parent entity (like MS linq2sql)
> * Project PK column of the parent entity (in our example
> Employees.EmployeeID).
>
> C) Your proposal
>
> Corrections, suggestions, ideas, all are welcome. Remember that theses
> discussions are also documentation and reference for the project.
>
> Regards.
>
> >
>
--
Pascal.
jabber/gtalk: [EMAIL PROTECTED]
msn: [EMAIL PROTECTED]
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"DbLinq" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/dblinq?hl=en
-~----------~----~----~----~------~----~------~--~---