[dblinq] Re: new release ?

Jonathan Pryor Fri, 15 May 2009 10:37:28 -0700

I think the correct approach is to use CompiledQuery, for precisely the
cache management reasons I originally outlined.


Though long-lived DataContexts would be problematic, as you point out,
as you'd effectively clone the entire DB into memory given enough time
and queries...  I don't see how a connected database approach (which
DataContext follows) could behave otherwise.

Does DbLinq support compiled queries?  find says:

        $ find src -name CompiledQuery.cs
        src/DbLinq/System.Data.Linq/CompiledQuery.cs

Survey says...No.  I believe we should.

I'm not sure what you mean by our queries being static.  Once we cache a
query, we don't change it anymore.  (If we did, we'd be altering the
original query, which would be...bad.)

I'm also not sure what you're getting at with closures.

The way I envision CompiledQuery working is very similar/identical to
how your existing query caching works, except instead of storing the
SelectQuery type within QueryCache, it would instead be stored within
the delegate returned from CompiledQuery.Compile().  Thus, the user
actually deals with pre-parsed query statements, and DbLinq doesn't need
to re-parse the LINQ statement again (as the delegate returned by
CompiledQuery.Compile() stores the pre-parsed SelectQuery instance).
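
To make the idea concrete, here is a minimal C# sketch of that shape. Everything in it is hypothetical: ParsedQuery stands in for SelectQuery, the string argument stands in for the LINQ expression tree, and Parse() stands in for DbLinq's real parsing step. The only point it illustrates is that parsing happens once, inside Compile(), and the returned delegate closes over the result.

```csharp
using System;

// Hypothetical stand-in for DbLinq's pre-parsed query (not the real SelectQuery API).
public sealed class ParsedQuery
{
    public string Sql { get; }
    public ParsedQuery(string sql) { Sql = sql; }
}

public static class CompiledQuerySketch
{
    // Counts how often the (expensive) parse step runs, for illustration only.
    public static int ParseCount { get; private set; }

    // Hypothetical parser: a real one would translate an expression tree to SQL.
    // The query text is ignored here and a fixed statement is returned.
    static ParsedQuery Parse(string queryText)
    {
        ParseCount++;
        return new ParsedQuery("SELECT * FROM People WHERE LastName = @p0");
    }

    // Parses once, then returns a delegate that closes over the parsed query.
    // Each call to the delegate only supplies fresh parameter values.
    public static Func<string, (string Sql, object[] Parameters)> Compile(string queryText)
    {
        ParsedQuery parsed = Parse(queryText);
        return lastName => (parsed.Sql, new object[] { lastName });
    }
}
```

Calling the returned delegate twice with different last names reuses the same SQL text and never re-enters Parse(); no shared cache is involved, because the caller owns the delegate and its lifetime.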

I'm not sure how much work this would take, but I'm hopeful that it
wouldn't be too much work.

However, this does bring up two related issues.

1. I still need to figure out wtf is going on with the NerdDinner
caching bug I was seeing last week (and better, how to reproduce in the
unit tests so that you can take a better look at it).

2. SelectQuery.GetCommand() gives me really bad feelings, because:

     A. It takes no arguments.
     B. It calls InputParameterExpression.GetValue() with no parameters.
     C. (A) and (B) together imply that, even though the underlying
        SELECT takes named parameters (yay), there's no way to actually
        provide them/alter them for the current SelectQuery instance
        (wtf?).

I think this is why (1) fails for me (but again, I still need to debug).

In any event, it makes no sense to me at all.  The point of having a
SELECT with parameters is so that you can cache the expression itself
but vary the parameters.  But since we're not providing any parameters,
the parameters can't vary.  So...

It makes my head hurt, if nothing else.

I would instead expect AbstractQuery.GetCommand() to take an 'object[]
parameters' argument (or similar) so that we can cache the actual select
statement w/o associated parameters.  This would also dovetail nicely
with the semantics of CompiledQuery.Compile(), as you can provide
parameters to the expression you're compiling:

        var peopleWithLastName = CompiledQuery.Compile(
                (PeopleDb db, string lastName, int start, int count) =>
                        (from p in db.People
                         where p.LastName == lastName
                         select p)
                        .Skip(start)
                        .Take(count));
        foreach (var p in peopleWithLastName(myDB, "Foo", 0, 1)) ...
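
The GetCommand(object[] parameters) shape suggested above might look roughly like the following sketch. CachedSelect and SketchCommand are hypothetical names invented for illustration; they are not DbLinq types, and a real implementation would build an ADO.NET command rather than a dictionary. The cached part is the SQL text and the parameter names; the values arrive on every call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical command type; stands in for an ADO.NET command object.
public sealed class SketchCommand
{
    public string Text { get; }
    public IReadOnlyDictionary<string, object> Parameters { get; }

    public SketchCommand(string text, IReadOnlyDictionary<string, object> parameters)
    {
        Text = text;
        Parameters = parameters;
    }
}

// Hypothetical cached query: SQL text and parameter names are fixed at
// construction time, parameter values are supplied per GetCommand() call.
public sealed class CachedSelect
{
    readonly string sql;
    readonly string[] parameterNames;

    public CachedSelect(string sql, params string[] parameterNames)
    {
        this.sql = sql;
        this.parameterNames = parameterNames;
    }

    // Binds the given values to the cached parameter names, in order.
    public SketchCommand GetCommand(params object[] parameters)
    {
        if (parameters.Length != parameterNames.Length)
            throw new ArgumentException(
                "Expected " + parameterNames.Length + " parameter values.",
                nameof(parameters));

        var bound = new Dictionary<string, object>();
        for (int i = 0; i < parameterNames.Length; i++)
            bound[parameterNames[i]] = parameters[i];
        return new SketchCommand(sql, bound);
    }
}
```

With this shape the same CachedSelect instance can serve any number of calls with different values, which is what makes caching the statement worthwhile in the first place.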

Alas, CompiledQuery.Compile() will only create delegates accepting at
most 4 parameters (the DataContext included), but there are
workarounds...
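
One such workaround, sketched here under assumptions, is to bundle the extra values into a single argument type so that the delegate stays within the limit. The sketch uses LINQ to Objects and a plain Func so that it runs on its own; Person, PeopleFilter and PeopleMatching are hypothetical names, and whether DbLinq's expression parser can translate member access on such an argument is untested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity type for the sketch.
public sealed class Person
{
    public string LastName;
    public string City;
    public int Age;
}

// Hypothetical argument bundle: five values travel as one delegate argument.
public struct PeopleFilter
{
    public string LastName;
    public string City;
    public int MinAge;
    public int Start;
    public int Count;
}

public static class BundledArgs
{
    // Two delegate arguments (the data source and the bundle) carry what
    // would otherwise need six separate parameters.
    public static readonly Func<IEnumerable<Person>, PeopleFilter, IEnumerable<Person>>
        PeopleMatching = (people, f) =>
            people.Where(p => p.LastName == f.LastName
                           && p.City == f.City
                           && p.Age >= f.MinAge)
                  .Skip(f.Start)
                  .Take(f.Count);
}
```

The trade-off is that callers must build a filter value per call, but the delegate signature no longer grows with the number of query parameters.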
 - Jon

On Fri, 2009-05-15 at 16:35 +0200, Giacomo Tesio wrote:

> I really hope that's right... :-|
> It would be a big problem otherwise.
> 
> AFAIK, the DataContext represents a UnitOfWork. If the unit of work is
> as long as the application lifetime, then keeping it alive is right.
> But in an internet application delivered via HTTP, I have my doubts.
> 
> If it's not readonly, the tracked entities would become an unaligned
> copy of the full database... in memory.
> 
> 
> That said, does DbLinq support compiled queries?
> 
> The problem with this approach would be that our queries are not
> static: we progressively add clauses to IQueryable<T>.
> And how do we handle closures?
> 
> 
> Giacomo
> 
> 
> 
> 
> On Fri, May 15, 2009 at 1:37 PM, Jonathan Pryor <[email protected]>
> wrote:
> 
>         I think you'll find that you're doing it wrong. :-)
>         
>         First, I'm not sure that the assertion that all apps have
>         short-lived DataContexts is correct.  That's certainly not the
>         case for NerdDinner, which has only one DataContext for the
>         lifetime of the app.  Many apps may have short-lived
>         DataContexts, but many won't.
>         
>         Secondly, and primarily, Microsoft doesn't do implicit query
>         caching.  They do explicit query caching:
>         
>                 
> http://blogs.msdn.com/ricom/archive/2008/01/11/performance-quiz-13-linq-to-sql-compiled-queries-cost.aspx
>                 
> http://blogs.msdn.com/ricom/archive/2008/01/14/performance-quiz-13-linq-to-sql-compiled-query-cost-solution.aspx
>         
>         For example: 
>         
>                 var fq = CompiledQuery.Compile 
>                 ( 
>                     (Northwinds nw) => 
>                             (from o in nw.Orders 
>                             select new 
>                                    { 
>                                        OrderID = o.OrderID, 
>                                        CustomerID = o.CustomerID, 
>                                        EmployeeID = o.EmployeeID, 
>                                        ShippedDate = o.ShippedDate 
>                                    }).Take(5) 
>                 );
>         
>         The result of CompiledQuery.Compile is a pre-compiled,
>         pre-analyzed query, which the user is responsible for caching
>         and dealing with.  Query caches are not part of DataContext
>         itself, precisely because it's a recipe for a giant memory
>         leak.
>         
>         (Consider a fictional app which uses Linq-to-SQL once at
>         startup, or otherwise very infrequently.  The DbLinq approach
>         would assure that the original cached queries would never be
>         freed; it would be a permanent memory tax on the app.)
>         
>         I would strongly suggest that you follow Microsoft's approach,
>         drop the DataContext query caching, and use CompiledQuery
>         instead.
>         
>         - Jon
>         
>         
>         
>         
>         On Fri, 2009-05-15 at 10:25 +0200, Giacomo Tesio wrote:
>         
>         > OFFTOPIC about the interfaces
>         > IQueryCache is an example of an interface which must not be
>         > removed. This way, if a skilled programmer really needs
>         > it, they can reimplement it to better fit their requirements.
>         > 
>         > Even when using ReaderWriterLockSlim, thread safety has a
>         > cost: someone could require speed rather than thread
>         > safety.
>         > Then it should be possible to reimplement it.
>         > 
>         > Another use of such an interface could be to move the cache
>         > out of the AppDomain (into dedicated cache servers) and share
>         > it among servers. This would also make the cache lifecycle
>         > longer than that of the IIS application.
>         > I'm not a fan of this solution (I'm not sure it would lead
>         > to better performance), but in an enterprise environment
>         > like ours, it has to be possible to make such a decision
>         > later.
>         > 
>         > That's why I think that good interfaces are a good thing: we
>         > should not decide how DbLinq has to be used (technologically
>         > speaking, of course).
>         > 
>         > Having GOOD internal interfaces allows better flexibility in
>         > the long run, and produces a better open source product.
>         > 
>         > 
>         > BAD interfaces, on the other hand, reduce flexibility and
>         > increase development effort, but I think that they always
>         > stem from a wrong analysis or design (if not a completely
>         > missing one).
>         > 
>         > 
>         > I often encounter .NET programmers talking against
>         > interfaces. But until now, I've always noticed they are
>         > talking about wrong interfaces they have designed bottom-up.
>         > It can take a long time to explain to a Microsoft .NET
>         > developer that the problem is not the interfaces, the
>         > problem is their design.
>         > 
>         > More or less like explaining to them that a Domain Model IS
>         > Object Orientation, not a way of doing Object Orientation.
>         > Or explaining to them that an object-oriented language does
>         > not lead by itself to object-oriented software...
>         > 
>         > 
>         > Ok... I don't like Microsoft. :-D
>         > 
>         > 
>         > 
>         > Giacomo
>         > 
>         > 
>         > On Fri, May 15, 2009 at 10:00 AM, Giacomo Tesio
>         > <[email protected]> wrote:
>         > 
>         >         That's a good question, I think.
>         >         
>         >         QueryCache is a cache of generated queries for each
>         >         expression tree evaluated.
>         >         It actually has to be static to improve the hit
>         >         rate, since in the most common DataContext use case
>         >         the context represents a unit of work and is
>         >         short-lived.  (It could be thread static at least,
>         >         but that would multiply the memory usage by the
>         >         number of threads, while also reducing the hits and
>         >         limiting the cache lifetime to that of the thread.)
>         >         If the QueryCache lifecycle matched the
>         >         DataContext one, it would probably have no reason to
>         >         exist.
>         >         
>         >         In "our" DbLinq use case, a readonly DataContext is
>         >         used from all threads and is long-lived (as long as
>         >         the AppDomain), while single units of work are
>         >         actually created per request or on an as-needed basis.
>         >         
>         >         The global readonly one has to have no "instance
>         >         caches" at all (no object tracking for example, and
>         >         I hope there are no other caches... but actually I
>         >         should investigate this more), but it still needs a
>         >         QueryCache because it could share its already parsed
>         >         expression trees among threads.
>         >         
>         >         Moreover, all the DbLinq DataContexts would benefit
>         >         from such static queries, greatly increasing
>         >         performance (I hope! ! ! :-D).
>         >         
>         >         Other caches (like the EntityTrakings one) must not
>         >         be static since they are conceptually linked to the
>         >         DataContext that fills and uses them.
>         >         
>         >         
>         >         
>         >         
>         >         Giacomo 
>         >         
>         >         
>         >         
>         >         
>         >         On Thu, May 14, 2009 at 5:19 PM, Jonathan Pryor
>         >         <[email protected]> wrote: 
>         >         
>         >                 This is bound to be a stupid/silly question,
>         >                 but why do the caches need to be static?
>         >                 Static data is effectively global, i.e. a GC
>         >                 root in and of itself, and thus will never
>         >                 be collected.  Even with a good policy, it's
>         >                 possible that this could use more memory
>         >                 than people would expect.
>         >                 
>         >                 See also:
>         >                 
>         >                 
> http://blogs.msdn.com/oldnewthing/archive/2006/05/02/588350.aspx
>         >                 
>         >                 
> http://blogs.msdn.com/ricom/archive/2004/01/19/60280.aspx
>         >                 
>         >                 Is there really a need for a cache that's
>         >                 static (i.e. shared amongst all DataContext
>         >                 instances)?  Or can it just be non-static
>         >                 and attached to the DataContext (which would
>         >                 also remove all thread safety requirements)?
>         >                 
>         >                 Put another way, with non-shared caches if
>         >                 the DataContext gets collected then the
>         >                 cache is also collected, thus providing a
>         >                 natural mechanism to clear the cache.  With
>         >                 shared (static) caches, they're not
>         >                 connected to the DataContext, and thus it
>         >                 could be holding cached data for a
>         >                 DataContext that no longer exists.  (This
>         >                 may not be the case anyway; I haven't fully
>         >                 read and understood the code.  I'm just
>         >                 trying to make clear that preferring shared
>         >                 caches isn't an open and shut easy
>         >                 decision.)
>         >                 
>         >                 Thanks,
>         >                 - Jon 
>         >                 
>         >                 
>         >                 
>         >                 On Thu, 2009-05-14 at 16:55 +0200, Giacomo
>         >                 Tesio wrote: 
>         >                 
>         >                 > I have two needs:
>         >                 > - Thread safety of static caches: this should
>         >                 > be done for QueryCache, but Jon has
>         >                 > encountered a strange bug (not yet reproduced
>         >                 > in tests) while working on NerdDinner. If
>         >                 > no other static caches exist, they are OK.
>         >                 > 
>         >                 > 
>         >                 > - XmlMappingSource working correctly: right now,
>         >                 > associations are not loaded from external
>         >                 > mappings. This fix requires IDataMapper and
>         >                 > DataMapper modifications and DataContext
>         >                 > fixes.
>         >                 > 
>         >                 > 
>         >                 > The first is absolutely needed (but it
>         >                 > should already work correctly; it is just
>         >                 > missing a true multithreaded test on
>         >                 > multi-core machines).
>         >                 > The second is, I think, really important,
>         >                 > but requires a bit of work.
>         >                 > 
>         >                 > 
>         >                 > 
>         >                 > Giacomo
>         >                 > 
>         >                 > 
>         >                 > 
>         >                 > On Thu, May 14, 2009 at 4:28 PM, Sharique
>         >                 > <[email protected]> wrote:
>         >                 > 
>         >                 >         
>         >                 >         Hi,
>         >                 >         It has been almost 1 year since the
>         >                 >         last release; I think it is time to
>         >                 >         make a new release (guess 0.19). If
>         >                 >         there is any blocking issue, please
>         >                 >         put it here for discussion, so that
>         >                 >         we can resolve it quickly.
>         >                 >         
>         >                 >         --
>         >                 >         Sharique
>         >                 >         
