[dblinq] Re: new release ?

Pascal Craponne Sat, 16 May 2009 06:58:57 -0700

I took a look at the history, and it is now just a string comparison.
Was it failing before the end of april (27th)?


On Sat, May 16, 2009 at 15:50, Pascal Craponne <[email protected]> wrote:

> Hi,
> right, two expressions at different places have a different signature (hash
> and reference). Even the same expression called twice has a difference
> signature. This is why when I wrote the Expression comparer I chosed the
> option to compare the comparable contents.
> I'm pretty sure it was working at that time.
> Someone probably added a new comparison, which now causes comparisons to
> fail.
>
> Do we have simple samples of comparison returning an equality where it
> should not?
> Do we also have opposite samples?
>
> I don't like the ToString() approach, since it is not proven to be 100%
> reliable (I'm not sure all sub expressions generate something and something
> unique where converted to a string).
>
> Pascal.
>
> On Sat, May 16, 2009 at 13:55, Jonathan Pryor <[email protected]> wrote:
>
>>  Behold the actual underlying problem to why QueryCache is broken:
>>
>> var id = "foo";
>> var a = db.Customers.Where(c => c.CustomerID == id);
>> id = "bar";
>> var b = db.Customers.Where(c => c.CustomerID == id);
>> Console.WriteLine("HashCodes: a={0}; b={1}", a.GetHashCode(), 
>> b.GetHashCode());
>> Console.WriteLine("a == b? {0}", a.Equals(b));
>>
>>  Output while running under .NET:
>>
>> HashCodes: a=38068051; b=17286255
>> a == b? False
>>
>>  In short, it is *not* appropriate to use an expression tree as a key
>> into a Dictionary, as they do not demonstrate value equality properties.
>>
>> QueryCache is broken (and effectively unfixably broken) because the only
>> useful key into the cache that we have...can't actually be used as a key.
>>
>> (Though I suppose we could instead the result of ToString() as the key
>> into the cache, but more on that below.)
>>
>> However, just using ToString() won't fix the unit test breakage I wrote
>> yesterday.  It will instead just make the *existing* (un-patched)
>> ReadTest.A5_SelectSingleOrDefault() test fail (for precisely the same
>> "input parameter values are *also* cached" issue I referred to
>> yesterday).
>>
>> Finally, I'd like to put try one last effort at pointing out why
>> CompiledQuery.Compile() is a better approach.  Executing a Linq-to-SQL
>> query takes 3 steps:
>>
>>    1. Create the expression tree (done by the compiler and at runtime via
>>    Queryable extension methods).
>>    2. Convert the expression tree into SQL code (done by DbLinq).
>>    3. Execute the SQL against the database.
>>
>> The current QueryCache approach only caches the result of (2), and if
>> you've ever profiled expression trees you'll find that (1) can also be quite
>> slow (often unexpectedly so).  (Which is why, while using ToString()would 
>> "fix" the key lookup over using the expression tree itself, still
>> isn't an ideal fix: ToString() requires traversing the (possibly quite
>> large) expression tree, and would thus add additional overhead just to
>> perform the cache lookup.)
>>
>> Meanwhile, CompiledQuery.Compile() allows caching *both* steps (1) *and*(2). 
>>  There is no need for the delegate returned by
>> CompiledQuery.Compile() to re-create the expression tree, nor to re-parse
>> it, which saves time (which is why CompiledQuery exists in the first
>> place).
>>
>> *Please* consider implementing CompiledQuery.
>>
>> - Jon
>>
>>
>> On Sat, 2009-05-16 at 09:33 +0200, Giacomo Tesio wrote:
>>
>> AFAIK the closures are a sort of "pointer" to the value (actually to a
>> wrapper of the context that is used).
>>
>>
>>
>>  I suppose that in this case, someone else (like the CLR) would update the
>> closures values so that the yet compiled query would find the correct values
>> in the environment.
>>
>>
>>
>>
>>
>>  I promise I'll take a look to the QueryCache again monday, but I don't
>> believe we could actually remove the cache, since the DbLinq's performance
>> would make it unusable.
>>
>>
>>
>>  The problem here is understanding fully the CLR managing of closures and
>> understanding why at different time you get different closures. We could
>> also include the closures in the expression comparisation (which actually
>> work only on the string rappresentation of the expression itself).
>>
>>
>>
>>
>>
>>  Giacomo
>>
>>  PS: as to the connected datacontext approach: I think that long lived
>> datacontext should not be "connected", so ours will have
>> ObjectTrackingEnable == false.
>>
>>
>>
>>
>>
>>  On Fri, May 15, 2009 at 7:34 PM, Jonathan Pryor <[email protected]> wrote:
>>
>>
>>  I think the correct approach is to use CompiledQuery, for precisely the
>> cache management reasons I originally outlined.
>>
>> Though long-lived DataContexts would be problematic, as you point out, as
>> you'd effectively clone the entire DB into memory given enough time and
>> queries...  I don't see how a connected database approach (which
>> DataContext follows) could behave otherwise.
>>
>> Does DbLinq support compiled queries?  find says:
>>
>> $ find src -name CompiledQuery.cs
>> src/DbLinq/System.Data.Linq/CompiledQuery.cs
>>
>>  Survey says...No.  I believe *we should*.
>>
>> I'm not sure what you mean by our queries being static.  Once we cache a
>> query, we don't change it anymore.  (If we did, we'd be altering the
>> original query, which would be...bad.)
>>
>> I'm also not sure what you're getting at with closures.
>>
>> The way I envision CompiledQuery working is very similar/identical to how
>> your existing query caching works, except instead of storing the
>> SelectQuery type within QueryCache, it would instead be stored within the
>> delegate returned from CompiledQuery.Compile().  Thus, the user actually
>> deals with pre-parsed query statements, and DbLinq doesn't need to re-parse
>> the LINQ statement again (as the delegate returned by
>> CompiledQuery.Compile() stores the pre-parsed SelectQuery instance).
>>
>> I'm not sure how much work this would take, but I'm hopeful that it
>> wouldn't be too much work.
>>
>> *However*, this does bring up two related issues.
>>
>> 1. I still need to figure out wtf is going on with the NerdDinner caching
>> bug I was seeing last week (and better, how to reproduce in the unit tests
>> so that you can take a better look at it).
>>
>> 2. SelectQuery.GetCommand() gives me *really* bad feelings, because:
>>
>>    1. It takes no arguments.
>>    2. It calls InputParameterExpression.GetValue() with no parameters.
>>    3. (A) and (B) together imply that, even though the underlying 
>> SELECTtakes named parameters (yay), there's no way to actually provide 
>> them/alter
>>    them for the current SelectQuery instance (wtf?).
>>
>> I *think* this is why (1) fails for me (but again, I still need to
>> debug).
>>
>> In any event, it makes no sense to me at all.  The point to having a
>> SELECT with parameters is so that you can cache the expression itself but
>> vary the parameters.  But since we're not providing any parameters, the
>> parameters can't vary.  So...
>>
>> It makes my head hurt, if nothing else.
>>
>> I would instead expect AbstractQuery.GetCommand() to take an 'object[]
>> parameters' argument (or similar) so that we can cache the actual select
>> statement w/o associated parameters.  This would also dovetail nicely with
>> the semantics CompiledQuery.Compile(), as you can provide parameters to
>> the expression you're compiling:
>>
>> var pepleWithLastName = CompiledQuery.Compile(
>>      (PeopleDb db, string lastName, int start, int count) =>
>>              (from p in db.People
>>               where p.LastName == lastName
>>               select p)
>>              .Skip(start)
>>              .Take(count));
>> foreach (p in peopleWithLastName(myDB, "Foo", 0, 1)) ...
>>
>>  Alas, CompiledQuery.Compile() will only create delegates accepting 4
>> parameters, but there are workarounds...
>> - Jon
>>
>>
>>
>>
>> On Fri, 2009-05-15 at 16:35 +0200, Giacomo Tesio wrote:
>>
>> I'll really hope is it right... :-|
>> It would be a great problem otherwise.
>>
>> AFAIK, the datacontext rappresent a UnitOfWork. If the unit of work is
>> long as the application lifetime, than keeping it alive is right.
>> But in an internet application delivered via http, I've got my doubt.
>>
>> If it's not readonly, the tracked entities would become an unaligned copy
>> of the full database... in memory.
>>
>>
>> That said, does DbLinq support compiled queries?
>>
>> The problem with this approach, would be that our queries are not static:
>> we progressively add clausoles to IQueryable<T>.
>> And, how to handle with closures?
>>
>>
>> Giacomo
>>
>>
>>
>> On Fri, May 15, 2009 at 1:37 PM, Jonathan Pryor <[email protected]> wrote:
>>
>> I think you'll find that you're doing it wrong. :-)
>>
>> First, I'm not sure that the assertion that all apps have short-lived
>> DataContexts is correct.  That's certainly not the case for NerdDinner,
>> which has only one DataContext for the lifetime of the app.  Many apps
>> may have short-lived DataContexts, but many won't.
>>
>> Secondly, and primarily, Microsoft doesn't do implicit query caching.
>> They do *explicit* query caching:
>>
>>
>> http://blogs.msdn.com/ricom/archive/2008/01/11/performance-quiz-13-linq-to-sql-compiled-queries-cost.aspx
>>
>> http://blogs.msdn.com/ricom/archive/2008/01/14/performance-quiz-13-linq-to-sql-compiled-query-cost-solution.aspx
>>
>> For example:
>>
>> var fq = CompiledQuery.Compile
>> (
>>     (Northwinds nw) =>
>>             (from o in nw.Orders
>>             select new
>>                    {
>>                        OrderID = o.OrderID,
>>                        CustomerID = o.CustomerID,
>>                        EmployeeID = o.EmployeeID,
>>                        ShippedDate = o.ShippedDate
>>                    }).Take(5)
>> );
>>
>>  The result of CompiledQuery.Compile is a pre-compiled, pre-analyzed
>> query, which the *user* is responsible for caching and dealing with.
>> Query caches are *not* part of DataContext itself, *precisely because*it's a 
>> recipe for a giant memory leak.
>>
>> (Consider a fictional app which uses Linq-to-SQL once at startup, or
>> otherwise very infrequently.  The DbLinq approach would assure that the
>> original cached queries would never be freed; it would be a permanent memory
>> tax on the app.)
>>
>> I would *strongly *suggest that you follow Microsoft's approach, drop the
>> DataContext query caching, and use CompiledQuery instead.
>>
>> - Jon
>>
>>
>>
>> On Fri, 2009-05-15 at 10:25 +0200, Giacomo Tesio wrote:
>>
>> OFFTOPIC about the interfaces
>> IQueryCache is an example of an interface which must not be removed. This
>> way if a skilled programmer would really need it, it could reimplement it to
>> better fill it's requirement.
>>
>> Even if using ReaderWriterLockSlim, thread safety has a cost: some one
>> could require speed rather than thread safety.
>> Than it should be possible to reimplement it.
>>
>> Another use of such an interface could be to move the cache out of the
>> appdomain (in dedicated cache servers) and share them among servers. This
>> would also make the cache livecicle longher than the iis application.
>> I'm not a fan of this solution (I'm not sure this would lead in better
>> performances), but in an enteprise environment like ours, it had to be
>> possible to take such a decision later.
>>
>> That's why I think that good interfaces are a good thing: we should not
>> decide how DbLinq as to be used (tecnologically speaking of course).
>>
>> Having GOOD internal interfaces allow better flexibility in the long run,
>> and produce a better open source product.
>>
>>
>> BAD interfaces, on the other hand, reduce flexibility and improve
>> developments effort, but I think that they always underling a wrong analisis
>> or design (if not a completely missing one).
>>
>>
>> I encounter often .NET programmers talking against interfaces. But till
>> now, I've always noticed they are talking about wrong interfaces they have
>> designed bottom up. It could took much time to explain a Microsoft .NET
>> developer that the problem are not the interfaces, the problem is their
>> design.
>>
>> More or less like explaing them that a Domain Model IS Object Orientation,
>> not a way of doing Object Orientation.
>> Or to explain them that a object oriented language does not lead by itself
>> to object oriented software...
>>
>>
>> Ok... I don't like Microsoft. :-D
>>
>>
>>
>> Giacomo
>>
>>
>> On Fri, May 15, 2009 at 10:00 AM, Giacomo Tesio <[email protected]> wrote:
>>
>> That's a good question, i think.
>>
>> QueryCache is a cache of generated queries for each expression tree
>> evaluated.
>> It actually has to be static (at least thread static, but this would
>> multiply the memory usage per number of threads, also reducing the hits and
>> reducing livetime to the thread one) to improve the hits, since in the most
>> common DataContext use case, it rappresents a unit of work and have short
>> live.
>> If the QueryCache livecicle would match the DataContext one, probably it
>> would have no reason to exists.
>>
>> In "our" DbLinq use case, a readonly DataContext is used from all threads
>> and has a long live (the AppDomain one), while single unit of works actually
>> are created per request or on a per need basis.
>>
>> The global readonly one, has to be no "instance caches" at all (no object
>> tracking for example, and I hope there are no other caches... but actually I
>> should indagate this more), but still need a QueryCache becouse it could
>> share its yet parsed expression tree among threads.
>>
>> Moreover all the DbLinq DataContexts would benefit from such a static
>> queries, greatly increasing the performances (I hope! ! ! :-D).
>>
>> Other caches (like the EntityTrakings one) must not be static since they
>> are conceptually linked to the DataContext that fill and use them.
>>
>>
>>
>>
>> Giacomo
>>
>>
>>
>>
>> On Thu, May 14, 2009 at 5:19 PM, Jonathan Pryor <[email protected]> wrote:
>>
>> This is bound to be a stupid/silly question, but why do the caches need to
>> be static?  Static data is effectively global, i.e. a GC root in and of
>> itself, and thus will never be collected.  Even with a good policy, it's
>> possible that this could use more memory than people would expect.
>>
>> See also:
>>     
>> http://blogs.msdn.com/oldnewthing/archive/2006/05/02/588350.aspx<http://blogs.msdn.com/oldnewthing/archive/2006/05/02/588350.aspx>
>>     http://blogs.msdn.com/ricom/archive/2004/01/19/60280.aspx
>>
>> Is there *really* a need for a cache that's static (i.e. shared amongst
>> all DataContext instances)?  Or can it just be non-static and attached to
>> the DataContext (which would also remove all thread safety requirements).
>>
>> Put another way, with non-shared caches if the DataContext gets collected
>> then the cache is also collected, thus providing a natural mechanism to
>> clear the cache.  With shared (static) caches, they're not connected to
>> the DataContext, and thus it could be holding cached data for a
>> DataContext that no longer exists.  (This may not be the case anyway; I
>> haven't fully read and understood the code.  I'm just trying to make clear
>> that preferring shared caches isn't an open and shut easy decision.)
>>
>> Thanks,
>> - Jon
>>
>>
>>
>> On Thu, 2009-05-14 at 16:55 +0200, Giacomo Tesio wrote:
>>
>> I've two need:
>> - Thread Safety of static caches: should be done for QueryCache, but Jon
>> have encountered a strange (unreproduced on tests) bug while working on
>> NerdDinner. If no other static cache exists they are ok.
>>
>>
>> - XmlMappingSource working correctly: now, associations are not loaded
>> from external mappings. This fix require IDataMapper and DataMapper
>> modifications and DataContext fixes.
>>
>>
>> The first is absolutelly needed (but should yet work right, just missing
>> true multithread test on multi core machines).
>> The second I think is really important, but require a bit of work.
>>
>>
>>
>> Giacomo
>>
>>
>>
>> On Thu, May 14, 2009 at 4:28 PM, Sharique <[email protected]> wrote:
>>
>>
>> Hi,
>> It has been almost 1 year since last release, I think it is time make
>> a new release (guess 0.19) . If there is any blocking issue, pls put
>> it here for discussion. So that we can resolve it quickly.
>>
>> --
>> Sharique
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> >>
>>
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"DbLinq" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/dblinq?hl=en
-~----------~----~----~----~------~----~------~--~---

[dblinq] Re: new release ?

Reply via email to