Giacomo Tesio and I have been working on profiling DbLinq to see which
parts are slow. After having tried an actual profiler (and abandoned
it, because I'm not genius enough to figure out dotTrace, plus it
crashed when I tried to profile the unit tests), we've come to a new
approach: timestamps [0].
Currently, our timestamps are only present in debug builds, and are of
the format:
#AT(time=X, elapsed=Y) free form text
where time is the total elapsed time, in ticks, from the start of a test
case, while elapsed is the number of ticks that have passed since the
previous #AT message, regardless of free form text. It's stupid, but it
works.
The results likely need more processing (such as gathering the output
for all tests and averaging the time for various steps), but here's the
output for a simple query of CreateDB().Customers.Count():
1 ***** Test_NUnit_MsSql.AnyCount.CountExternal03
2 #AT(time=0000000011, elapsed=0000000011) BaseSetUp()
3 #AT(time=0000108930, elapsed=0000108919) START
DataContext(IDatabaseContext, MappingSource, IVendor)
4 #AT(time=0000457750, elapsed=0000348820) END
DataContext(IDatabaseContext, MappingSource, IVendor)
5 #AT(time=0000457782, elapsed=0000000032)
DataContext.GetTable(typeof(Customer))
6 #AT(time=0000478887, elapsed=0000021105) START
QueryProvider.Execute<Int32>(): Executing expression...
7 #AT(time=0000478942, elapsed=0000000055) START: GetSelectQuery(),
building Expression query
8
9 Call (MethodCallExpression)
10 ..Object: (null)
11 ..Method: Count
12 ..#00000 Constant (ConstantExpression)
13 ....Value: Table(Customer)
14 #AT(time=0000612860, elapsed=0000133918) END: GetSelectQuery(),
building Expression query
15 #AT(time=0000612871, elapsed=0000000011) START: GetSelectQuery(),
building Sql query
16 #AT(time=0000614721, elapsed=0000001850) END: GetSelectQuery(),
building Sql query
17 SELECT COUNT(*)
18 FROM [dbo].[Customers]
19 -- Context: SqlServer Model: AttributedMetaModel Build: 0.19.0.0
20
21 #AT(time=0000729500, elapsed=0000114779) END
QueryProvider.Execute<Int32>(): Executing expression...
22 #AT(time=0000729612, elapsed=0000000112) BaseTearDown()
Things I find truly interesting is that the DataContext constructor is
*slow* (which shouldn't be too surprising, as Adrus Moor already told us
this), but at 348820 ticks to execute it is the most expensive step,
taking over 3x longer than actually executing the SQL query.
This is insane.
Otherwise, the output above shows roughly 5 steps in the evaluation:
1. Create a DataContext instance.
2. Create the Expression tree (via System.Core)
3. Convert the Expression tree into a query.
4. Convert the query into a SQL statement.
5. Execute the SQL.
Step 1 is lines 3-4, taking 348820 ticks.
Step 2 is lines 5-6, taking 21105 ticks. To properly determine the
bounds of this step requires understanding the IL of the method, as
expression tree generation is generated by the compiler. In this case,
the DataContext.GetTable() call is executed *before* the expression tree
is created, while the QueryProvider.Execute<T> method is executed after
the expression tree has been constructed. In this case the expression
tree is virtually non-existent, but this will take more time for more
complex queries, and for actual queries different QueryProvider methods
will be invoked as part of the expression tree generation process.
It's probably safe to say that Step 2 is everything between the end of
the constructor and the start of step 3, most of the time...
Step 3 is lines 7-14, taking 133918 ticks.
Step 4 is lines 15-16, taking 1850 ticks.
Step 5 is lines 17-21, taking 114779 ticks.
With that said, I can return to the topic that started the entire
profiling question: QueryCache. QueryCache, by design, can only cache
the output of steps 3 and 4. As it happens, for this simple case that
looks like it could be a win, as Step 3 is the 2nd most expensive step,
but I wonder if this will be true for more complicated queries.
Thus, a more complicated query:
***** Test_NUnit_MsSql.ReadTest_Complex.F26_DistinctUnion_Count
#AT(time=0000000012, elapsed=0000000012) BaseSetUp()
#AT(time=0000151116, elapsed=0000151104) START
DataContext(IDatabaseContext, MappingSource, IVendor)
#AT(time=0000511830, elapsed=0000360714) END
DataContext(IDatabaseContext, MappingSource, IVendor)
#AT(time=0000525992, elapsed=0000014162)
DataContext.GetTable(typeof(Territory))
#AT(time=0000950630, elapsed=0000424638) START
QueryProvider.CreateQuery<Territory>(Table(Territory).Where(t =>
t.TerritoryDescription.StartsWith("A")))
#AT(time=0000972910, elapsed=0000022280) END
QueryProvider.CreateQuery<Territory>(...)
#AT(time=0000972939, elapsed=0000000029)
DataContext.GetTable(typeof(Territory))
#AT(time=0000975240, elapsed=0000002301) START
QueryProvider.CreateQuery<Territory>(Table(Territory).Where(terr =>
terr.TerritoryDescription.Contains("i")))
#AT(time=0000975545, elapsed=0000000305) END
QueryProvider.CreateQuery<Territory>(...)
#AT(time=0001004777, elapsed=0000029232) START
QueryProvider.CreateQuery<Territory>(value(DbLinq.Data.Linq.Implementation.QueryProvider`1[nwind.Territory]).Concat(value(DbLinq.Data.Linq.Implementation.QueryProvider`1[nwind.Territory])))
#AT(time=0001004930, elapsed=0000000153) END
QueryProvider.CreateQuery<Territory>(...)
#AT(time=0001019645, elapsed=0000014715) START
QueryProvider.CreateQuery<Territory>(value(DbLinq.Data.Linq.Implementation.QueryProvider`1[nwind.Territory]).Distinct())
#AT(time=0001019691, elapsed=0000000046) END
QueryProvider.CreateQuery<Territory>(...)
#AT(time=0001041300, elapsed=0000021609) START
QueryProvider.Execute<Int32>(): Executing expression...
#AT(time=0001055299, elapsed=0000013999) START: GetSelectQuery(),
building Expression query
...
#AT(time=0003884678, elapsed=0002829379) END: GetSelectQuery(),
building Expression query
#AT(time=0003884690, elapsed=0000000012) START: GetSelectQuery(),
building Sql query
#AT(time=0004433815, elapsed=0000549125) END: GetSelectQuery(),
building Sql query
...
#AT(time=0006674933, elapsed=0002241118) END
QueryProvider.Execute<Int32>(): Executing expression...
#AT(time=0006758724, elapsed=0000083791) BaseTearDown()
This output is trimmed, and is the output of
Test_NUnit_MsSql.ReadTest_Complex.F26_DistinctUnion_Count(), which is
considerably more complicated than the prior query.
Here, we have:
Step 1: 360714 ticks
Step 2: 104669 ticks
Step 3: 2829379 ticks
Step 4: 549125 ticks
Step 5: 2241118 ticks
Here, Step 2 takes ~5x longer than the simple case, but Step 3 dwarfs
everything else, taking ~27x longer than the expression tree creation.
This backs up what I've heard Giacomo Tesio mention before, that with
QueryCache disabled the unit tests take longer to execute. With query
generation taking longer than it takes to execute the SQL on the server
(step 5!), I can see why...
I'm still not sure how to fix it though. :-/ I'm philosophically
opposed to a global QueryCache (global data is bad!), and a
per-DataContext QueryCache doesn't make lots of sense.
I thus see two solutions:
1. Figure out why query creation takes so long, and speed it up. (The
same should be done for the constructor, for that matter.)
2. Use a QueryCache, but instead of making it an internal implementation
detail -- meaning the user has ~no control over its lifetime -- make it
something the user can control. This could be done by making QueryCache
a public type, and adding a DataContext.QueryCache property. If the
user wants to use a cache, they can create a global cache and provide it
to the DataContext as part of their initialization steps.
This gives control of the cache to the user, allowing them to easily
monitor size and provide cache policy, without creating an
internal-to-DbLinq-memory-leak-in-waiting.
- Jon
[0] Perhaps unsurprisingly, this is exactly what I had Adrus Moor do
when he found out that AttributeMappingSource was slow...
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"DbLinq" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/dblinq?hl=en
-~----------~----~----~----~------~----~------~--~---