On Saturday, 10 January 2015 at 18:31:18 UTC, Paolo Invernizzi wrote:
On Saturday, 10 January 2015 at 17:31:42 UTC, DaveG wrote:
On Saturday, 10 January 2015 at 13:19:19 UTC, Martin Nowak wrote:
Here is a sketch for an optimal solution. I'm actually eagerly waiting that someone finally implements it.

http://dpaste.dzfl.pl/cd375ac594cf

I would also have to sell the idea of writing an ORM which is certainly not on the roadmap, but this will certainly help my argument.

Maybe not, something simpler than a full ORM should be compelling also.

I guess you know about the ORM Vietnam [1], but also this [2] can be of some help in selling a simple D solution.

I would like to see, someday, something in D that:

 - can check at compile time the syntax of SQL;
- can check at compile time the SQL query statement against the current DB schema; - can read the output of a DB schema dump at CT, and parse it into what is needed for the previous points (more complicated);

The first point should be easy today, the second and the last one involve more work...

[1] http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
[2] http://wozniak.ca/what-orms-have-taught-me-just-learn-sql
---
Paolo

I have no intention of writing anything as massive as Entity Framework or Hibernate. We have been successful over the past 4 years with just a small collection of functions to reduce some of the pain (and redundancy) in writing a lot of dynamic SQL. Now that we have an opportunity to start fresh we have a chance to do something better.

The traditional problems with ORMs in general are well known and these are the reasons why I have never used one in production.

1. Complexity. You basically need to learn an entire new language (sometimes literally). This is an investment which can be worth it if the abstraction is successful. The following problems are why I think the investment is not worth it.

2. Limitations. Unfortunately too often you need to drop in to SQL to really get things done. This alone is a non-starter. If I need to bypass the abstraction to do anything really interesting or complex, it has failed. Sometimes (usually) this is for performance, other times it's because there is simply no way (or it's too complicated) to express what I want through the abstraction.

3. Compilation/Translation. The time to translate commands to SQL (or whatever backend) can be a high price. Most ORMs do some type of caching now which is generally sufficient. In D most of the work can be done at compile time which is even better.

4. Unnecessary Data. Greedy data retrieval is way to common, the default is usually to get everything. For small queries and data sets you can write it off as "not a problem", but when your model gets large and interconnects, this can be catastrophic. Again, thanks Martin for the clever basis for a solution in D.

5. DB Performance. The efficiency of the SQL that is actually generated. People seem to focus on this because the generated SQL is generally quite verbose. Interestingly, in my experience, this is often the smallest performance problem because the query optimizer (at least in SQL Server with good indexes and statistics) will generate the same execution plan regardless. This is also a code gen problem that can be tweaked without breaking user code.

You may have noticed that 4 of 5 problems are about performance. That's because, at least in our case, it is that important and it is that much of a problem. Current ORMs often look great, but in my experience, the price is always to high. Some "micro-ORMs" avoid the performance problems, but they do so by sacrificing most of the features (you still have to write raw SQL for example). Some of the problems are inherit to solution and cannot be "solved", but they can be reduced.

For a long time I thought some of these problems where fundamental and had basically written off the concept of ORMs [see: Vietnam of Computer Science]. The good news is most of the problems appear to be solvable. #1 is unavoidable obviously there will be something new (whether it's a DSL or just an API)
#2 is really dependent on the other problems and implementation.
#3 is "just" implementation.
#4 has a conceptual solution, now it's "just" implementation.
#5 does not have a solution because it will depend on the backend, but I think it's reasonable to expect a solution that works for almost all cases. It will be impossible to know without testing.

One final note. You may have noticed I didn't mention the schema syncing problem (keeping database and code in sync). There was a time I would have said that was essential and while it would be nice in a perfect world, I'm comfortable keeping them in sync manually (or semi-manual with scripts). I can generate a bunch of classes from an existing database fairly easily and when I change a table I can manually update a class. If I was writing SQL directly I would have to update my query, this is really no different. Doing validation in unit tests is perfectly acceptable to me.


Sorry for long post.
-Dave

Reply via email to