> > Whoa, who's the one who doesn't understand what system design
> > is all about? :-S
>
> Ouch. That hurts. It really does. But---fortunately---you
> misunderstood me. ;)

Hmmm, what I read in your statement was not that vague, and I have the
feeling I understood you quite well.

> Modeling is hard. But let's assume that we've managed to do a
> good job here and that we came up with a very good model for
> a specific domain.
> Furthermore, let's assume that we've used Halpin's ORM, with
> which I'm quite familiar actually ;).

Good! :)

> From this model, we come up with a) an object-model design
> for our application layer, and b) a relational-database
> schema for our data layer.
> These are really two distinct tasks! Consider for example how
> many-to-many relationships are modelled in database schemas:
> we need auxiliary tables there. Well, there will be no such
> things as auxiliary classes in our application-layer object
> model. So, our basic model is still good, but different
> layers need different incarnations of it.

As I also replied to Jeff, this is not a correct conclusion. Order -
Product is an m:n relationship; however, the intermediate table,
OrderRow, is in fact a relation: an objectified relationship. (Relation
vs. relationship: in Dutch and German we use the same word for both,
which is a source of a lot of misunderstandings. I make this mistake a
lot too, I think even in this email :D)

There are of course m:n relationships which are not objectified. These
result in intermediate tables which are not considered entities
(although by definition they are). Let's say you write code with this
setup. You then have some m:n relationships whose intermediate table is
seen as an entity, and other m:n relationships whose intermediate table
is not.

Let's go through a simple example: Employees and Departments. Say these
have an m:n relationship, so employees can work in multiple departments
(they really work hard :)). We then need an intermediate table, which
is not really an entity, and this is DepartmentEmployees.

In our code, when we want to add an employee to a department, we do
this (pseudocode):

    // create new empty employee entity object
    EmployeeEntity newEmployee = new EmployeeEntity();
    // fill in properties of this employee
    myDepartment.Employees.Add(newEmployee);
    // save the department and with that the employee, in 1 transaction
    myDepartment.Save();

Nice code: tiny, clean. Now the manager decides he needs to know when
each employee started at which department. To model this correctly, you
have to objectify the department-employee relationship into a relation
and add a non-pk, mandatory attribute 'StartDate'.

Now go back to the code snippet I just posted. It definitely will not
work: you need to set StartDate as well, as it is mandatory. You also
can't add the employee directly to Employees in the department object,
as it is related via a third entity. So to reach the department from an
employee, the developer has to do something like:

    myEmployee.DepartmentEmployees[index].Department

This example is still obvious, as StartDate is mandatory. But let's say
it is NOT mandatory at first. You can keep the code as first posted,
because StartDate will then simply be NULL. However, when StartDate
becomes mandatory, you have to alter the code, as it will throw a
'can't be null' exception or similar for the StartDate field.

What does this mean? Well, you have to know EXACTLY which non-pk fields
in an objectified relation are mandatory and which are not. Instead of
moving away from the database, you move closer to the database. It also
introduces inconsistency: some m:n relationships have to be treated
this way, others in a totally different way.

You can avoid this by declaring m:n relations read-only and always
recognizing intermediate entities. This greatly reduces the complexity
of the code, as it doesn't require you to know when you can do
Employees.Add(newEmployee) and when you can't: it is ALWAYS the same.

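To make the "always the same" construct concrete, here is a minimal
sketch, written in Python rather than the C#-like pseudocode above. All
class and attribute names are hypothetical and chosen only for
illustration; this is not the API of any particular O/R mapper. It
assumes the m:n views (employees, departments) are read-only and that
every link is created through the intermediate entity.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Employee:
    name: str
    # The only writable path to departments runs through these links.
    department_employees: List["DepartmentEmployee"] = field(default_factory=list)

    @property
    def departments(self) -> Tuple["Department", ...]:
        # Read-only m:n view, derived from the intermediate entities.
        return tuple(link.department for link in self.department_employees)


@dataclass(eq=False)
class Department:
    name: str
    department_employees: List["DepartmentEmployee"] = field(default_factory=list)

    @property
    def employees(self) -> Tuple[Employee, ...]:
        # Read-only m:n view: a tuple has no Add/append to call.
        return tuple(link.employee for link in self.department_employees)


@dataclass(eq=False)
class DepartmentEmployee:
    """Intermediate entity, always visible (hypothetical sketch)."""
    department: Department
    employee: Employee
    start_date: Optional[date] = None  # assumption: may become mandatory later

    def __post_init__(self) -> None:
        # Register the link on both sides so both views stay in sync.
        self.department.department_employees.append(self)
        self.employee.department_employees.append(self)


sales = Department("Sales")
alice = Employee("Alice")
# The same construct whether or not start_date is mandatory.
link = DepartmentEmployee(sales, alice, start_date=date(2004, 1, 5))
```

In this sketch, if start_date later becomes mandatory, only the
DepartmentEmployee constructor call gains a required argument; the way
the relationship is created and traversed does not change.
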
You are of course fully entitled to write complex software. However,
I've learned over the years that when you spend a little more time on
how to make things simpler instead of more complex, the application
will be more maintainable and easier to build.

> No problem so far: nothing but 'good design'. Over to 'bad design' ...

No, no good design. A lot of m:n relationships run through objectified
relationships. Every table with at least one pair of FK fields is an
intermediate table in an objectified m:n relationship! This then
results in a big inconsistency in your software: SOME intermediate
tables in m:n relations are saved automatically and OTHERS are not. Can
you tell, by looking at Department.Employees, whether that container
holds m:n related objects which are related through an objectified
relationship or not? No, you can't tell; you need database information,
very detailed information. Furthermore, it makes code inconsistent: at
one time it has to be written like this, and at another time, because
deep below a non-pk attribute has suddenly become mandatory, your code
has to be written totally differently!

I'm sorry, but I find that kind of inconsistency, which causes code
breaks far away from the database (you just flip a 'NULL' into 'NOT
NULL'), 'bad' design.

Consistency, that's what makes software work. Look for schemes that
always work, and apply them when appropriate. Always implement
functionality ABC as A -> B -> C, and never in another order, so you
always know you're doing it correctly if you follow A -> B -> C. Very
easy to learn for each developer: learn the steps, take them, and
you're done. Fewer errors, fewer bugs, less trouble, better software.
Remember: no-one is going to laugh at you because you wrote consistent
software, except the few people who think HOW is more important than
WHY.
You know, the people who think it is key to write 'pure' OO and think
that by doing so the application is written better. They forget that
pure OO definitely needs translation layers along the way, and that
every translation layer weakens the connection between the functional
design and the actual result in code ("the functionality description in
program text"). That in turn makes the software harder to maintain,
because it is hard to tell where functionality ABC is implemented when
there is no direct 1:1 projection of ABC onto the code, or vice versa.

> And this is something I've encountered in real life many times
> now. First of all, it seems tempting to forget about coming
> up with a good basic model and just write out a database
> schema. I guess I don't need to tell you what's wrong about
> that. :) But, even if we could convince our design team to use
> the ORM model we've carefully composed, danger is just around
> the corner.
> I've worked with teams that spent a lot of time in designing
> the database and then rushed forward and wrote
> application-layer class definitions that were just one-on-one
> translations of the database tables. So, there were, for
> instance, auxiliary classes showing up to model many-to-many
> relationships! This is what I mean by 'being able to see
> the database through all layers'. This is what I consider
> 'bad design'.
> I hope you agree with me about this.

No, I definitely don't agree with this, and I hope you now understand
why. But don't let that stop you from creating inconsistent software
which falls apart the second you flip a NOT NULL constraint. Look, I
spent serious time solving this issue, as I had to solve it in my
schema retriever in LLBLGen Pro.
I did some tests where I had the intermediate tables hidden and where I
didn't have them hidden, and checked what would happen if the data
model changes (and against every law in the universe, a lot of people
seem to like to mess with their data model on a daily basis). After
those tests, and after looking at the code necessary to save m:n
related objects, I could only conclude that the intermediate table
should be known, always. That keeps code consistent and keeps
developers focussed, instead of confusing them because a construct is
suddenly totally different due to a non-null non-pk attribute.

And why is this all a bad thing? Because it shows the database model?
Why is that bad? Do you consider this code bad:

    myOrder.CustomerID = _customerID;

and this good:

    myOrder.Customer = myCustomer;

? If so, you require the fetch of the Customer object to work with the
order. However, HOW do you know you have to set the FK field
'CustomerID' in Order to relate Order to Customer and to be ABLE to
save the order? You can't know that UNLESS you know the facts about the
relational model. The underlying code can't invent some customer data
if you forget to set it in Order. In other words: a developer needs to
know the relational model to work with objects from the database.

Software that is written in real projects has to work, always; it has
to be easy to maintain and easy to extend, now and in the (near)
future, perhaps by other people. Solid software is written by keeping
things simple (but not simpler! :)) and consistent, so you know what to
expect and why, and it is never 'suddenly different'. Those are aspects
of real-life software development which have top-most priority. A lot
of (mostly younger) developers forget about this and think solely in
code constructs, and eye only a vague goal like 'the best object model'
or 'pure OO' or 'the solution without a single line of db related
code'.
That's fine, as long as their manager knows that and tells them to wake
up and use other priorities at work. IMHO, you are too focussed on the
low(er) level aspects of software development and want to use them to
solve high(er) level aspects of software development.

What about the great designs in Microsoft Business Framework? Example:
SalesOrder. This is a class which aggregates Customer, Order and,
inside Order, OrderRows. Now you have a new object to work with, and it
hides relational model aspects to a great extent, but not all of them
(it still requires knowledge about Customer being related to Order, for
example, and Customer being related to SalesOrder).

This is, however, on a level higher than the O/R mapper. It requires a
new layer on top of the layer generated / provided by the O/R mapper,
one which solely provides BL objects that are combinations of the lower
level objects provided by the O/R mapper. Do you really think it
matters to these higher order objects if the intermediate table is
there? They want it there! That way the code in these objects is simple
and doesn't have to take into account situations where the intermediate
table is 'suddenly hidden'.

Consistency and simplicity. Two good friends you want to keep for life,
trust me.

        FB

===================================
This list is hosted by DevelopMentor® http://www.develop.com

View archives and manage your subscription(s) at
http://discuss.develop.com