Hi, Frank: I am probably missing some of your points as I could not fully grasp your previous comments. I shall start with two fundamental concepts in relational DBMS (R-DBMS): normalization in database design, and set notion based SQL construct. Details of such subjects are plenty in text books as well as on web.
1. normalization in database design The first example (where only 2 fields were shown) in my previous message is meant as a trivial illustration of flawed data, violating the normalization guideline in database design. The flaw is from the perspective of normalization of database design. -- begin quote from Ray Jahn ... first example, flawed data ... field 03 | field 07 (number) | (text) -------------------- 1 | name 04 1 | name 04 2 | name 07 -- end quote from Ray Jahn -- begin quote from Frank Schönheit Speaking strictly, it's not even flawed data. At least the backend – the place where the concrete record data is written to some file – can distinguish those records, at least by relative position in the file. Now if the backend would expose the possibility for distinction to its clients, all would be fine. (Which btw. leads us to the potential driver capability of bookmarks: driver/backend-dependent unique row values, which are not interpreted by the clients, only used for identifying records when talking to the database.) -- end quote from Frank Schönheit I am not sure what your point is here. Normalization has been touted by experts in R-DBMS designs to get us out of unnecessary troubles. A normalized database design would not lead to this flawed data. There is no need to deal with such a problem as it will not occur in a normalized database. For non-normalized data set, one would of course need a way to differentiate between duplicates which are bound to happen. One asks for non-normalized database design, and one is naturally rewarded with its inherent headaches. Does OO Base consider normalization less of a value at GUI level? 2. set notion based SQL construct The second example (where 3 fields were shown) in my previous message is independent of the first example, albeit the sameness in some field names and contents. -- begin quote from Ray Jahn Now assume a properly designed database. No more flawed data or design. Every record can be uniquely identified and hence operated upon (insert, retrieve, revise, etc.). A more likely scenario involves views (in database parlance) where a subset of all available data fields are displayed according to selection rules (the SQL Select statements) involving additional fields not included in the displayed results. This is where GUI plays its major role. The views display only partial information, by choice of users, of the data. A contrived example is shown below, based on your previous example. field 03 | field 07 | field 09 (number) | (text) | (number) ------------------------------ 1 | name 04 | 17 1 | name 04 | 23 2 | name 07 | 23 When all three fields are present, every record is unique. If one looks at only two fields as one might in a SQL Select result, we return to the previous scenario of APPARENT duplication in PARTIAL information. In other words, the perceived problem of duplication does not exist. No one would expect in logical mind to change (insert, update, etc.) the data without specifying the other fields. Or at least I expect data analysts to know their data. Or at least we should realize that a specific vehicle cannot be distinguished by color alone (too few attributes). -- end quote from Ray Jahn -- begin quote from Frank Schönheit But even then, how to execute a user request such as set "field 07" to "name 08" in the row where "field 03" is "1" and "field 09" is "23"? Though the composed view does contain only unique records, the constituting tables don't. Updates cannot include the fields from the first table - which does not have "unique" records -, but only fields from the second table. So there's effectively no difference to only updating the second one. -- end quote from Frank Schönheit I am confused here by your comments. I shall assume that the second paragraph of your comments results from an assumption that the two examples were connected. The concern would not exist in a normalized database design. So I shall skip this part. Now about the first paragraph on how to update the data. In R-DBMS, it has been designed on set theory that data are manipulated as a set. The set may contain only 1 member satisfying the criteria. It is still a set. SQL language design has followed the same spirit. This needs some effort to get used to for programmers who are so engrained in notions of iterators in Java, C++, C, etc., which are a variant of pointers in C and C++. I assume that OO Base always interacts with the backend DBMS engine through SQL CUI. Or is this assumption wrong? The answer by SQL to your question in the first paragraph would effectively be (save the actual SQL statements): Change the content of "field 07" to "name 08" for all records whose associated "field 03" is "1" and "field 09" is "23". The key notion here is "for all records". Notice the absence of position info. in this statement. In set based operations, no info. of static position (index) of any sort (row, line, character, etc.) is passed to DBMS engine. In SQL, The DBMS engine does not take any position info. from GUI. Nor does it provide the position info. back to GUI. The GUI does not need to, and does not, provide position info. to manipulate data, because they are treated as sets. The GUI is free to construct all kinds of position info. (bookmark, index, etc.) for its own use. Surely the DBMS engines also construct their own internal position info. as they work through the dataset (need to support rollback, etc., in transactions). However, no position info., regardless of the origin, is ever exchanged. One reason for this set approach is to support transactions, rollback, and multiple writers and readers. The DBMS engine is free to change the location of any records in a transactional system with multiple simultaneous clients, some being writers and some being readers. The notion of position info. is like pointers in C and C++ languages, if this comparison helps. Think about the consequence and why the Java language designers have chosen to leave out the concept of pointers. (No flame please. All programming languages have merit and limitation.) So I am not certain about the purpose of mandatory position info. (unique index) with table design imposed by OO Base GUI. For GUI own advantage, OK. GUI should be able to build it behind the scene without the onus on the user. For manipulating dynamic data in transactional databases through direct positional access on the DBMS side? No. Not possible at present in SQL and R-DBMS due to data integrity concern. Not likely desirable either for transactional systems. Proposals of data access through position info. require companion tools and guidelines for ensuring data integrity, just like the dealing of pointers in C and C++ programming languages. Mere the ability of data access through position info. is a guarantee for disaster in data integrity. Unless OO Base GUI is planning on something different from the transactional R-DBMS market. -- begin quote from Frank Schönheit Now if the backend would expose the possibility for distinction to its clients, all would be fine. (Which btw. leads us to the potential driver capability of bookmarks: driver/backend-dependent unique row values, which are not interpreted by the clients, only used for identifying records when talking to the database.) -- end quote from Frank Schönheit Please let the community know which DBMS engine product has been re-designed as suggested to allow direct data access through position info. Database design experts will weigh in on such design whether it constitutes vulnerability or added value. It should be interesting to find out if those DBMS engine products gain or lose market share. Sincerely, Ray
