Hi, Frank:

I am probably missing some of your points as I could not fully grasp
your previous comments.  I shall start with two fundamental concepts
in relational DBMS (R-DBMS): normalization in database design, and set
notion based SQL construct.  Details of such subjects are plenty in
text books as well as on web.

1. normalization in database design

The first example (where only 2 fields were shown) in my previous
message is meant as a trivial illustration of flawed data, violating
the normalization guideline in database design.  The flaw is from the
perspective of normalization of database design.

-- begin quote from Ray Jahn

... first example, flawed data ...

field 03 | field 07
(number) | (text)
--------------------
1        | name 04
1        | name 04
2        | name 07

-- end quote from Ray Jahn

-- begin quote from Frank Schönheit

Speaking strictly, it's not even flawed data. At least the backend –
the place where the concrete record data is written to some file – can
distinguish those records, at least by relative position in the file. 
Now if the backend would expose the possibility for distinction to its
clients, all would be fine. (Which btw. leads us to the potential
driver capability of bookmarks: driver/backend-dependent unique row
values, which are not interpreted by the clients, only used for
identifying records when talking to the database.)

-- end quote from Frank Schönheit

I am not sure what your point is here.  Normalization has been touted
by experts in R-DBMS designs to get us out of unnecessary troubles.  A
normalized database design would not lead to this flawed data.  There
is no need to deal with such a problem as it will not occur in a
normalized database.  For non-normalized data set, one would of course
need a way to differentiate between duplicates which are bound to
happen.  One asks for non-normalized database design, and one is
naturally rewarded with its inherent headaches.  Does OO Base consider
normalization less of a value at GUI level?

2. set notion based SQL construct

The second example (where 3 fields were shown) in my previous message
is independent of the first example, albeit the sameness in some field
names and contents.

-- begin quote from Ray Jahn

Now assume a properly designed database.  No more flawed data or
design.  Every record can be uniquely identified and hence operated
upon (insert, retrieve, revise, etc.).

A more likely scenario involves views (in database parlance) where a
subset of all available data fields are displayed according to
selection rules (the SQL Select statements) involving additional
fields not included in the displayed results.  This is where GUI plays
its major role.  The views display only partial information, by choice
of users, of the data.  A contrived example is shown below, based on
your previous example.

field 03 | field 07 | field 09
(number) | (text)   | (number)
------------------------------
1        | name 04  | 17
1        | name 04  | 23
2        | name 07  | 23

When all three fields are present, every record is unique.  If one
looks at only two fields as one might in a SQL Select result, we
return to the previous scenario of APPARENT duplication in PARTIAL
information.  In other words, the perceived problem of duplication
does not exist.  No one would expect in logical mind to change
(insert, update, etc.) the data without specifying the other fields. 
Or at least I expect data analysts to know their data.  Or at least we
should realize that a specific vehicle cannot be distinguished by
color alone (too few attributes).

-- end quote from Ray Jahn

-- begin quote from Frank Schönheit

But even then, how to execute a user request such as set "field 07" to
"name 08" in the row where "field 03" is "1" and "field 09" is "23"?

Though the composed view does contain only unique records, the
constituting tables don't. Updates cannot include the fields from the
first table - which does not have "unique" records -, but only fields
from the second table. So there's effectively no difference to only
updating the second one.

-- end quote from Frank Schönheit

I am confused here by your comments.  I shall assume that the second
paragraph of your comments results from an assumption that the two
examples were connected.  The concern would not exist in a normalized
database design.  So I shall skip this part.

Now about the first paragraph on how to update the data.  In R-DBMS,
it has been designed on set theory that data are manipulated as a set.
 The set may contain only 1 member satisfying the criteria.  It is
still a set.  SQL language design has followed the same spirit.  This
needs some effort to get used to for programmers who are so engrained
in notions of iterators in Java, C++, C, etc., which are a variant of
pointers in C and C++.

I assume that OO Base always interacts with the backend DBMS engine
through SQL CUI.  Or is this assumption wrong?

The answer by SQL to your question in the first paragraph would
effectively be (save the actual SQL statements):

Change the content of "field 07" to "name 08" for all records whose
associated "field 03" is "1" and "field 09" is "23".

The key notion here is "for all records".  Notice the absence of
position info. in this statement.  In set based operations, no info.
of static position (index) of any sort (row, line, character, etc.) is
passed to DBMS engine.  In SQL, The DBMS engine does not take any
position info. from GUI.  Nor does it provide the position info. back
to GUI. The GUI does not need to, and does not, provide position info.
to manipulate data, because they are treated as sets.  The GUI is free
to construct all kinds of position info. (bookmark, index, etc.) for
its own use.  Surely the DBMS engines also construct their own
internal position info. as they work through the dataset (need to
support rollback, etc., in transactions).  However, no position info.,
regardless of the origin, is ever exchanged.

One reason for this set approach is to support transactions, rollback,
and multiple writers and readers.  The DBMS engine is free to change
the location of any records in a transactional system with multiple
simultaneous clients, some being writers and some being readers.  The
notion of position info. is like pointers in C and C++ languages, if
this comparison helps.  Think about the consequence and why the Java
language designers have chosen to leave out the concept of pointers. 
(No flame please.  All programming languages have merit and
limitation.)

So I am not certain about the purpose of mandatory position info.
(unique index) with table design imposed by OO Base GUI.  For GUI own
advantage, OK.  GUI should be able to build it behind the scene
without the onus on the user.  For manipulating dynamic data in
transactional databases through direct positional access on the DBMS
side?  No.  Not possible at present in SQL and R-DBMS due to data
integrity concern.  Not likely desirable either for transactional
systems.

Proposals of data access through position info. require companion
tools and guidelines for ensuring data integrity, just like the
dealing of pointers in C and C++ programming languages.  Mere the
ability of data access through position info. is a guarantee for
disaster in data integrity.  Unless OO Base GUI is planning on
something different from the transactional R-DBMS market.

-- begin quote from Frank Schönheit

Now if the backend would expose the possibility for distinction to its
clients, all would be fine. (Which btw. leads us to the potential
driver capability of bookmarks: driver/backend-dependent unique row
values, which are not interpreted by the clients, only used for
identifying records when talking to the database.)

-- end quote from Frank Schönheit

Please let the community know which DBMS engine product has been
re-designed as suggested to allow direct data access through position
info.  Database design experts will weigh in on such design whether it
constitutes vulnerability or added value.  It should be interesting to
find out if those DBMS engine products gain or lose market share.

Sincerely,

Ray

Reply via email to