Re: [documentation-dev] Re: Mid-level base tutorial

Andrew Jensen Tue, 02 Oct 2007 18:15:52 -0700

Howdy Mariano,

On 10/2/07, [EMAIL PROTECTED]
<[EMAIL PROTECTED]> wrote:
> Hello Andrew:
>
> Thank you for the information on the lax vs. strict modes. I am going to
> drop the distinction in this tutorial and just work under the assumption
> that we are working in strict mode anyway.


Good choice, I think.

>
> Also, thanks for offering to post the ranges for the numeric data types. I
> also feel that, if any reason is needed, completeness is a good one and it
> will complement the tutorial very well.
>
> Your offer to point out common mistakes is much appreciated and also goes
> very well with the philosophy of the tutorial. I am waiting to read your
> comments so I can integrate them into it.
>

Alright - I'll get to that, but don't hold the tutorial up waiting for
it, as I am pressed for time and not sure how much I'll find by mining
old posts. I have a couple I know of, but with the latest changes in
the Base UI they may have been mostly mitigated, so I'll run a quick
test to see whether that really is the case.


> I am pleased to say that I have finished drafting points 1 and 2 of the
> original outline. Only the sub-chapter on data types is waiting for some
> information. Andrew, I would like you to read the tables that I am
> annexing and fill in the gaps if you know the info (I am counting on that
> you do!).

If you mean you attached something - the mailing list server strips
attachments, so they did not come through to me. You would have to
send the mail directly to my email address.

> Please note the distinction between "stores exactly" and "stores up
> to". The first means that literal strings shorter than max length
> are padded with spaces up to max length, while the second means that Base
> respects the EOL and uses memory only up to the actual length. If I am wrong,
> please let me know.

Yes, that sounds correct.

>
> Once I receive the information I will post my copy where you tell me so
> you can read it and give me your feedback. I know I have many options for
> posting (thank you all for your help). I will want to centralize it for
> now so there is never ambiguity on which is the current or latest version.
> My originals are in .odt format and include drawings and tables. Andrew,
> please choose an option that can display them. (Ah!, and instruct me on how
> to post, please!)

OK - the best way will most likely be the storage area now afforded
us when we were granted contributor status for the project.

@Frank, perhaps you could drop Mariano a quick note on how to do
that...see, that lets me avoid saying that I am not sure how to do
that either...*chuckle*

>
> I am also pleased to say that I have identified an example to follow in
> the third point of the outline that is rich in mental images, of the right
> level of complexity and generic enough so that readers should not have
> difficulty translating it to their own projects. I am currently developing
> it.

That I am anxiously waiting to hear about - there has been a
discussion over the last few days on just this subject on the Base
mailing list, with a few strong opinions voiced by people. It will be
interesting to see what you choose.

>
> This is very exciting, as the content of the tutorial is already taking
> shape!
>
> Cheers,
>
> Mariano
>
> 1. Alphanumeric Type Variables: Used for storing alphanumeric characters
>
> Name    Data type               Max length      Description
> Memo    Long Var Char                   Stores up to the max length
Accepts any UTF-8 character.
Max length is 2 Gig on a 32 bit OS - it is really big
(+9223372036854775807) on a 64 bit OS (currently available for OO.org
under Linux AMD64, with Win64 coming).
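
As a quick sanity check on those two numbers: they are simply the
largest signed 32 bit and 64 bit integers. A few lines of Python show
the arithmetic (Python is used here only as a calculator, it has
nothing to do with Base itself, and the constant names are mine):

```python
# Largest values a signed 32 bit and a signed 64 bit integer can hold.
MAX_SIGNED_32 = 2**31 - 1
MAX_SIGNED_64 = 2**63 - 1

GIB = 2**30  # one "Gig" counted in bytes

print(MAX_SIGNED_32)        # 2147483647
print(MAX_SIGNED_32 / GIB)  # just under 2, hence "2 Gig"
print(MAX_SIGNED_64)        # 9223372036854775807
```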

> Text(fix)Char                           Stores exactly the specified
>                                               length
Accepts any UTF-8 character.
Pads with trailing spaces for strings that are shorter than the defined size.
Max length is identical to Long Var Char. Entering a value for the
length when the field is defined limits the size of the data you can
request to store there.

> Text    Var Char                No limit (>1MB) Stores up to the specified
>                                               length
Accepts any UTF-8 character.
No padding of data with spaces.
Max length same as Long Var Char.
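
If it helps to picture "stores exactly" versus "stores up to", here is
a toy model in Python. It is only a sketch of the behaviour described
above, not HSQLdb code: the function names store_char and
store_varchar are made up, and rejecting oversize values with an error
is my assumption about how the length limit shows up in strict mode.

```python
def store_char(value: str, length: int) -> str:
    """Toy model of Text(fix): check the size, then pad to the full length."""
    if len(value) > length:
        raise ValueError(f"value is longer than {length} characters")
    return value.ljust(length)  # pad with trailing spaces


def store_varchar(value: str, length: int) -> str:
    """Toy model of Text (Var Char): same size check, but no padding."""
    if len(value) > length:
        raise ValueError(f"value is longer than {length} characters")
    return value


print(repr(store_char("abc", 6)))     # 'abc   '
print(repr(store_varchar("abc", 6)))  # 'abc'
```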

> Text    Var Char Ignore Case            Stores up to the specified
>                                               length. It is not case
>                                               sensitive but stores
>                                               capitals as you type
>                                         them.
>
Accepts any UTF-8 character.
No padding of data with spaces.
Max length same as Long Var Char.
It is the comparison functions that treat this as a special type,
not the storage mechanism. To the storage mechanism it is simply
another Var Char.
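
The same idea in a few lines of Python, again only as an illustration:
the value is kept exactly as typed and only the comparison ignores
case. The helper name equals_ignore_case is hypothetical, and using
casefold() is my choice for the sketch; the engine may well use a
different rule internally.

```python
stored = "McDonald"  # kept exactly as the user typed it


def equals_ignore_case(a: str, b: str) -> bool:
    """Compare two strings without regard to case."""
    return a.casefold() == b.casefold()


print(stored)                                  # McDonald
print(stored == "mcdonald")                    # False (ordinary comparison)
print(equals_ignore_case(stored, "mcdonald"))  # True
```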

> 2. Binary Type variables: Used for storing files like JPEGs, Mp3s, etc.
>
> Name    Data type       Max length      Description
> Image   Long Var Binary
> Binary  Var Binary
> Binary(fix)Binary               No limit (>1MB)
Here the max size is again 2 Gig for a 32 bit OS and that really big
number for a 64 bit OS. The data can be thought of as an array of bytes;
no attempt to validate it as UTF-8 or anything else is performed. For
example, an image control can be placed on a form and it would be just
as happy storing an image in, or displaying one from, any of these
field types. Binary(fix) is equivalent to the fixed character
field - entering a value for the length when the field is defined
limits the size of the data you can request to store there.

The fact that there are these different representations of what is
essentially the same data type is really just to maintain
compatibility with the SQL standard, which in turn reflects the
limitations (differences) found in different vendors' RDBMSs. At
least that is how I would explain it - others might look at it
differently.

>
> 3. Other Variable types: For storing small computer programs in the Java
> language:
>
> Name    Description
> Other   Stores Java objects in binary format
> Object  Same
>
Yes, that is true, it does store Java objects - specifically serialized
objects. In other words, this data type has a standard way to declare
the executable code that would be used to move an object into storage
and back out again. This code is supplied by the program making use of
the engine; by declaring this type, the database application developer
then knows exactly how to have their executable code utilized by the
engine. To use this type of field within a Base embedded database you
would have to declare the field using SQL DDL commands - there would be
no way to pass in the required 'hooks' via the GUI. So, again, include
this for completeness, but it is not a data type that would be used by
an intermediate user of the package.
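
To make "serialized" a little more concrete, here is a small round
trip. Note that this uses Python's pickle module purely as an analogy
for Java serialization; it is not what HSQLdb runs, and the sample
object is made up.

```python
import pickle

# Any in-memory object; this one is just sample data.
original = {"name": "widget", "sizes": [1, 2, 3]}

blob = pickle.dumps(original)  # object -> bytes (what a column would hold)
restored = pickle.loads(blob)  # bytes -> object again

print(type(blob).__name__)   # bytes
print(restored == original)  # True
```

The database only ever sees the bytes; it is the application's code
that knows how to turn them back into an object.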

One final point - HSQLdb also imposes a limit of 2 Gig on any single
table - so if one actually moved a 2 Gig image, for instance, into a
field - that would be it for the table. One field only, with one
record - in fact, with overhead you would be some number of bytes short
of the 2 Gig limit, but hey, who wants to be that picky about this.

HSQLdb also places an 8 Gig limit on the overall database.

Now that said - Base with an embedded database can get nowhere near
those limits. In Base the entire database must exist in memory - OK,
with virtual memory that would be possible, but it would be unusable
performance wise, as the OS would be thrashing like crazy with the swap
file, unless you have a machine with 4 or 8 Gig of real memory.

Then there is one other issue: the internal buffer for result sets.
This buffer has a fixed size - in the embedded Base model I believe
that is 50 Meg. That buffer is utilized as a shared resource for all
open result sets. The size of each row then becomes very important for
the performance characteristics of the database: as the row size
increases, the number of records (rows) that can be buffered
decreases, causing more memory allocation to take place as one moves
through a result set. If you are designing a database with numerics
and relatively small char fields this is not a big problem, but as you
start to store large binary fields (images, sound files, documents)
this has to be accounted for.
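
Some back-of-the-envelope arithmetic shows how quickly this bites. The
sketch below takes the 50 Meg figure at face value (remember I only
believe that is the size) and ignores per-row overhead and the fact
that the buffer is shared; the function name and the sample row sizes
are mine.

```python
BUFFER_BYTES = 50 * 1024 * 1024  # assumed buffer size, not a confirmed value


def rows_that_fit(row_bytes: int, buffer_bytes: int = BUFFER_BYTES) -> int:
    """Whole rows of a given size that fit in the buffer at once."""
    return buffer_bytes // row_bytes


print(rows_that_fit(200))              # 262144 rows of names, dates, codes
print(rows_that_fit(2 * 1024 * 1024))  # 25 rows once each carries a 2 Meg image
```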

This is one of those common mistakes I was talking about. If you are
going to be dealing with these large objects it is important to
not include them in a table that may need to be scanned, as to do so
would require moving all of that information into the buffer for each
and every record. Instead it is better to create two tables: one with
the data that you might search against (Names, Dates, Codes), and a
second table that contains a key field, which is also a foreign key to
the master table's PK, and the large object. Not doing so
will create a real problem with performance overall.
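
Here is a minimal sketch of that two-table layout. To keep it runnable
anywhere it uses Python's built-in sqlite3 module as a stand-in, so the
type names (TEXT, BLOB) are SQLite's rather than the ones Base shows,
and the table and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Narrow table: only the data you search against.
    CREATE TABLE person (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        hired TEXT
    );
    -- Large objects live here; the key is also a foreign key to person.
    CREATE TABLE person_photo (
        person_id INTEGER PRIMARY KEY REFERENCES person(id),
        photo     BLOB NOT NULL
    );
""")
conn.execute("INSERT INTO person (id, name, hired) VALUES (1, 'Ana', '2007-10-02')")
conn.execute(
    "INSERT INTO person_photo (person_id, photo) VALUES (?, ?)",
    (1, b"\x89PNG" + bytes(1000)),  # fake image data, 1004 bytes
)

# Searching scans only the narrow table.
matches = conn.execute("SELECT id, name FROM person WHERE name LIKE 'A%'").fetchall()
print(matches)  # [(1, 'Ana')]

# The large object is fetched by key, and only when it is needed.
(photo,) = conn.execute(
    "SELECT photo FROM person_photo WHERE person_id = ?", (matches[0][0],)
).fetchone()
print(len(photo))  # 1004
```

In Base you would do the same thing with a form and subform, or a
query that joins the two tables only for the record being displayed.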

Anyway that is all I have for the moment. I hope that helps. I'll do
my best to get to that page on the wiki for the data type grid asap.

I really am looking forward to seeing those first 2 chapters and the
example database you have chosen. ( if you like you could always
attach that odt to an email and send it direct to me - no confusion
then, I'll read it and shred it..just like on Mission Impossible..)

Drew

