AW: BLOB and CLOB Torque datatypes

2005-05-10 Thread Thomas Fischer




Hi,

Fabio Insaccanebbia [EMAIL PROTECTED] schrieb am 09.05.2005
18:40:47:

 Hi Jörg,

 Particularly the Oracle JDBC drivers used to give a lot of problems
 when trying to use BLOBs.
 
 yes, that's true.. however the last driver version (10g) seems to have
 solved a couple of problems with BLOBs..
 The choice of avoiding BLOBS in an application is probably a good one:
 Torque, on the other side, should try
 to solve or reduce the problems with BLOB using so that the
 application developers can choose to use BLOBs
 without too many issues.

Here, I agree with Fabio. In my opinion, what the user wants is to store a
byte array or a large string in a database, and then looks for a data type
which does this. The user will maybe decide to use a clob and a blob for
this. This may limit the user to certain databases, but maybe this might
not be an issue in his case. Or maybe the user decides to go for maximum
portability, in which case he will use some other data type. But I do not
think that limiting Torque to the subset of features supported by all
databases is a good idea.


 With regard to mapping, there is another problem with date, datetime and
 timestamp, particularly with newer versions of MySQL which have a very
 peculiar handling of these types. Even though there is no DATETIME data
type
 in the JDBC interface we may have to provide the possibility to define
such
 in the schema.xml file, since some DBMS use it and clearly distinguish
 between timestamp, date, and datetime.
 
 The JDBC interface is probably a bit clumsy... the setTimestamp seems
 the only way to set both Date and Time in the same field.
 I'm afraid that some driver creators decided to overcome this JDBC
 limit by ignoring the setDate specified behaviour.

 I agree with the DATETIME proposal: this could give us the flexibility
 to work around driver's strange behaviours.

Could you please elaborate a bit more on this ? I do not have a SQL
specification ready, so I list what I guess. Please correct me if I'm
wrong.
The SQL Date type is supposed to hold dates with day accuracy.
The SQL Time type is supposed to hold time information with second
accuracy. It should not hold any Date information.
The SQL Timestamp type is supposed to hold Date+Time with at least
millisecond accuracy.

So I guess that the datetime type should hold date+time information with
second accuracy (like the date type in oracle).
Do you want to implement this for all databases ? This would probably mean
that for some databases, you would have to use the timestamp type and limit
its accuracy. But then, defining a date with day accuracy on oracle also
needs limiting the accuracy to dates.

I am not sure whether this can be implemented with village

  Thomas


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



db layout guidelines

2005-05-10 Thread Thomas Fischer
Hi,
From some discussion on the user list, I thought that there should be a 
guideline regarding the layout of a database. This is not meant as a 
course in relational databases, just some things that one should do or 
should not do when using Torque. I have appended my first draft. Comments, 
clarifications, addons, suggestions are very welcome

  Thomas
 body
  p
   This Howto is NOT meant as a complete guide of how to design a 
database.
   On the contrary, it is meant as a kind of addon, in the sense of
   qout;If I know in general how to design a databasein general, what do 
I have to
   keep in mind if I use Torque to access that databasequot;
  /p

  section name=Things not supported by Torque
   subsection name=Column names
p
 There are some column names which you can not use in Torque although
 your database would support them. The reason is that they would
 produce constants twice in the generated code. The column names which
 cannot be used are (case is ignored)
 ul
  liTABLE_NAME/li
  liDATABASE_NAME/li
 /ul
/p
p
 Furthermore, it is recommended that you do not use words which have
 a defined meaning in SQL as column names. Even if you can trick your
 database into using them, it is not sure whether Torque can do the 
same.
 And besides, even if it works for one database, if you ever decide
 to use another database, you may run into trouble then.
/p

   /subsection
  /section
  section name=Design considerations
   p
These design considerations apply to the most common uses of a
database.
In some cases where you have to meet very special challanges, there
will
be reasons not to follow the advice given here. But in general,
you will be fine if you follow these guidelines.
   /p
   subsection name=primary keys
p
 For every table, you should create a primary key which has no meaning
 in real life. The reasons for this are:
/p
p
 You should use a primary key at all because it creates a well-defined
 link between the objects in the database and the objects in memory.
 Often, one has to decide whether a java object in memory describes
 quot;the samequot object as a certain row in the database. For 
example,
 if you read an object from a database, change a field value and
 write it again to the database, you would usually want to update the
 row you read from. This is only possible if Torque can find the row
 in the database from which the object originated. For this, the
 primary key is used in relational databases. If two java objects have
 the same primary key, they describe quot;the same thingquot; and 
refer
 to the same row in the database. If you do not have
 a primary key, there is no well-defined way to decide if two java
 objects describe quot;the same thingquot;. You might run into not
 being able to update an object.
/p

p
 Now that we know why we want to have a primary key at all, why should
 it have no meaning in real life ? This can be explained best by an
 example. Consider a table which holds manufactured parts. Each part
 has an unique serial number. So it is tempting to use the serial
 number
 as a primary key. But now imagine that we have registered the wrong
 serial number for a certain part in the database. Remember that
 the primary key is used to decide quot;is it the same object?quot;
 So we cannot change the serial number of a specified object without
 making it another object.
/p
p
 In Torque, this problem manifests itself in that there is no easy way
 to change the primary key of an object; you must trick Torque into it
 by using Torque's internals. This should be avoided if possible.
 If you use an additional primary key which
 has no meaning in real life, you do not run into that trouble.
/p
   /subsection
  /section
 /body
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


cvs commit: db-torque/src/rttest/org/apache/torque DataTest.java

2005-05-10 Thread tfischer
tfischer2005/05/10 12:18:08

  Modified:src/rttest/org/apache/torque DataTest.java
  Log:
  Added a simple testcase for reading and writing LOBS
  Appended a check to a testcase that newly generated objects are filled with 
autogenerated ids
  
  Revision  ChangesPath
  1.19  +58 -1 db-torque/src/rttest/org/apache/torque/DataTest.java
  
  Index: DataTest.java
  ===
  RCS file: /home/cvs/db-torque/src/rttest/org/apache/torque/DataTest.java,v
  retrieving revision 1.18
  retrieving revision 1.19
  diff -u -r1.18 -r1.19
  --- DataTest.java 8 May 2005 17:27:25 -   1.18
  +++ DataTest.java 10 May 2005 19:18:08 -  1.19
  @@ -18,6 +18,7 @@
   
   import java.text.DateFormat;
   import java.text.SimpleDateFormat;
  +import java.util.Arrays;
   import java.util.Date;
   import java.util.HashMap;
   import java.util.Iterator;
  @@ -37,6 +38,8 @@
   import org.apache.torque.test.IntegerPk;
   import org.apache.torque.test.LargePk;
   import org.apache.torque.test.LargePkPeer;
  +import org.apache.torque.test.LobTest;
  +import org.apache.torque.test.LobTestPeer;
   import org.apache.torque.test.MultiPk;
   import org.apache.torque.test.MultiPkForeignKey;
   import org.apache.torque.test.MultiPkPeer;
  @@ -87,6 +90,8 @@
   Author author = new Author();
   author.setName(Author  + i);
   author.save();
  +assertTrue(authorId should not be 0 after insert,
  +author.getAuthorId() != 0);
   
   for (int j = 1; j = 10; j++)
   {
  @@ -850,6 +855,58 @@
   a.setName(has Single ' Quote);
   a.save();
   }
  +
  +public void testLobs() throws Exception
  +{
  +// clean LobTest table
  +{
  +Criteria criteria = new Criteria();
  +criteria.add(
  +LobTestPeer.ID, 
  +(Long) null, 
  +Criteria.NOT_EQUAL);
  +LobTestPeer.doDelete(criteria);
  +}
  +
  +// create a new LobTest Object with large blob and clob values
  +// and save it
  +LobTest lobTest = new LobTest();
  +{
  +int length = 10;
  +byte[] bytes = new byte[length];
  +StringBuffer chars = new StringBuffer();
  +String charTemplate = 1234567890abcdefghijklmnopqrstuvwxyz;
  +for (int i = 0; i  length; ++i)
  +{
  + bytes[i] = new Integer(i % 256).byteValue();
  +chars.append(charTemplate.charAt(i % charTemplate.length()));
  +}
  +lobTest.setBlobValue(bytes);
  +lobTest.setClobValue(chars.toString());
  +}
  +lobTest.save();
  +
  +// read the LobTests from the database
  +// and check the values against the original values
  +List lobTestList = LobTestPeer.doSelect(new Criteria());
  +assertTrue(lobTests should contain 1 object but contains  
  ++ lobTestList.size(),
  +lobTestList.size() == 1);
  +
  +LobTest readLobTest = (LobTest) lobTestList.get(0);
  +boolean bytesDiffer = false;
  +
  +assertTrue(read and written blobs should be equal. 
  ++ Size of read blob is
  ++ readLobTest.getBlobValue().length
  ++  size of written blob is 
  ++ lobTest.getBlobValue().length, 
  +Arrays.equals(
  +lobTest.getBlobValue(),
  +readLobTest.getBlobValue()));
  +assertTrue(read and written clobs should be equal, 
  +lobTest.getClobValue().equals(readLobTest.getClobValue()));
  +}
   
   
   /**
  
  
  

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



AW: BLOB and CLOB Torque datatypes

2005-05-10 Thread Joerg Friedrich
Hi Thomas,

I didn't want to suggest abandoning BLOBs or CLOBs. I just wanted to point
out some problems with these data types we have had in the past, and which
were the reason to not use them in our project. These were not necessarily
Torque problems, the biggest problem we had was with the way the Oracle JDBC
driver worked. We support several database system (currently MySQL,
Firebird/Interbase, MS SQL Server, HSQLD, Oracle and PostgresSQL), so we
have the problem of keeping the code base together.

With regard to DATE, TIME, and TIMESTAMP in SQL92 (I am not an expert on
this):

DATE   in format -MM-DD
TIME   in format HH:MM.SS. (seconds precision is implementation defined)
TIMESTAMP in format -MM-DD HH:MM.SS. (seconds precision is
implementation defined)
TIME(p) or TIMESTAMP(p): format with seconds to p digits precision

and then there are 

TIME(p) WITH TIME ZONE   (not supported by most RDBMS)
TIMESTAMP(p) WITH TIME ZONE  (not supported by most RDBMS)

time zone specified as +HH:MM,... or -HH.MM,...

So far for the theory. In most cases if you don't care about the last byte
of storage space TIMESTAMP should cut it for you. But...

don't forget the range! For example MySQL doesn't permit dates in TIMESTAMP
fields to go below 1970 and beyond 2037 (the year 2038 problem is already
preprogrammed). Instead, MySQL has a DATETIME field type. Too bad you can't
find it in the SQL92 specification. DATETIME is also popular in MS SQL
Server.

In the current Torque structure, if you need date and time you can only
specify TIMESTAMP in the schema definition file. This can be mapped to
TIMESTAMP for most database systems, but not for MySQL. In MySQL this should
be mapped to DATETIME.

If we specifiy a DATE or TIME field in the schema definition file, we have a
problem with the current adapter function: it knows of only one method for
date/time related fields. I suggested a little while ago to extend this to
support DATE, TIME and TIMESTAMP.

Unfortunately we do not get a lot of help from the Java side to find out
which method is required: a Date always includes time information.

Here I have to bail out, since I haven't looked into the inner workings of
Village and Torque.

My short term suggestion: always use TIMESTAMP, stick with the current one
adapter function but rigidly using the timestamp format, and change the
MySQL mapping of TIMESTAMP to DATETIME. This should support 99% of all
current applications. This would leave a problem only with TIME fields,
which cannot be handled by TIMESTAMP fields.

Joerg




Hi,

Fabio Insaccanebbia [EMAIL PROTECTED] schrieb am 09.05.2005
18:40:47:

 Hi Jörg,

 Particularly the Oracle JDBC drivers used to give a lot of problems
 when trying to use BLOBs.
 
 yes, that's true.. however the last driver version (10g) seems to have 
 solved a couple of problems with BLOBs.. The choice of avoiding 
 BLOBS in an application is probably a good one: Torque, on the other 
 side, should try to solve or reduce the problems with BLOB using 
 so that the application developers can choose to use BLOBs
 without too many issues.

Here, I agree with Fabio. In my opinion, what the user wants is to store a
byte array or a large string in a database, and then looks for a data type
which does this. The user will maybe decide to use a clob and a blob for
this. This may limit the user to certain databases, but maybe this might not
be an issue in his case. Or maybe the user decides to go for maximum
portability, in which case he will use some other data type. But I do not
think that limiting Torque to the subset of features supported by all
databases is a good idea.


 With regard to mapping, there is another problem with date, datetime 
 and timestamp, particularly with newer versions of MySQL which have a 
 very peculiar handling of these types. Even though there is no 
 DATETIME data
type
 in the JDBC interface we may have to provide the possibility to 
 define
such
 in the schema.xml file, since some DBMS use it and clearly 
 distinguish between timestamp, date, and datetime.
 
 The JDBC interface is probably a bit clumsy... the setTimestamp seems 
 the only way to set both Date and Time in the same field. I'm afraid 
 that some driver creators decided to overcome this JDBC limit by 
 ignoring the setDate specified behaviour.

 I agree with the DATETIME proposal: this could give us the flexibility 
 to work around driver's strange behaviours.

Could you please elaborate a bit more on this ? I do not have a SQL
specification ready, so I list what I guess. Please correct me if I'm wrong.
The SQL Date type is supposed to hold dates with day accuracy. The SQL Time
type is supposed to hold time information with second accuracy. It should
not hold any Date information. The SQL Timestamp type is supposed to hold
Date+Time with at least millisecond accuracy.

So I guess that the datetime type should hold date+time information with
second accuracy (like the date type in