Hello Phil,

Thanks for your patch. I will apply it for 11.0.4; some more comments are
inline.

On 21.05.21 18:26, Phil Stracchino wrote:
Eric,

The following would be my first-cut patch.

The other way to handle restarting/resuming the copy operation when
updating to 11.0.2 would be instead of truncating the file_temp table if
it already exists, create it with its PRIMARY key already in place and
then use INSERT IGNORE instead of a simple INSERT.  This would ensure
that no record is copied twice and allow resuming the copy operation
where it left off.  (There would still be some loss of efficiency in
skipping records already copied, but at least they would not have to be
written to storage a second time.)
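
As a rough sketch of the resume idea above, the following uses Python's
built-in sqlite3 module as a stand-in for MySQL. The two-column table layout
is a hypothetical simplification, not the real catalog schema, and SQLite
spells the statement INSERT OR IGNORE where MySQL uses INSERT IGNORE. It only
illustrates that, with the primary key in place from the start, re-running
the copy after an interruption does not duplicate rows:

```python
import sqlite3


def copy_rows(conn, limit=-1):
    """Copy rows from File into file_temp, skipping keys already present.

    A negative LIMIT means "no limit" in SQLite; a positive one is used
    here only to simulate a copy that was interrupted part-way.
    """
    conn.execute(
        "INSERT OR IGNORE INTO file_temp (FileId, Filename) "
        "SELECT FileId, Filename FROM File ORDER BY FileId LIMIT ?",
        (limit,),
    )
    return conn.execute("SELECT COUNT(*) FROM file_temp").fetchone()[0]


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical, heavily simplified stand-in for the File table.
    conn.execute("CREATE TABLE File (FileId INTEGER PRIMARY KEY, Filename TEXT)")
    conn.executemany(
        "INSERT INTO File VALUES (?, ?)",
        [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
    )
    # Create the target with its PRIMARY KEY already in place, and keep it
    # if it already exists instead of truncating it.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_temp "
        "(FileId INTEGER PRIMARY KEY, Filename TEXT)"
    )
    after_partial = copy_rows(conn, limit=2)  # simulated interrupted run
    after_resume = copy_rows(conn)            # re-run over the whole table
    return after_partial, after_resume


if __name__ == "__main__":
    print(demo())
```

The second run still scans every source row, which is the loss of efficiency
mentioned above, but rows that were already copied are not written again.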

Yes. I'm not an expert here; if you have a good script to convert the catalog,
I'm sure it will be appreciated.

However, there is a non-obvious schema issue that this does not address.
  There are many places where we are currently using TINYBLOB types to
contain data as small as md5 checksums.  This is a common, but bad,
idea.  It is bad because the MySQL MEMORY storage engine, used for
explicit and implicit temporary tables, does not support the BLOB/TEXT
types, and so any temporary table that contains any column of a BLOB or
TEXT type will be forced to disk, with obvious performance impact.


This is the kind of explanation I was looking for, thanks.

Clearly there are places where the size of the data is unknown and
potentially large, and there we have little alternative but to use an
appropriate BLOB type.  But where we want to store binary data of a
known and manageable maximum length, we should be using VARBINARY
instead for performance reasons.

I have already in the past converted many of the TINYBLOB columns in my
Bacula catalog schema to VARBINARY with no ill effects.  I now need to
redo a few of them because we just rewrote the File table.  :)

From my most recent nightly DB backup, these are the columns I currently
have converted to VARBINARY:

minbar:root:/dbdumps/minbar-20210521-04:55:25 # zgrep -i varbinary bacula*schema.sql.gz
bacula.Client-schema.sql.gz:  `Name` varbinary(64) NOT NULL,

A directive name can be up to 127 bytes, and I'm looking to extend it to 256 or
even more in the short term.

VolumeName, MediaType, etc. have the same possible length.

I also need to re-convert File.MD5 to VARBINARY(32) and, now, change
File.Filename to VARBINARY(255).

What is the maximum possible size of File.LStat?  That is another good
candidate to become a VARBINARY.

At this time, we can store up to a SHA512 in base64. I would suggest doing a
test.
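
As a starting point for such a test, here is a small sketch that computes how
long a base64-encoded digest is for a few common algorithms. It uses Python's
standard base64 module, which is an assumption: Bacula's own encoder may pad
or encode differently, so the numbers should be checked against real catalog
data before sizing any column.

```python
import base64
import hashlib


def base64_digest_lengths(data=b"example"):
    """Return {algorithm: (padded_len, unpadded_len)} for standard base64."""
    lengths = {}
    for name in ("md5", "sha1", "sha256", "sha512"):
        digest = hashlib.new(name, data).digest()
        encoded = base64.b64encode(digest)
        # Standard base64 pads with '='; also report the length without it.
        lengths[name] = (len(encoded), len(encoded.rstrip(b"=")))
    return lengths


if __name__ == "__main__":
    for name, (padded, unpadded) in base64_digest_lengths().items():
        print(f"{name:7s} padded={padded:3d} unpadded={unpadded:3d}")
```

Under that assumption, a SHA512 digest needs well over 32 bytes once encoded,
so the column width would have to be chosen with the largest supported digest
in mind.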

Thanks for your patch, this is very much appreciated.

Best Regards,

Eric




_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
