Hi!

There were more questions about Innobase:

>Here is the coming question - can you say some more about the space
>allocation concepts used in Innobase? For example, how would it
>work with a huge number (10,000+) of tables, and how space allocation

The data dictionary is hashed, so a huge number of tables is no problem.
For small tables, space is allocated one database page at a time
(default 16 kB). For big tables, space is allocated in chunks of
consecutive pages, up to 256 pages each.
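
As a rough sketch of such a policy (the thresholds below are purely hypothetical, not Innobase's actual code), the idea is: grow tiny tables a page at a time, and switch to progressively larger extents of consecutive pages, capped at 256:

```python
PAGE_SIZE = 16 * 1024       # default page size mentioned above
MAX_EXTENT_PAGES = 256      # upper bound on one consecutive chunk

def next_allocation(pages_already_allocated):
    """Return how many consecutive pages to allocate next.

    Small tables grow one page at a time; big tables grow in
    progressively larger extents, capped at 256 pages.
    (Hypothetical policy -- the real thresholds are internal to Innobase.)
    """
    if pages_already_allocated < 32:        # assumed small-table threshold
        return 1
    # double the extent size as the table grows, up to the cap
    extent = 2
    while extent < pages_already_allocated // 16 and extent < MAX_EXTENT_PAGES:
        extent *= 2
    return min(extent, MAX_EXTENT_PAGES)
```

This keeps small tables compact while letting huge tables get long runs of consecutive pages, which is what makes table scans fast.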

>is done. I know Innobase supports clustering, but, if I'm not

Yes, all tables in Innobase are clustered by their primary key, that is,
rows are stored physically in the order of their primary key. If you do not
specify a primary key, then a generated row id is used internally as
the primary key: the rows will be ordered according to their insertion
sequence. The support for Oracle-like clustering, physically mixing
several tables, is built in, but I do not have a user interface for it.
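
A toy model of this clustering behaviour (hypothetical names, nothing from the Innobase sources): rows are kept physically sorted by the clustering key, and a monotonically increasing generated row id takes the key's place when no primary key is declared:

```python
import itertools

_row_id = itertools.count(1)   # generated row ids, monotonically increasing

class ClusteredTable:
    """Toy model: rows are kept physically sorted by their clustering key."""
    def __init__(self, has_primary_key):
        self.has_primary_key = has_primary_key
        self.rows = []          # list of (key, row) kept in key order

    def insert(self, row, pk=None):
        key = pk if self.has_primary_key else next(_row_id)
        self.rows.append((key, row))
        self.rows.sort(key=lambda kr: kr[0])   # keep clustering order

t = ClusteredTable(has_primary_key=False)
for name in ["c", "a", "b"]:
    t.insert(name)
# without a primary key, rows stay in insertion order
assert [r for _, r in t.rows] == ["c", "a", "b"]
```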

>mistaken, only for rows from one table. The next thing is: is there a
>difference in handling dynamic and static rows, and how is
>fragmentation avoided for the latter?

You probably mean: what happens if the size of a row changes?
Innobase uses no overflow pages. If a row is updated so that it
no longer fits on its page, we split the B-tree page to get more space.
This will of course slow down table scans, so recreating tables where
this happens may be sensible from time to time.
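
A minimal sketch of the no-overflow-pages idea (toy page capacity in bytes, not the real 16 kB page layout): instead of chaining an overflow page, an overfull page is split into two half-full pages:

```python
PAGE_CAPACITY = 100   # bytes per page in this toy model (real pages are 16 kB)

def update_row(pages, page_no, slot, new_row):
    """Update a row in place; split the page if the row no longer fits.

    Sketch of the no-overflow-pages strategy: rather than chaining an
    overflow page, the B-tree page is split into two half-full pages.
    """
    page = pages[page_no]
    page[slot] = new_row
    if sum(len(r) for r in page) > PAGE_CAPACITY:
        mid = len(page) // 2
        pages[page_no] = page[:mid]            # left half stays in place
        pages.insert(page_no + 1, page[mid:])  # right half goes to a new page
```

Repeated splits leave pages half empty and scatter them on disk, which is why a table scan slows down and a periodic table rebuild can help.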

>As far as I know some space allocation problems are quite different in
>file systems and in databases, so I'm surprised you're speaking about
>similar algorithms.

Yes, in file systems you cannot grow a file in the middle, only at the
end.

>Other thing is - does Innobase support hash indexes internally ?

Yes, it automatically creates hash indexes; no user configuration is
needed.
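
Conceptually, such an automatic hash index is a cache built as a side effect of ordinary index lookups: keys that are looked up get hashed to their location, so repeated lookups skip the tree descent. A toy sketch (hypothetical class, not the Innobase implementation):

```python
class AdaptiveHashIndex:
    """Toy sketch: a hash table built automatically on top of tree lookups."""
    def __init__(self, btree_lookup):
        self.btree_lookup = btree_lookup   # slow path: full B-tree descent
        self.cache = {}                    # key -> row, filled as a side effect

    def find(self, key):
        if key in self.cache:              # fast path, no tree descent
            return self.cache[key]
        row = self.btree_lookup(key)       # slow path
        if row is not None:
            self.cache[key] = row          # remember for next time
        return row
```

Because the hash entries are derived from the B-tree, the user never has to declare or maintain the hash index.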

>Other thing - can I increase a file's size instead of adding files? How
>will Innobase handle it if I do so?

Currently, you cannot increase the size of a data file; you have
to add a new one. Increasing the size of the last data file would of course
be technically easy to implement.

>A few words about creation. As I looked at the Innobase files, they are
>mostly zero-filled when created. I think it would be nice to add the
>possibility to create files not by write() but by lseek(); this would
>make creation faster, and files would take less space on file systems which
>support holes in files - the latter is good, as you can just create huge
>files in the very beginning and not care much about wasting the space.

The problem is that the OS may fragment the files if they are not
physically allocated.

>>D) can you safely add new files when there is data in them already?
HT> Yes you can, but you have to shut down the database first.
>Surely a special command may be added for this later :)

Yes, in Oracle you can add a data file while the database is up
(ALTER TABLESPACE ADD DATAFILE), and it is easily implementable.

>Other thing is - don't you think it's not quite a good idea to store the
>database configuration in a config file? For now it's quite simple and
>can be easily recovered by looking at the files and their sizes, but
>when you have a more complex configuration (i.e. several tablespaces
>with external mirroring) it will be hard to recover.

You have to keep backups of .cnf files also.

>Other question - files sizes. Does innobase itself support 4GB+ files?

The support is built in, but not tested yet.

>>   4) Currently ATIS test fails with innobase table:
>> Retrieving data
>> Warning: Query 'select
>> city.city_name,state.state_name,city.city_code from state,city
>> where city.state_code=state.state_code' returned 1 rows when it
>> should have returned 11 rows
>> Got error:  when executing select
>> flight.flight_code,aircraft.aircraft_type from flight,aircraft
>> where flight.aircraft_code=aircraft.aircraft_code
>> got 0 instead of 579 ***
HT> You may have an outdated version of the MySQL source distribution.
HT> (I am currently exchanging lots of emails with Peter and looking at the
HT> problem.)
>I've currently patched my version but still get the same error. Other
>tests (i.e. big tables) now pass.

The Linux version runs the test without errors on my machine. Maybe
the source repository you are using is still somewhat outdated.

HT> physically writes its data files full at database creation, so that
HT> the space is physically occupied.
>Yes, of course. But I think this should be mentioned in the manual so
>users are not surprised that mysqld is doing something for several minutes
>after startup before it's ready.

Will be added to the manual, and to the message the database prints when
a new data file is created.

>Do you have plans for external RAID support, i.e. striping and
>mirroring? At least in a raw device configuration I think this may give
>better performance than OS/hardware RAID.

No, I have no plans for simulating RAID in the database server;
it is done on a lower level, like RAID disk hardware. We have
to measure the speed of hardware RAID systems to find out if
there is a difference compared to individual disks.

HT> For best performance, you should create big files, so that disk
HT> seeks are minimized.
HT> If you lose a file, you must do an archive recovery from a backup,
HT> using archived log files of MySQL, (not log files of Innobase).
>This is a problem. Currently MySQL does not have a feature for online
>backup (planned for 4.0) - so to make a really consistent snapshot of my
>MySQL tables (20 GB per server) I need to lock the system for about half
>an hour, which is not acceptable in 24/7 mode.

The online backup capability is built into Innobase.
You could take an online backup from Innobase by switching the
database to online backup mode and just physically copying the
database files, and also the generated log files, just like in Oracle.
But the problem currently is that the backup is not in sync with
tables stored in non-Innobase formats, like MyISAM.
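
The procedure could be sketched roughly like this, assuming hypothetical begin_backup()/end_backup() hooks that switch the server into backup mode and back (Innobase and Oracle each have their own actual mechanism):

```python
import os
import shutil

def online_backup(data_files, log_files, dest, begin_backup, end_backup):
    """Sketch of an Oracle-style hot backup.

    begin_backup/end_backup are hypothetical hooks representing the
    server's switch into and out of online backup mode; while in backup
    mode, the data and log files can be physically copied.
    """
    begin_backup()                       # server enters online backup mode
    try:
        for f in data_files + log_files:
            shutil.copy(f, os.path.join(dest, os.path.basename(f)))
    finally:
        end_backup()                     # server resumes normal operation
```

The copied log files are what make the physically copied data files consistent at restore time.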

>Other thing is log files - it may take quite a long time to process them
>(an additional problem: there is only one log file stream). In my case I
>tried to recover with 3 days of logs and it took about 3 days :)

One could sort Innobase archived log files and do a fast recovery in a
single sweep of the disk, but again we have the sync problem with
other table formats.
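
A sketch of that single-sweep idea (toy record format (page_no, lsn, data), not the real log layout): sorting the redo records by page number, while keeping records for the same page in log order, lets recovery visit each page once instead of seeking back and forth:

```python
def single_sweep_recovery(log_records, apply):
    """Sort redo records by (page_no, lsn) and apply them in one pass.

    Records for the same page stay in log order (the lsn), but pages
    are visited in ascending order, so the disk is swept once rather
    than being seeked randomly for every log record.
    """
    for page_no, lsn, data in sorted(log_records):
        apply(page_no, data)
```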

HT> I will have to write CHECK TABLE (in Oracle called ANALYZE). I am not
HT> sure if repair table is a good idea, Oracle DBA manual at least does
HT> not mention such a procedure for Oracle. If the database gets
HT> corrupted, then the safest method is to recover it from a backup which
HT> is old enough so that corruption does not occur in the backup. Or recover
HT> the tables from dumps and MySQL archived log files.
>It's a bad idea to look at Oracle in this case. Look not only at the
>high end but at the low end also. In my case with MySQL most damage
>only affects indexes, so recovery is possible even without data loss.
>Other thing is, not everyone has a complete backup (data + whole
>logfiles), so I think it's not quite a good idea to skip a recovery
>tool. It's like with file systems - you usually do backups, but I don't
>think you would like it if you were not able to fix your file system if
>it got damaged.

How do MySQL indexes get corrupted? Does it happen when the
database crashes? Since Innobase has recovery, a database
crash usually does not corrupt data or index files. I will have to look
at other mainstream databases, like MS SQL Server and DB2, to see if
they have repair tools.

If the data in a table is not corrupted, you can try to fix the
indexes by dropping them and recreating them. But I still think this
is somewhat risky.
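
The drop-and-recreate approach amounts to rebuilding the index from the surviving rows; a toy sketch (hypothetical helper, the index modelled as a key-to-row-positions map):

```python
def rebuild_index(rows, key_func):
    """Recreate a secondary index from scratch out of uncorrupted rows.

    Sketch of 'drop the index and recreate it': the old (possibly
    corrupt) index is discarded entirely, and a fresh key -> row
    positions map is built from the table data alone.
    """
    index = {}
    for pos, row in enumerate(rows):
        index.setdefault(key_func(row), []).append(pos)
    return index
```

This only works if the table data itself is intact, which is exactly the caveat above.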

But I would not say Oracle is a 'high-end' database, something different
from 'low-end' databases. I mean that all well-written relational
databases are technically rather similar: databases of all sizes
need about the same technical infrastructure in the database server
code.

>The problem here is that, like in a file system, you can't just check one
>file; you have to check the whole file system, as there could be cross
>links or bitmap errors. So you'll probably have to separate the
>physically correct state (allocation and such stuff), which is checked
>at the file level, from the logically correct state (i.e. the index
>matches all rows), where the latter can be fixed at the table level.
>Making the distinction could help, as it would allow recovering one
>table while others are online.
>Other thing - do you have any flags like MyISAM has - like crashed and so on?

In Oracle you can recover from logs a single data file, while others are
online. This would help a bit. Do you mean the flags that the MySQL check
procedure sets on files?

>>  8) As I understand, Innobase works with large files, self-managing
>>  space usage in them - so is it planned to support raw devices with Innobase?
HT> Yes, Innobase will support raw disks. I have to test it still.

>Good. This should really give much performance. -- Best regards,

I am not sure if using raw disks helps a lot. I measured the speed difference
for a disk scan on one machine, and I think it was less than 5 %. But the
other benefit is that the OS cannot fragment a raw disk.

Thank you for your comments!

>Peter                            mailto:[EMAIL PROTECTED]

Heikki


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
