-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
I have been working on a temporary tables plugin/extension for ActiveRecord
since early January 2006 as I have needed to support
hundreds of thousands of rows being inserted at a time every few hours. I
mentioned it on the list and got a few private emails
which has led me to believe other people have this problem.
As time has progressed I have extended AR to support faster and more
operations. I am a MySQL user so these only apply to MySQL. I
know that DHH has made public comments about trying to stay implementation
agnostic when it comes to the different databases that
AR supports, so perhaps my plugin is best as a plugin only. Thoughts?
I want to ensure that the API interface is easy and friendly. I don't want to
change interfaces for each update, i know how much
of a pain in the arse that can be.
I'd like to do the following things in this email:
- describe issues I have with AR
- describe the API I have used to solve them
- ask for feedback
Here are the issues I've had with AR, the ones I've set out to fix.
1 - to slow for large insertions of data, due to creating too many objects in
ruby
2 - no easy support for multiple value insertions via one statemente. This has
two drawbacks:
a. more then needed data is sending over the mysql client/server
connection
b. mysql has to go through sending query and parsing query for each
insert statement.
This is to much overhead for large insertions of data
3 - no in memory temporary table support
4 - no fast ability copy data from one table to another
5 - no ability to enable/disable foreign keys
6 - no ability to drop primary keys (yes there are times when this is needed)
Here is the API I have implemented to solve these.
# Create a temporary in memory table off from an existing model
temp_model = MyTable.create_temporary_table
# temp_model can also be referenced as TempMyTable
# Create a temporary in memory table off from an existing model and specify
it's name
temp_model = MyTable.create_temporary_table :name=>'MyTemporaryTable'
# temp_model can also be referenced as MyTemporaryTable
# drop temporary models
temp_model.drop
MyTemporaryTable.drop
# redefine columns no the temporary model
temp_model.redefine_column :id=> 'id int default 0',
:field1=>'new_field varchar(255)'
# drop primary keys on primary tables
temp_model.drop_primary_key
# enable/disable foreign key checks on ActiveRecord::Base
ActiveRecord::Base.foreign_key_checks :disable
ActiveRecord::Base.foreign_key_checks :enable
# support streamlined insert operations, using subarrays
# which map the a column identifying array
columns = [ :id, :column1, :column2 ],
values = [ [1,'value'], [2, 'value'], [3,'value'] ]
MyModel.insert columns, values
# copy data from one table into another
options = { :on_duplicate_key_update=>{ :column1 => :column2 }
MyModel.insert_into MyTemporaryTable, options
This increases performance 1000x when working with large datasets. Now granted,
you aren't going to use this for someone browsing
your web site who fill out a webform and click submit, it isn't intended for
that audience. It is intended for people who are
using (or want to use) RoR for large internet or even intranet driven corporate
sites, which have to be able to handle mass
updates of data several times a day.
I currently use RoR on a few small web sites, and some internal webapps, but I
spend 90% of my time with RoR with AR optimizing
it's speed.
Any thoughts, comments or suggestions would be greatly appreciated. All of this
was done with ActiveRecord 1.13.2, so some of this
may have been addressed in AR 1.14.2, if so please let me down gently. ;)
Thanks,
Zach
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFERWyAMyx0fW1d8G0RAvLpAJ41vFNDUvDlVH6QpJuZ9hby9ACQ/QCfcexY
SG9z0x3TK+iBhpNBlYZof+s=
=dxFr
-----END PGP SIGNATURE-----
_______________________________________________
Rails-core mailing list
[email protected]
http://lists.rubyonrails.org/mailman/listinfo/rails-core