[Rails-core] Tickets #4433 and #4769 - AssociationProxy stuff

2006-04-18 Thread Josh Susser


Regarding:  http://dev.rubyonrails.org/ticket/4433

I created a small patch for this ticket that makes an association's  
owner and reflection available to code in extensions.  I'd like to  
get some feedback on the approach I took and some indication as to  
whether this might be accepted into the trunk at some point.  While  
it's not critical for my work, it would sure make some things a lot  
simpler.


Also:  http://dev.rubyonrails.org/ticket/4769

This is related to #4433.  Someone on the email list ran into this  
problem, and while I was working on #4433 I kept wondering about the  
issue.  When building a proxy wrapper class you don't want to block  
any methods on the target object that people might want to use.   
Unfortunately that's what's happening here.  It might be useful to  
consider what to do about this in conjunction with #4433 so that we  
can get some consistency.


"You can be consistent, or you can be inconsistent, but you can't be  
both." -- Kent Beck, 1988


--
Josh Susser
http://blog.hasmanythrough.com


___
Rails-core mailing list
Rails-core@lists.rubyonrails.org
http://lists.rubyonrails.org/mailman/listinfo/rails-core


[Rails-core] ActiveRecord , Looking For Feedback

2006-04-18 Thread zdennis
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

I have been working on a temporary tables plugin/extension for ActiveRecord 
since early January 2006 as I have needed to support
hundreds of thousands of rows being inserted at a time every few hours. I 
mentioned it on the list and got a few private emails
which has led me to believe other people have this problem.

As time has progressed I have extended AR to support faster and more 
operations. I am a MySQL user so these only apply to MySQL. I
know that DHH has made public comments about trying to stay implementation 
agnostic when it comes to the different databases that
AR supports, so perhaps my plugin is best as a plugin only. Thoughts?

I want to ensure that the API interface is easy and friendly. I don't want to 
change interfaces for each update, i know how much
of a pain in the arse that can be.

I'd like to do the following things in this email:
 - describe issues I have with AR
 - describe the API I have used to solve them
 - ask for feedback

Here are the issues I've had with AR, the ones I've set out to fix.
 1 - to slow for large insertions of data, due to creating too many objects in 
ruby
 2 - no easy support for multiple value insertions via one statemente. This has 
two drawbacks:
  a. more then needed data is sending over the mysql client/server 
connection
  b. mysql has to go through sending query and parsing query for each 
insert statement.
 This is to much overhead for large insertions of data

 3 - no in memory temporary table support
 4 - no fast ability copy data from one table to another
 5 - no ability to enable/disable foreign keys
 6 - no ability to drop primary keys (yes there are times when this is needed)

Here is the API I have implemented to solve these.

   # Create a temporary in memory table off from an existing model
   temp_model = MyTable.create_temporary_table
   # temp_model can also be referenced as TempMyTable

   # Create a temporary in memory table off from an existing model and specify 
it's name
   temp_model = MyTable.create_temporary_table :name=>'MyTemporaryTable'
   # temp_model can also be referenced as MyTemporaryTable

   # drop temporary models
   temp_model.drop
   MyTemporaryTable.drop

   # redefine columns no the temporary model
   temp_model.redefine_column :id=> 'id int default 0',
 :field1=>'new_field varchar(255)'

   # drop primary keys on primary tables
   temp_model.drop_primary_key

   # enable/disable foreign key checks on ActiveRecord::Base
   ActiveRecord::Base.foreign_key_checks :disable
   ActiveRecord::Base.foreign_key_checks :enable

   # support streamlined insert operations, using subarrays
   # which map the a column identifying array
   columns =  [ :id, :column1, :column2 ],
   values =  [ [1,'value'], [2, 'value'], [3,'value'] ]
   MyModel.insert columns, values

   # copy data from one table into another
   options = { :on_duplicate_key_update=>{ :column1 => :column2 }
   MyModel.insert_into MyTemporaryTable, options

This increases performance 1000x when working with large datasets. Now granted, 
you aren't going to use this for someone browsing
your web site who fill out a webform and click submit, it isn't intended for 
that audience. It is intended for people who are
using (or want to use) RoR for large internet or even intranet driven corporate 
sites, which have to be able to handle mass
updates of data several times a day.

I currently use RoR on a few small web sites, and some internal webapps, but I 
spend 90% of my time with RoR with AR optimizing
it's speed.

Any thoughts, comments or suggestions would be greatly appreciated. All of this 
was done with ActiveRecord 1.13.2, so some of this
may have been addressed in AR 1.14.2, if so please let me down gently. ;)

Thanks,

Zach







-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFERWyAMyx0fW1d8G0RAvLpAJ41vFNDUvDlVH6QpJuZ9hby9ACQ/QCfcexY
SG9z0x3TK+iBhpNBlYZof+s=
=dxFr
-END PGP SIGNATURE-
___
Rails-core mailing list
Rails-core@lists.rubyonrails.org
http://lists.rubyonrails.org/mailman/listinfo/rails-core


Re: [Rails-core] ActiveRecord , Looking For Feedback

2006-04-18 Thread zdennis
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

zdennis wrote:

> 
> I currently use RoR on a few small web sites, and some internal webapps, but 
> I spend 90% of my time with RoR with AR optimizing
> it's speed.
> 

I forgot to add, the 90% of my time with AR is because of a huge intranet 
driven corporate system, which includes web sites, java
based thick clients and a kick ass backend using ruby, rails, ruby-libxml 
bindings. The company is Fortune 50... so you can
imagine the # of people we are supporting and why speed and efficiency is a 
concern for me.

Zach
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFERXCNMyx0fW1d8G0RAun0AJ459OFHyHMqIV6ndhEgD4udF+oTugCeORhR
vmzdzseMiB6mJpI2/Fn4ais=
=KvFX
-END PGP SIGNATURE-
___
Rails-core mailing list
Rails-core@lists.rubyonrails.org
http://lists.rubyonrails.org/mailman/listinfo/rails-core


Re: [Rails-core] ActiveRecord , Looking For Feedback

2006-04-18 Thread Michael Koziarski
> > I currently use RoR on a few small web sites, and some internal webapps, 
> > but I spend 90% of my time with RoR with AR optimizing
> > it's speed.

Are you only noticing this speed problem with big batch operations
like you mentioned in the earlier email?

When working with an ORM, it's fairly common to do heavy data imports,
batch loads and the like using the database's built in tools, what
makes you want to use AR instead of tools like mysqldump etc?


--
Cheers

Koz
___
Rails-core mailing list
Rails-core@lists.rubyonrails.org
http://lists.rubyonrails.org/mailman/listinfo/rails-core


Re: [Rails-core] ActiveRecord , Looking For Feedback

2006-04-18 Thread zdennis
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Michael Koziarski wrote:
>>>I currently use RoR on a few small web sites, and some internal webapps, but 
>>>I spend 90% of my time with RoR with AR optimizing
>>>it's speed.
> 
> 
> Are you only noticing this speed problem with big batch operations
> like you mentioned in the earlier email?

The bigger the batch, the more time it takes, so it's easier to notice big 
speed issues with 500,000 records, rather then 50
records. Alot of what rails (and AR) give the user in terms of flexibility and 
ease of use comes with a huge performance penalty.
I am willing to sacrifice niceties for better performance.

> When working with an ORM, it's fairly common to do heavy data imports,
> batch loads and the like using the database's built in tools, what
> makes you want to use AR instead of tools like mysqldump etc?
> 

There is more to my story though, but I didn't want to rant on to long in my 
last email. Myself and a colleague of mine, Mark Van
Holstyn, came up with a idea for an IO Mode plugin for AR. It works 
beautifully. You can turn it off and on, on the fly. It is
great for debugging rails SQL commands if you don't wan't to parse log files. I 
use my AR plugin in combination with this AR IO
mode plugin to generate all of my SQL statements and I have them written to 
disk. I then batch load them via a mysql command line
client.

In one instance every few hours I receive an update from another system. Some 
updates are very small, a few kb in size, other
updates it's 15Mb to 40Mb of straight data. It doesn't come formatted in SQL 
statements, so I have to parse the data myself.

It is faster for me to deal with my AR plugin API, and let AR do all of the 
hard work for me, generating statements, quoting, etc.
 plus I get all of the already builtin AR functionality for free should I need 
to extend my work.

It is also easy to maintain, since we use AR, ruby and the rest of rails for 
alot of this system on the backend, we worry about
ruby coding and working withing ruby-based method APIs. We also get a huge code 
base to use, which is nice and easy to extend.

I am working on limiting this performance gap for mysql though, to where things 
don't need to be batch loaded outside of a ruby
instance. This is limited though due to how ruby's GC work, if you want to 
continuous run a daemon that does these huge data
operations, but you never restart the daemon.

Another problem I face is the max_allowed_packet size issue. One of my next 
todo's is allow the user to configure whether or not
they want it dynamically resized per that session, or if they want the AR 
plugin to split up large queries into the least # of
queries to support the current max_allowed_packet size.

And to answer your questions after 6 paragraphs, =) , I use mysqldump and other 
tools where it makes sense. But I need a way to
get from data_format_a to data_format_sql, and I prefer ruby in that regard 
with AR at the moment.

Zach
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFERXuXMyx0fW1d8G0RAuKPAJ920A+gWxpX4+UhTZgKIZgw2RamWQCdFnMX
qWMGmY9yxxs4QdqzIczCP6g=
=m1Bi
-END PGP SIGNATURE-
___
Rails-core mailing list
Rails-core@lists.rubyonrails.org
http://lists.rubyonrails.org/mailman/listinfo/rails-core


Re: [Rails-core] ActiveRecord , Looking For Feedback

2006-04-18 Thread Jeff Lindsay
Your IO Mode plugin sounds interesting too... care to share it? That
and this bulk AR/virtual table both sound useful to me. Release them
as plugins? : )

-jeff

On 4/18/06, zdennis <[EMAIL PROTECTED]> wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Michael Koziarski wrote:
> >>>I currently use RoR on a few small web sites, and some internal webapps, 
> >>>but I spend 90% of my time with RoR with AR optimizing
> >>>it's speed.
> >
> >
> > Are you only noticing this speed problem with big batch operations
> > like you mentioned in the earlier email?
>
> The bigger the batch, the more time it takes, so it's easier to notice big 
> speed issues with 500,000 records, rather then 50
> records. Alot of what rails (and AR) give the user in terms of flexibility 
> and ease of use comes with a huge performance penalty.
> I am willing to sacrifice niceties for better performance.
>
> > When working with an ORM, it's fairly common to do heavy data imports,
> > batch loads and the like using the database's built in tools, what
> > makes you want to use AR instead of tools like mysqldump etc?
> >
>
> There is more to my story though, but I didn't want to rant on to long in my 
> last email. Myself and a colleague of mine, Mark Van
> Holstyn, came up with a idea for an IO Mode plugin for AR. It works 
> beautifully. You can turn it off and on, on the fly. It is
> great for debugging rails SQL commands if you don't wan't to parse log files. 
> I use my AR plugin in combination with this AR IO
> mode plugin to generate all of my SQL statements and I have them written to 
> disk. I then batch load them via a mysql command line
> client.
>
> In one instance every few hours I receive an update from another system. Some 
> updates are very small, a few kb in size, other
> updates it's 15Mb to 40Mb of straight data. It doesn't come formatted in SQL 
> statements, so I have to parse the data myself.
>
> It is faster for me to deal with my AR plugin API, and let AR do all of the 
> hard work for me, generating statements, quoting, etc.
>  plus I get all of the already builtin AR functionality for free should I 
> need to extend my work.
>
> It is also easy to maintain, since we use AR, ruby and the rest of rails for 
> alot of this system on the backend, we worry about
> ruby coding and working withing ruby-based method APIs. We also get a huge 
> code base to use, which is nice and easy to extend.
>
> I am working on limiting this performance gap for mysql though, to where 
> things don't need to be batch loaded outside of a ruby
> instance. This is limited though due to how ruby's GC work, if you want to 
> continuous run a daemon that does these huge data
> operations, but you never restart the daemon.
>
> Another problem I face is the max_allowed_packet size issue. One of my next 
> todo's is allow the user to configure whether or not
> they want it dynamically resized per that session, or if they want the AR 
> plugin to split up large queries into the least # of
> queries to support the current max_allowed_packet size.
>
> And to answer your questions after 6 paragraphs, =) , I use mysqldump and 
> other tools where it makes sense. But I need a way to
> get from data_format_a to data_format_sql, and I prefer ruby in that regard 
> with AR at the moment.
>
> Zach
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.1 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
>
> iD8DBQFERXuXMyx0fW1d8G0RAuKPAJ920A+gWxpX4+UhTZgKIZgw2RamWQCdFnMX
> qWMGmY9yxxs4QdqzIczCP6g=
> =m1Bi
> -END PGP SIGNATURE-
> ___
> Rails-core mailing list
> Rails-core@lists.rubyonrails.org
> http://lists.rubyonrails.org/mailman/listinfo/rails-core
>


--
Jeff Lindsay
http://blogrium.com/
___
Rails-core mailing list
Rails-core@lists.rubyonrails.org
http://lists.rubyonrails.org/mailman/listinfo/rails-core