On 1/31/06, Nilson Santos Figueiredo Junior <[EMAIL PROTECTED]> wrote: > I was trying to perform some fairly simple benchmarks comparing > Rose::DB::Object, Class::DBI and an in-house solution and came up with > some disappointing results from Rose::DB::Object. > > I'd like to know if there's something I did wrong or that could be > done in a better way since RoseDB was only marginally faster (about > 40%) than Class::DBI (vanilla, without prefetch, nothing).
Depending on what you're testing, 40% isn't necessarily "disappointing." Let's look at what you're doing. First, the class definitions look okay. Just a quick tip: > # Create a private registry for this class > __PACKAGE__->registry(Rose::DB::Registry->new); That's better written as: __PACKAGE__->use_private_registry; Now the actual benchmarks: > sub RDB { > for my $cidade (@{CidadeRDB::Manager->get_cidade(query=>[nome=> { > like => '%a%' },])}) { > my $x = $cidade->id; > my $y = $cidade->nome; > my $z = $cidade->estado_obj; > > my $var = $cidade->estado_obj->pais_obj->nome; > } > } > > sub CDBI { > for my $cidade (CidadeDBI->search_like({nome =>'%a%'})) { > my $x = $cidade->id; > my $y = $cidade->nome; > my $z = $cidade->estado; > > my $var = $cidade->estado->pais->nome; > > } > } For these two simple tasks, let's think about what's being tested. The database is the same for each module, obviously. But the generated SQL is slightly different. RDBO will try to fetch all columns at once: SELECT id, name, estado FROM cidade WHERE name LIKE '%a$' whereas your CDBI class will only fetch the primary key: SELECT id WHERE name LIKE '%a$' Later, when you do this: my $x = $cidade->id; my $y = $cidade->nome; CDBI will go back to the database to fetch those values. To change this, make all of the columns "essential" in your CDBI class: __PACKAGE__->columns(Essential => qw(id nome estado)); Alternately, you could make all of the columns "lazy" in your RDBO class. Then it will mimic the CDBI behavior. The point is to do an apples-to-apples comparison. Another option is to try to make each module go as fast as possible. Your benchmark as it existed did neither. As for the RDBO example, there are a few things you can do to make it go faster. RDBO Manager methods accept a Rose::DB object parameter through which they access the database. If you do not supply one, a new one will be created during each call, and then discarded at the end. By not passing a db object, you're forcing get_cidade() to reconnect to the database each time. Since the database connection is class data in CDBI, it persists between calls in the CDBI example. In any "real" application using many repeated RDBO Manager calls, you'd create a single db object and then pass it to each call. In terms of the benchmark, it'd look like this: my $db = RDBSetup->new; sub RDB { for my $cidade (@{CidadeRDB::Manager->get_cidade(db => $db, query => [ nome => { like => '%a%' } ])}) { ... } Incorporating both these changes, now the SQL is the same and both modules reuse a single database connection. Now what's actually being tested? Basically, it boils down to: 1. SQL generation - How fast can the SQL query be generated based on the "name like ..." abstract parameters? 2. Object instantiation - How fast can the objects be created from the row data? The first item can be sped up a bit in RDBO by passing the "query_is_sql" parameter with a true value. CidadeRDB::Manager->get_cidade(db => $db, query_is_sql => 1, query => [ ... ]); This parameter tells the Manager that there's no need to try to parse and reformat the query arguments. They are already formatted correctly for the database. This is true of the lone query parameter, '%a%', so it's safe to include that flag. (An example of where it wouldn't be safe is a query that includes a "casually formatted" date string or a DateTime object or something else that has to be parsed and then reformatted for the current database.) As for the second thing being tested, object instantiation, that depends on how many rows are actually being returned. If no rows are returned, then it isn't a factor at all and you're just testing SQL generation speed. If thousands of rows are returned, then the SQL generation speed will fade away into the noise and you're just testing object instantiation speed. Finally, we come to this part: # RDBO my $var = $cidade->estado_obj->pais_obj->nome; # CDBI my $var = $cidade->estado->pais->nome; That is, fetching a related object through a foreign key. As written, your benchmark hits the database once for each row in order to fetch the related object. This is an apples-to-apples comparison (or rather, will be once you make all the columns "Essential" in that CDBI class, or lazy in that RDBO class; it's the same issue as before). But any real code written using an RDBO Manager would almost certainly opt to fetch that related object in a single query rather than making one extra query per row. To do this, use the require_objects parameter: CidadeRDB::Manager->get_cidade(db => $db, query_is_sql => 1, require_objects => [ 'estado_obj' ], query => [ ... ]); The resulting SQL would be something like: SELECT t1.id, t1.name, t1.estado, t2.id, t2.nome, t2.pais FROM cidade t1, estado t2 WHERE t1.estado = t2.id AND t1.name LIKE '%a$' Since this feature is not present in Class::DBI, the comparison is now "unfair" from one perspective. OTOH, artificially tying RDBO's hands behind its back will not lead to very representative benchmark results. But it all depends on what you're trying to test. (There are further speed improvements to be had in the RDBO example by making a custom Manager method with a pre-generated SQL query, but that's only worthwhile for queries that you plan to run a lot.) To sum up, it's important decide what it is you really want to test. Making good benchmarks is hard. I did my best to isolate the performance of several common tasks in the benchmark suite that is bundled with Rose::DB::Object. (The script is located at: t/benchmarks/bench.pl) I tried to make each module go as fast as possible while accomplishing the same task, since that's what people will do in the real world. If you want to pursue your own benchmarks, consider the changes I recommended above. But also please take a long look at the source code and classes for the bench.pl script before you decide that you need to test something that isn't represented there. -John ------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Do you grep through log files for problems? Stop! Download the new AJAX search engine that makes searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! http://sel.as-us.falkag.net/sel?cmd=lnk&kid3432&bid#0486&dat1642 _______________________________________________ Rose-db-object mailing list Rose-db-object@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rose-db-object