Hi.

I just want to thank everyone for their interest and advice on this thread.

As far as Postgre/MySQL goes, the reason I'm using mysql is most certainly ignorance. I work for a pretty small company, and the scope and requirements were very loosely defined when I began this project. They have evolved quit since then. I started with MySQL because it is was and is all I have ever used. Thing is, this the first database solution I've ever designed or written. It works really well and it does what its supposed to do. I've setup and indexed a bunch of columns that act as metadata for the column I'm actually interested in, and the indexes give me . For example as I parse the raw data into the database, I search for regular expressions that indicate an entry I'm about to insert are of interest to a particular client. Then you have:

mysql> select count(*) from malware_logs;
+----------+
| count(*) |
+----------+
| 21270325 |
+----------+
1 row in set (0.00 sec)

and also this:

mysql> select count(*) from malware_logs where ownership is not null and ownership regexp '%(client identifier)%';
+----------+
| count(*) |
+----------+
|     5812 |
+----------+
1 row in set (0.09 sec)

so I in .09 seconds i find the 5812 rows i care about out of 21270325. that's pretty impressive in my eyes, and up until I started having this problem today, I've been happy with the decision to use mysql. I guess we'll see what I think as this table continues to grow. Ideally it will be able to get as big as 50 - 60 million rows before I start 'rotating' the older entries off into archive tables.

Like I said, this is my first time writing a database that's not for, you know, some dumb web page of mine. It is also my first time using ORM. I absolutely love it, but I am afraid that I am using Rose::DB incorrectly in some way or other, which is the source of my problem, and my reason for writing to you guys.

On Jan 5, 2007, at 3:50 PM, John Siracusa wrote:
You're not wrong.  That should work fine.

The error looks to me like something in DBD::mysql itself, and it's failing
on $sth->execute(), before any rows are fetched:
[...]
Have you tried replacing the Manager call with a series of plain DBI calls using the same SQL? (Set $Rose::DB::Object::Manager::Debug = 1 before the
Manager call to see the SQL it's using.)

I think you are right on here, John. Consider:

my $sql = "SELECT * FROM malware_logs";
$Rose::DB::Object::Manager::Debug = 1;
my $iterator = MalwareLog::Manager->get_objects_iterator_from_sql($sql);

And low and behold, on STDIN: 'SELECT * FROM malware_logs'

If that is the statement being executed, I understand why its running out of memory. Like I said in my first letter, regardless of number of rows, this is a 12GB table. I think maybe I'm misunderstanding the proper use of get_objects_iterator_from_sql? I do want to select * so that all the columns are available in the statement handler that I'm assuming gets passed to the Rose::DB::Object constructor, or whatever its called in perl, but I obviously can't afford to select all of that at once.

Curiously, however, this fails with the same error:
my $sql = "SELECT id FROM malware_logs";
$Rose::DB::Object::Manager::Debug = 1;
my $array = MalwareLog::Manager->get_objects_iterator_from_sql($sql);

It seems like I should be able to fetch 23M+ ids without running out of memory.

Also, the very first thing I'd try is making a plain DBI script to select th last two rows from that table and run it from the command line. If that
doesn't work, then you know something deeper is wrong...

tried it, works!

What version of DBD::mysql are you running? Does it make any difference if you dump the table and recreate the table in question? Have you tried using
the InnoDB engine?

-John

I am using DBD::mysql 3.0006 from the bsdpan-DBD-mysql-3.0006 package, and DBI bsdpan-DBI-1.51. I have not tried dumping the table and reloading it, maybe I will try that next. Would you recommend recreating it as InnoDB? I have no reason to do that or not to do that, as far as I know, well, I suppose it would be nice to have foreign keys that were actually, you know, restraints. MyISAM was orignally chosen because I thought I'd need FULLTEXT support, but I don't.



On Jan 5, 2007, at 4:23 PM, Jonathan Vanasco wrote:

i think its neither rose::db or dbi -- i think its mysql.
check the comment at the bottom of this.
http://www.mysql.org/doc/refman/5.0/en/operating-system-error- codes.html
also
        http://dev.mysql.com/doc/refman/5.1/en/freebsd.html

That second link was particularly interesting. In my loader.conf I do have a 1G maxdsize, but I'm not setting any of those variables. I have:

kern.maxdsize="1024M"
console="comconsole"

which appears to be a typo for 'maxdsiz' :-)

/etc/defaults/loader.conf describes those 3 settings (from the second link) as follows:
#kern.maxdsiz=""                # Set the max data size
#kern.dfldsiz=""                # Set the initial data size limit
#kern.maxssiz=""                # Set the max stack size

I am going put those recommended values of 1G,1G,128M, reboot, and see what happens.

On Jan 5, 2007, at 5:26 PM, Perrin Harkins wrote:
The DBI drivers for most databases (and this includes Postgres and
Oracle) will load the entire result set into your local machine's RAM
even before you fetch the first row.  With MySQL and Oracle, you can
tell them not to do this. With Postgres, I think you have to use cursors.

For MySQL, just do this before issuing your large query:
$dbh->{'mysql_use_result'}=1;

You probably want to set it back afterward, since the default is better
for small queries (it frees up resources on the server faster).

- Perrin

Yes, as far as I can tell that is exactly what's happening! Where can I set this in the $dbh object managed by Rose::DB?

Thanks again everyone! I'll set those loader.conf variables, try re- loading the db as InnoDB and then let y'all know what happens. But really, is the call I'm making to get_object_iterators_from_sql the right way to go about doing what I'm doing?

-Neal

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Rose-db-object mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rose-db-object

Reply via email to