-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Feel free to redirect me to another list. I am only posting/writing
this here because y'all have been so helpful in the past, and the
guys that write Rose::DB are definitely database experts in my
opinion. Plus this is the only database list I am on.
I am a sysadmin who is good at perl, and then I got handed the role
of developer. I got a lot better at perl, but then I got handed the
role of DBA. I'm trying to be good at that. This project is a few
orders of magnitude over my head, but I like the challenge...and
being employed.
I just have a quick question about database design.
Based on the praise for postgresql from certain members of this
mailing list, my reading the postgre manual/(relevant parts of) Korry
Douglas' book, and my blind hope that postgre handles HUGE tables
better than mysql appears to, I am testing a switch of my product to
postgresql. Being a perl programmer, the way I think is a little bit
weird, and postgre's support for array's attracts me very much; I
want to use them for the sake of using them, but I'm not sure its the
best option for what I'm trying to do.
Disclaimer: this is not a Rose::DB related question (well, there is
one included below somewhere). Again, tell me to F-O if this is the
inappropriate place to ask...
So here's the scoop (slimmed down for relevancy):
table 1 'stuff':
data TEXT,
accounts INTEGER[]
date_added TIMESTAMP
table 2 'accounts':
id SERIAL PRIMARY KEY
name VARCHAR,
email BLAH
regex BLAH
etc. BLAH
if it were legal, stuff.accounts would REFERENCE accounts.id, but
apparently i cannot have an array of foreign keys. so it goes.
raw data is parsed (by a perl script, duh) into the 'stuff' table.
when it looks like something that a certain account would be
interested in (based on accounts.regex), then their id (accounts.id)
is included in the array for for that row (stuff.accounts). make sense?
the stuff table is huge. the actual requirements are "infinite rows
spanning an infinite amount of disk space," but i'm working with my
manager to limit those requirements just a bit :-) zero or more
clients will be interested in a row from the 'stuff' table, which is
why i am using an array. the whole point is for indexing. i index
that array, i can get quickly get rows back based on accounts.id's.
the way i am thinking, the queries to find 'stuff' rows that interest
an account will be:
SELECT * FROM stuff WHERE (1 = ANY(account))
where accounts.id = 1 for that particular account.
now, there will be hundreds of this account. is that a good way to go?
here's what i was doing before.
stuff.accounts was a TEXT field that was comprised of accounts.name's
enclosed in %-signs. i don't know much, but i think for some magical
reason the %-signs helped the mysql regex engine (there are no %
signs in account.name) anchor its search. so for an account with an
accounts.name of 'blah', the query was:
SELECT * FROM stuff WHERE (accounts IS NOT NULL AND accounts REGEXP
'blah')
and stuff.accounts might have looked like: '%blah%some_other_name%
whatever%'.
if this makes any sense and i don't sound like a crazy person who
should not be administering a database, my question for you guys
is... is the (1 = ANY(account)) call efficient? I am dealing with
tens of millions of rows here in the stuff table, and I need to get
data out of them _fast_. a lot of the rows will have no value for the
stuff.accounts array field.
which way seems better, the one where i am doing regular expression
matching based on a text field, or the one in which i am essentially
"grepping" through an array for an ID number that may or may not be
there?
- - -Neal
p.s. obligatory rose::db question so i don't feel like a jerk:
given then accounts INTEGER[] way of doing this, will rose db do this:
@accounts = $stuff->accounts;
and convert the ARRAY[id,id,id] into a perl array?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)
iD8DBQFFrctsecZZ/XfYgxIRAjrBAJ9kd24r3cIL4pfg74HxiC6BJQpCUQCfQnKX
CJU78TeC1/gsUePNpyjU1ik=
=556M
-----END PGP SIGNATURE-----
-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Rose-db-object mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rose-db-object