So I'm jumping back into database stuff after a long absence.
(I guess you could say that I was never really *in* databases
to begin with. Anyway...) I'm working with various databases
and have settled on MySQL but am encountering a variety of
problems that I hope the MySQL community can help me solve.
First, a comment. In dealing with a variety of databases, I
was surprised to find the apparent lack of direct support for
international characters. So I've basically resolved to insert
all of my international text as UTF-8 encoded BLOBs. Is this
the general approach people use for this problem?
Second, I'm programming in Java and want to use JDBC. So I
downloaded mm.MySQL but was dismayed to find that it doesn't
support prepared statements. Is there any particular reason
why?
I wanted to modify the source for mm.MySQL to convert to and
from UTF-8 streams when calling PreparedStatement#setString
and ResultSet#getString so that I can write my program as if
the underlying database *did* support international chars.
But without support for prepared statements, I don't know if
I want to go through the trouble.
Next, I was wondering why multiple selects don't work in
MySQL. For example: "SELECT * FROM a WHERE value IN (SELECT
column FROM b);".
Last, I have a particular problem related to SQL that I
want to solve. This is not particularly related to MySQL
but I am using MySQL as the database so any solution would
have to be within the limits of what MySQL supports.
Here's the idea: I have a table for text content that may
be translated into various languages. Then I have a table
for user preferences that specify which languages the user
wants (and in what order). Lastly I have a mapping table
from language codes to their default language code. Make
sense so far? If not, here is the description of the three
tables:
mysql> describe content;
+-------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+----------------+
| id | int(11) | | PRI | NULL | auto_increment |
| resid | int(11) | | | 0 | |
| lang | varchar(5) | | | | |
| text | blob | | | | |
+-------+------------+------+-----+---------+----------------+
mysql> describe prefs;
+----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| user | varchar(50) | | | | |
| lang | varchar(5) | | | | |
| priority | tinyint(4) | | | 1 | |
+----------+-------------+------+-----+---------+-------+
mysql> describe langs;
+---------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| code | char(5) | | PRI | | |
| defcode | char(5) | | | | |
+---------+---------+------+-----+---------+-------+
The last table, "langs", is used so that if a user's pref
is to read US English (en-us) and there is no content in
that language *but* there is one in Default English (en),
then they are returned that resource.
Here's a sampling of the data in these three tables:
mysql> select * from content;
+----+-------+-------+----------+
| id | resid | lang | text |
+----+-------+-------+----------+
| 1 | 1 | en-us | Color |
| 2 | 1 | en-gb | Colour |
| 3 | 1 | ja | IRO |
| 4 | 2 | en | Mountain |
| 5 | 2 | ja | YAMA |
+----+-------+-------+----------+
mysql> select * from prefs;
+-------+-------+----------+
| user | lang | priority |
+-------+-------+----------+
| andyc | en-us | 1 |
| andyc | ja | 2 |
+-------+-------+----------+
mysql> select * from langs;
+-------+---------+
| code | defcode |
+-------+---------+
| en | en-us |
| en-us | en-us |
| en-gb | en-us |
| ja | ja |
+-------+---------+
As can be seen from the "prefs" table, I (being "andyc")
prefer to retrieve content in US English but can also read
Japanese. Now if I only wanted the content that matched my
number one priority, then it would be easy. However... I
want to retrieve the content from my number one preference
*and* also know that the resource is available in my other
language preference(s).
The following select statement returns the information that
I want but there are a few problems. First, it returns all
of the information which is wasteful since I only want to
display the content from the first hit. Second, I'm sure
that crossing three tables is *not* a good idea in general.
But like I said, I'm kinda new to SQL so I don't know too
much about JOINs, etc. Anyway, here's the query and the
result:
mysql> select c.resid,c.lang,c.text,p.priority
-> from content as c,prefs as p,langs as l
-> where p.user='andyc' and
-> (c.lang=p.lang or (c.lang=l.code and p.lang=l.defcode))
-> group by c.resid,p.lang
-> order by resid,p.priority;
+-------+-------+----------+----------+
| resid | lang | text | priority |
+-------+-------+----------+----------+
| 1 | en-us | Color | 1 |
| 1 | ja | IRO | 2 |
| 2 | en | Mountain | 1 |
| 2 | ja | YAMA | 2 |
+-------+-------+----------+----------+
The first thing I'd like to know is what is the "right" way
of setting up my tables and then performing this kind of
query? The next problem is more a matter of programming...
Since returning all of the content seems pretty wasteful,
I'm thinking of using two select statements: one to return
the list of matches with priority 1 and another to return
the entire list of matched resources. But if I do this then
I don't know how many langauges match each resource. It
would be a lot easier if the original query could return
the number of matched languages as well.
I was thinking that the count() method could work but I
don't think I'm using it right. If I try to add a column
with "count(*)" or "count(c.resid)" or "count(p.lang)" I
get the following result, for example:
mysql> select c.resid,c.lang,c.text,p.priority,count(*)
-> from content as c,prefs as p,langs as l
-> where p.user='andyc' and p.priority=1 and
-> (c.lang=p.lang or (c.lang=l.code and p.lang=l.defcode))
-> group by c.resid,p.lang
-> order by resid,p.priority;
+-------+-------+----------+----------+----------+
| resid | lang | text | priority | count(*) |
+-------+-------+----------+----------+----------+
| 1 | en-us | Color | 1 | 8 |
| 2 | en | Mountain | 1 | 1 |
+-------+-------+----------+----------+----------+
I would like to have the count of the matched languages
per resid to be in the last column. Is this possible
with MySQL? with SQL in general? Do I need to restructure
my tables to be able to do this?
I'm sorry for the long-winded mail but I figured it was
best to try to solve as many problems as possible with
the fewest posts. :)
--
Andy Clark * [EMAIL PROTECTED]
---------------------------------------------------------------------
Before posting, please check:
http://www.mysql.com/manual.php (the manual)
http://lists.mysql.com/ (the list archive)
To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php