Hi, I am working on IMMS [http://www.luminal.org/wiki/index.php/IMMS/IMMS], an adaptive playlist plugin for XMMS. It uses SQLite as the backend to store all the metadata.
The algorithms IMMS uses for song correlation are in essence graph algorithms. The graph currently is stored in the database as an adjacency matrix (3 columns: origin, destination, link weight). The most resource intensive algorithm has to do the following. Given a graph of form A / P--Q--B \ C And points P and Q, it tries to create links between P and the outer graph of Q. So in the example above it would create/update links P--A, P--B, and P--C. This would be easy to do, except the weight of the newly created links depends on the weight of the current link between Q and node in question, as well as the weight of the link from the node to P. So right now, I do something like (this is a simplified example): (Have to create new table here, otherwise the table will be locked, and I will not be able to run updates from the callback) CREATE TEMP TABLE 'Temp' AS SELECT * FROM 'Correlations' WHERE origin = 'Q' AND destination != 'P'; SELECT * FROM 'Temp'; And then the callback gets called it gets 3 parameters, lets call them ORIGIN, DESTINATION and WEIGHT, and I have to do something horrible like: SELECT count(weight) FROM 'Correlations' WHERE origin = ORIGIN AND destination = DESTINATION; and then if it returns 0, I insert the new value INSERT INTO 'Correlations' ('origin', 'destination', 'weight') "VALUES (ORIGIN, DESTINATION, WEIGHT / MAX_WEIGHT); (where MAX_WEIGHT is just some constant) and if the select query returned something other than 0, I add to the existing link UPDATE 'Correlations' SET weight = weight + WEIGHT / MAX_WEIGHT WHERE origin = ORIGIN AND destination = DESTINATION; Needless to say, this is very slow. I was wondering if anybody has any ideas on how I can optimize this. My knowledge of SQL is not too in depth, so I might be missing something. The main problem (I think) is the way the adjacency matrix is stored. Because there are two nodes, I cannot really use a UNIQUE constraint to be able to simply do something like INSERT OR REPLACE. And even if I say concatenated origin and destination (like make "2", "5" into "2|5" or something), and enforced uniqueness on that, I don't know how to make a single query that will add to the link or insert it if it does not exist.... So I do not see a way to compress this to a single update statement (which would be significantly more efficient because there would be no need call the callback, or even create the temp table), even if I was willing to sacrifice some functionality to do that. For example I could maybe live without using the weight of the links between Q and its external nodes, and just always update links between P and the nodes by a constant value.... But because of the problem above, I don't see a good way to do even that. Has anyone done any graph-algorithms-in-an-sql database type stuff? Any ideas for a better way to store the graph in SQLite to make the updates faster and/or the single query update possible? Any suggestions would be greatly appreciated. Thanks in advance. -- Have fun, I tried to no avail Michael "mag" Grigoriev To keep my eyes from growing pale [EMAIL PROTECTED] But my vision came too late http://www.luminal.org And my belief started to abate
pgp00000.pgp
Description: PGP signature