Hi Charley,

Your little MV query engine is very interesting indeed. How have you implemented it? In other words, what is the underlying data store?
1) Is it an RDBMS?
2) Is it a file, or multiple files, in the FS? If so, what is the format of those files? XML?

In case 1), do you use the LIKE operator to apply pattern-matching rules such as us.% (<=> us.*)? If not, how do you make it happen while minimizing joins over multiple tables (that is the problem with Tony's approach, IMO)?

In case 2), do you maintain indexes for fast string matching over megabytes (at least) of info? If so, how do you maintain them? If not, do you use B-trees or something similar?

>coverage.location = 'us.ma.springfield'
>coverage.country = 'us'
>coverage.state = 'ma'
>coverage.city = 'springfield'

Yes, I also thought about implementing some syntactic sugar over SQL queries. Instead of having:

(A) SELECT * FROM Company WHERE MVP_GEO_COUNTRY = 'us' AND MV_GEO_COUNTRYREGION = 'ma' AND MVP_GEO_CITY = 'springfield'

we could allow queries such as:

(B) SELECT * FROM Company WHERE GEO = 'us.ma.springfield'
    (not SQL at all)

or

(C) SELECT * FROM Company WHERE GEO = 'us; ma; springfield'
    (not SQL at all; Dublin Core Structured Values (DCSV) scheme)

or

(D) SELECT * FROM Company WHERE GEO = 'COUNTRY:us; COUNTRY.COUNTRYREGION:ma; COUNTRY.COUNTRYREGION.CITY:springfield;'
    (using the Dublin Core Structured Values (DCSV) scheme)

The best way to see the implication is to analyze the problem bottom up:

((D) <=> (C) <=> (B)) => (A)

In other words, (D), (C) and (B) would of course be converted to (A) by an SQL preprocessor before the query is sent to the RDBMS (more overhead). SELECTs requiring multiple un-linked tables could be mapped into a UNION (uff, really slow).

But for now we are not planning to support this syntactic sugar, mainly because users will define query schemas using a GUI, with a UI stemming from faceted classification that does more or less the same thing. Our goal is for the system to be configurable without any coding, and that includes query schemas (search interfaces).
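To make the (D)/(C)/(B) => (A) conversion a bit more concrete, here is a rough Python sketch of what such a preprocessor step might look like. It is only an illustration, not something we have built: the function names (parse_geo, expand_geo) are made up, the column names are copied from query (A), the country/region/city level order is assumed, and the labels in form (D) are simply dropped rather than validated. It also treats a trailing * (as in us.*) as "leave the deeper levels unconstrained", which is one possible reading of the us.% pattern mentioned above.

```python
# Column names copied from query (A); the level order
# country -> region -> city is an assumption of this sketch.
GEO_COLUMNS = ["MVP_GEO_COUNTRY", "MV_GEO_COUNTRYREGION", "MVP_GEO_CITY"]


def parse_geo(value):
    """Split a GEO value written in form (B), (C) or (D) into its levels.

    (B) 'us.ma.springfield'           -> dot separated
    (C) 'us; ma; springfield'         -> semicolon separated
    (D) 'COUNTRY:us; COUNTRY.COUNTRYREGION:ma; ...' -> labelled; the label
        before ':' is discarded and positional order is trusted.
    """
    sep = ";" if ";" in value else "."
    parts = []
    for chunk in value.split(sep):
        # Keep only what follows the last-resort label separator, if any.
        item = chunk.split(":", 1)[-1].strip()
        if item:
            parts.append(item)
    return parts


def expand_geo(value):
    """Return (where_clause, params) equivalent to form (A).

    Values are returned as bind parameters instead of being pasted into
    the SQL text. A '*' level stops the expansion early.
    """
    parts = parse_geo(value)
    if len(parts) > len(GEO_COLUMNS):
        raise ValueError("too many levels in GEO value: %r" % value)
    clauses, params = [], []
    for column, part in zip(GEO_COLUMNS, parts):
        if part == "*":
            break
        clauses.append(column + " = ?")
        params.append(part)
    return " AND ".join(clauses), params


if __name__ == "__main__":
    where, params = expand_geo("us.ma.springfield")
    print("SELECT * FROM Company WHERE " + where, params)
```

The clause and the parameter list would then be handed to whatever database driver is in use. Returning bind parameters rather than a finished SQL string is a choice made for the sketch, since it sidesteps the quoting problems visible in the hand-written queries.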
Best regards,
Nuno Lopes

PS: Multi values in DC are called Structured Values.

--
http://cms-list.org/
more signal, less noise.