Hi,

Starting with SMW 1.7 and MW 1.18, we began converting our legacy document system into an SMW/MW-based system. We now have more than 700.000 triplets stored in SMW, but at the same time the response time of SMW-related queries has become noticeably worse.
We first noticed the impact on query performance somewhere around 200.000 triplets (this is not meant as a hard threshold); by now we feel the pinch every time we execute a query. We are not talking about in-template query performance as seen in the Wikia/Familypedia example (we abandoned such practices some time ago). Nowadays we encourage users to run all complex queries either via Special:Ask or through an input form that runs a RunQuery, and yes, we are using APC to improve caching and response time in general.

We looked at external solutions: 4Store is not supported on Windows, Virtuoso has no real documentation on making it work with SMW (at least we couldn't find any), and Jena seems to require SMW+. That leaves us with the native SMW store, and we would like to keep it that way, since every piece of external software means an additional point of failure and additional maintenance effort.

== Architectural question ==

#1 Could there be an indexing problem with one of the primary SMW table key indexes?

#2 Does SMW natively support MySQL's internal query cache (the query-cache-type/query-cache-size options) to improve query performance? We made sure MySQL is running with query-cache-type/query-cache-size set, but somehow this doesn't show any effect for SMW-related queries.

#3 Would a different approach to handling query data, namely storing it in a temporary in-memory table, bring advantages over the current approach of accessing the SMW disk tables every time a query is executed? In other words, would an in-memory concept for queried data (SMW data is mirrored into a temporary in-memory table, for READ purposes only, for the lifetime of the current MySQL session, and the in-memory tables have to be rebuilt every time MySQL is restarted) improve query and access performance for SMW-related triplets?
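To put numbers behind question #2, one option is to snapshot MySQL's query-cache counters before and after a batch of Special:Ask runs and compare them. The following is only a minimal sketch in Python under a few assumptions: the counter names (Qcache_hits, Qcache_inserts, Qcache_not_cached, Com_select) are standard MySQL status variables, but the helper names and the sample numbers are made up for illustration, and fetching the snapshots from the server is left to whatever client is at hand.

```python
# Sketch: estimate the MySQL query-cache hit rate between two status snapshots.
# Snapshots are plain dicts built from the output of the two statements below.
# Function names and sample values are hypothetical, not part of SMW or MySQL.

STATUS_STATEMENTS = (
    "SHOW GLOBAL STATUS LIKE 'Qcache%'",
    "SHOW GLOBAL STATUS LIKE 'Com_select'",
)

COUNTERS = ("Qcache_hits", "Qcache_inserts", "Qcache_not_cached", "Com_select")


def counter_delta(before, after):
    """Return how much each counter grew between the two snapshots."""
    return {name: int(after[name]) - int(before[name]) for name in COUNTERS}


def hit_rate(delta):
    """Share of SELECTs answered from the cache.

    Com_select counts only SELECTs that were actually executed, so the
    total number of SELECTs is Qcache_hits + Com_select.
    """
    total = delta["Qcache_hits"] + delta["Com_select"]
    return delta["Qcache_hits"] / total if total else 0.0


if __name__ == "__main__":
    # Made-up sample values, standing in for real SHOW GLOBAL STATUS output.
    before = {"Qcache_hits": 1000, "Qcache_inserts": 800,
              "Qcache_not_cached": 4000, "Com_select": 5000}
    after = {"Qcache_hits": 1010, "Qcache_inserts": 805,
             "Qcache_not_cached": 4085, "Com_select": 5090}
    delta = counter_delta(before, after)
    print(delta)
    print("hit rate: %.2f" % hit_rate(delta))
```

If Qcache_not_cached grows almost as fast as Com_select, MySQL considers most of the statements uncacheable (statements that touch TEMPORARY tables are one such case). If instead Qcache_inserts grows while hits stay low, the cached entries are probably being invalidated by writes to the underlying tables before they can be reused.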
I guess (I don't know) that neither MyISAM nor InnoDB would make a difference, since the bottleneck seems to be the disk access needed to execute queries against the triplets stored in SMW-related tables. Of course there is always a way to improve performance by using better hardware (RAID, SSD to improve output performance), but this is a last-resort approach which we would like to avoid for the moment.

System: MediaWiki 1.18.0, PHP 5.3.8 (apache2handler), MySQL 5.5.16, APC version 3.1.6-dev

PS: Our increased use of triplets comes from an automatic indexing process for content and document transfer, which exchanges information with Sphinx Search and identifies the 30 most used words in a document; these are written back to the wiki and stored as semantic triplets on the related NS_IMAGE object.

Cheers,

mwjames

_______________________________________________
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel