On 14.01.2012 18:33, James Hong Kong wrote:
> Hi,
>
> Starting with SMW 1.7 and MW 1.18, we began converting our legacy
> document system into an SMW/MW-based system. This has left us with
> more than 700.00 triplets stored in SMW, but it has also degraded
> our response time on SMW-related queries.
>
> Somewhere around 200.000 triplets (which does not mean that number is
> a threshold) we noticed a growing impact on query performance, and
> now we feel the pinch every time we execute a query. We are not
> talking about in-template query performance as seen in the
> Wikia/Familypedia example (we abandoned such practices some time ago).
> Nowadays we encourage users to execute all complex queries either via
> Special:Ask or via an input form that runs a RunQuery, and yes, we are
> using APC to improve caching and response time in general.
>
> We looked at external solutions: 4Store is not supported on Windows,
> Virtuoso has no real documentation on making it work with SMW (at
> least we couldn't find any), and Jena seems to require SMW+. That
> leaves us with the native SMW store, and we would like to keep it
> that way, as every piece of external software means an additional
> point of failure and more maintenance effort.
>
> == Architectural question ==
>
> #1 Could there be an indexing problem with one of the primary
> SMW table key indexes?
>
> #2 Does SMW natively support MySQL's internal
> query-cache-type/query-cache-size options to improve query performance?
> We made sure MySQL is using the query-cache-type/query-cache-size
> options, but somehow this doesn't show any effect on SMW-related queries.
>
> #3 Would a different approach to handling query data, namely storing
> it in a temporary in-memory table, bring advantages over the current
> approach of accessing SMW disk tables every time a query is executed?
> Would an in-memory concept for queried data (SMW data is mirrored
> into a temporary in-memory table for READ purposes only, for the
> lifetime of the MySQL session, and the in-memory tables have to be
> rebuilt every time MySQL is restarted) improve query and access
> performance for SMW-related triplets? I guess (I don't know) that
> neither MyISAM nor InnoDB would have an impact, since the bottleneck
> seems to be the disk access needed to execute queries on triplets
> stored in SMW-related tables.
>
> Of course there is always a way to improve performance by using better
> hardware (RAID, SSD to improve output performance), but this is a
> last-resort approach which we would like to avoid for the moment.
>
> System:
> MediaWiki 1.18.0, PHP 5.3.8 (apache2handler), MySQL 5.5.16,
> APC version 3.1.6-dev
>
> PS: Our increased use of triplets comes from an automatic indexing
> process for content and document transfer. It exchanges information
> with Sphinx Search and identifies the 30 most used words in a
> document, which are written back to the wiki and stored as semantic
> triplets on the related NS_IMAGE object.
>
> Cheers,
>
> mwjames
>
>    
I think the proper answer would be to move to Linux and use 4store,
although some people recently complained that queries on internal
objects do not work correctly with 4store.
BTW, if you have gigabit LAN or faster (fiber), you may try setting up
4store on a different host in your LAN while keeping SMW on Windows.
Dmitriy
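
Regarding #2 in the quoted message: one way to check whether the MySQL
query cache does anything for SMW-related queries is to take the
Qcache_hits and Com_select counters from SHOW GLOBAL STATUS before and
after running the same Special:Ask query a few times. Com_select only
counts SELECTs that were actually executed, so hits / (hits + Com_select)
gives the share answered from the cache. Below is a minimal sketch in
Python, assuming the status rows have already been fetched into dicts;
the function names and the sample numbers are made up for illustration
and are not part of SMW or MySQL.

```python
def query_cache_hit_ratio(status):
    """Return the fraction of SELECTs served from the query cache.

    `status` maps MySQL status variable names to their values, e.g. the
    rows of SHOW GLOBAL STATUS turned into a dict. Com_select is not
    incremented for cache hits, so the total number of SELECTs is
    Qcache_hits + Com_select.
    """
    hits = int(status.get("Qcache_hits", 0))
    executed = int(status.get("Com_select", 0))
    total = hits + executed
    return hits / total if total else 0.0


def ratio_for_interval(before, after):
    """Hit ratio for the queries issued between two status snapshots."""
    delta = {
        name: int(after.get(name, 0)) - int(before.get(name, 0))
        for name in ("Qcache_hits", "Com_select")
    }
    return query_cache_hit_ratio(delta)


if __name__ == "__main__":
    # Hypothetical snapshots, not measurements from a real wiki.
    before = {"Qcache_hits": "100", "Com_select": "900"}
    after = {"Qcache_hits": "130", "Com_select": "970"}
    print(ratio_for_interval(before, after))
```

If the ratio stays near zero while the same query is repeated, the
statements are probably not being cached at all, or the cached results
are being thrown away: MySQL drops query cache entries for a table
whenever that table is modified.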

_______________________________________________
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
