On 23/02/2011 18:17, Yaron Koren wrote:
> Hi,
>
> I agree that it's great to hear from Wikia, and also great to know that
> Wikia is willing to put in some development time and effort to help with
> SMW. A few thoughts:
>
> - Wikia has already contributed somewhat to improving performance - I've
> been talking for a while to Tim Quievryn (who was at the Boston SMWCon
> last year), and his feedback helped lead to the faster handling of red
> links in Semantic Forms that was added in version 2.0.8.
>
> - Semantic Drilldown might actually be contributing to DB writes - it
> creates a temporary database table on every hit to Special:BrowseData.
> (I don't know if temporary tables get counted.)

Oh. This should be checked.

>
> - This might not be the right place to discuss the specifics of the "SMW
> light" initiative, but it's my personal belief that the best approach to
> it is to do the triple-store integration, [1] so that SMW can use an RDF
> triple-store directly to store its data, rather than trying to improve
> or limit SMW's queries. It would theoretically speed up queries, but,
> more importantly, even if it didn't, it would basically eliminate SMW's
> impact on the wiki's database. That's just my personal opinion, though -
> I'm not involved in either of those projects.

Yes, this is also what I had in mind. I have been planning the RDF store 
integration for some time now but have not yet managed to make the 
necessary adjustments. This effort is still related to the SMW Light 
initiative since the SMW Light SQL backend would be used (it is much 
simpler than the SMW backend which has to do all the queries). I think 
in this combination a significant performance gain will be possible, 
since one can also streamline the DB writing in this simple store. But 
you are right that SMW Light is really not about improving performance 
but about reducing code size (and quite a lot of the functionality, too).

I need to do quite some more integration work to make sure that all 
current features of SMW are fully supported via an RDF store backend. I 
have a good concept of what I need to do, but I did not find the time 
yet to actually do it. (But explaining it to someone else in sufficient 
detail would probably take just as long, and I would still need to 
review the code.)

Another interesting option with RDF store integration would be to use an 
RDF-based faceted browser instead of Semantic Drilldown (if it is 
actually found to cause problems on such large/high-traffic wikis).

- Markus


> 2011/2/23 Markus Krötzsch <mar...@semantic-mediawiki.org
> <mailto:mar...@semantic-mediawiki.org>>
>
>     [Making this into a new thread]
>
>     Hi Krzysztof,
>
>     I was already wondering when I would hear from Wikia ...
>
>     As you have noticed, running SMW and extensions on large sites (large in
>     terms of content, or in terms of users) has special requirements.
>     Typically, we suggest using more conservative settings for querying, so
>     that long and difficult queries do not occur. Similarly, some SMW
>     extensions have not been developed for large sites, and can be
>     problematic in their own right. But your users obviously want to keep
>     the features that they already have, so we need to find better ways of
>     addressing your problem.
>
>     But first we need to separate concerns a little bit. You mention the
>     following distinct problems:
>
>     (1) Too many DB writes (about 60% in total)
>
>     (2) Too many slow queries (about 90% from SMW)
>
>     Moreover, your problem is not caused by SMW alone but by a number of
>     SMW-related extensions. So there will be multiple issues that need
>     addressing to fix this, and maybe even in multiple extensions.
>
>     Let us first see how big the impact of the extensions you mention could
>     be. Semantic Forms mainly leads to some additional reads (apparently no
>     problem for you); the total number could possibly be reduced. It may
>     also have some effect on query activity if certain autocompletion
>     features are used. But otherwise I think it is unlikely to be the root
>     of the problem. Semantic Drilldown might be more of a problem regarding
>     complex queries. But it uses its own SQL queries, so it should be
>     possible to find out how much of (2) comes from this extension. Semantic
>     Drilldown should not contribute to (1).
>
>     Are there any other extensions that use SMW on your site?
>
>     Regarding SMW, I have some concrete ideas on what could be done for (1)
>     and (2) but this will need more careful consideration first. I would be
>     grateful if you could help to track down the cause of the problem, but I
>     am afraid that the changes in SMW core will still need to be done or at
>     least reviewed carefully by myself -- which makes me kind of a
>     bottleneck for the SMW part of your problem. I need to think about the
>     required work a little further before I can promise anything.
>
>     Regards,
>
>     Markus
>
>
>     On 22/02/2011 22:38, Krzysztof Krzyżaniak wrote:
>      > I think this would be the right place to jump in.
>      >
>      > Hello, my name is Krzysztof Krzyżaniak, a.k.a. eloy, and I work for
>      > Wikia Inc. as backend team leader. We are probably (correct me if I
>      > am wrong) one of the biggest users of the Semantic MediaWiki suite.
>      > We currently have it enabled on about 100 wikis, for example
>      > familypedia.wikia.com <http://familypedia.wikia.com>,
>      > yugioh.wikia.com <http://yugioh.wikia.com>, and www.wowwiki.com
>      > <http://www.wowwiki.com> (but also on wikis you probably would not
>      > suspect of an interest in SMW, such as glee.wikia.com
>      > <http://glee.wikia.com> or madmen.wikia.com
>      > <http://madmen.wikia.com>). We would like to expand the use of SMW
>      > on Wikia (for example, lyrics would love it), but currently we
>      > cannot afford to for performance reasons. For example, our first
>      > cluster contains about 30,000 wikis, mostly the biggest ones. About
>      > 60% of database writes come from SMW extensions (SemanticMediawiki,
>      > SemanticDrilldown, SemanticForms), and about 90% of the queries in
>      > the slow logs are from SMW.
>      >
>      > I am here to find a way to scale SMW on our wikis. But I also think
>      > that it will benefit every SMW user, because we want to help
>      > improve SMW.
>      >
>      > What you can expect:
>      > - "real world" cases, actually a lot of them :)
>      > - bugs :) (filed in Bugzilla of course)
>      > - bug fixes and patches (either as diff or direct svn commits if you
>      > prefer that way)
>      > - questions
>      >
>      > We can offer engineering hours and testbeds.
>      >
>      > For a start, I have a question about the roadmap: SMW Light - how
>      > complete is it? What's missing? When do you expect it to be ready?
>      > How can we help?
>      >
>      >     eloy
>      >
>
>     
> ------------------------------------------------------------------------------
>     Free Software Download: Index, Search & Analyze Logs and other IT
>     data in
>     Real-Time with Splunk. Collect, index and harness all the fast
>     moving IT data
>     generated by your applications, servers and devices whether
>     physical, virtual
>     or in the cloud. Deliver compliance at lower cost and gain new business
>     insights. http://p.sf.net/sfu/splunk-dev2dev
>     _______________________________________________
>     Semediawiki-devel mailing list
>     Semediawiki-devel@lists.sourceforge.net
>     <mailto:Semediawiki-devel@lists.sourceforge.net>
>     https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
>
>
>
>
> --
> WikiWorks · MediaWiki Consulting · http://wikiworks.com


