Hello Danilo, against GoogleBot trying all these fancy (linked) actions, I'd suggest you make use of robots.txt. We've hidden almost all actions from robots using it: http://www.curriki.org/robots.txt Of course, you can also write Apache rewrite rules… these are finer grained (they can even check the identity of the client).
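A minimal robots.txt along the lines Paul describes might look like this; the action paths are illustrative, not XWiki defaults, so adapt them to your own URL layout, and note the wildcard lines rely on Googlebot's extended pattern support:

```
User-agent: *
# Keep robots out of state-changing or expensive XWiki actions.
Disallow: /bin/edit/
Disallow: /bin/delete/
# Googlebot honours * wildcards, so actions reached via query
# parameters (e.g. /bin/view/Main/Tags?do=viewTag&tag=...) can be
# blocked as well:
Disallow: /*?do=
```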
On the Solr random queries, I am a bit surprised your scenario works… what random value would you take? Is the randomness only for the sorting, while you use the default query (*:*)? I guess that would work (it's not a random query then, it's a random ordering, something you don't want users to formulate intentionally, I think).

paul

On 26 August 2014, at 23:33, Danilo Oliveira <[email protected]> wrote:

> Hello Clemens,
>
> I have checked the xwikilists table and I noticed that the genre, country
> and language lists of the movies, defined in my movieClass, are recorded
> in this table. Do you think that is the cause of the slowness?
>
> But I discovered who is generating these queries: GoogleBot. See:
> 66.249.69.197 - - [26/Aug/2014:18:07:09 -0300] "GET
> /bin/view/Main/Tags?do=viewTag&tag=tang-breakfast-drink HTTP/
> There are GoogleBot requests trying to delete my tags too...
> Well, I am blocking them according to this doc [0].
>
> Rodrigues,
> I accessed the Neo4j site. This DB looks very interesting and I think it
> is applicable to my application. However, my app is in the
> proof-of-concept phase, so for now XWiki meets my needs. I will
> absolutely consider it if my application grows. Thanks for the tip!
>
> Well, I changed my queries to Solr and now my application is working
> perfectly, even better than at the beginning.
>
> But I have just one more need: a random query.
>
> I checked how to make a random query in Solr and I found this article [2].
>
> In the "Additional Configuration" section of the article, you can read
> that we need the two definitions below in the schema configuration.
> However, in XWiki's schema.xml we just have the first [1]:
>
> <fieldType name="random" class="solr.RandomSortField" indexed="true" />
> <dynamicField name="random_*" type="random" />
>
> I am no expert on Solr, but if I just add the second definition, will it
> work, or do I need to worry about other things?
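For reference, the setup from the article Danilo quotes would, as a sketch, look like this in schema.xml; the second line is the one missing from XWiki's schema, and whether the XWiki Solr core needs anything further is exactly the open question here:

```xml
<!-- Already present in XWiki's schema.xml: -->
<fieldType name="random" class="solr.RandomSortField" indexed="true" />
<!-- The missing dynamic field; any field named random_* gets that type: -->
<dynamicField name="random_*" type="random" />
```

A query then sorts on a dynamic field whose suffix acts as the seed, e.g. `q=*:*&sort=random_1398 asc&rows=10`; changing the suffix (say, per request) changes the ordering. This matches Paul's point: it is a random ordering of an ordinary query, not a random query.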
>
> [0] http://platform.xwiki.org/xwiki/bin/view/AdminGuide/Performances
> [1] https://github.com/xwiki-contrib/xwiki-platform-solr/blob/master/solr/conf/schema.xml
> [2] http://solr.pl/en/2013/04/02/random-documents-from-result-set-giveaway-results/
>
> Thanks everyone for the attention!
>
> Danilo
>
>
> 2014-08-26 4:33 GMT-03:00 Clemens Klein-Robbenhaar <[email protected]>:
>
>>
>> This query looks very much like it is generated by the tag service when
>> searching for documents with a given tag (the code is in the class
>> TagQueryUtils, method getDocumentsWithTag, in the
>> xwiki-platform-core/xwiki-platform-tag/xwiki-platform-tag-api module).
>>
>> This query might be triggered by any kind of UI element (panel, macro, etc.).
>> I do not think it is used to update any search index or the like.
>> Instead it is used on some pages, e.g. Main.Tags, when clicking on a tag
>> to see its list of documents.
>>
>> I wonder why this query takes so long. Even 100K docs should not be that
>> much (I mean, 5 minutes query time, huh?). Is there any chance some
>> binary data of the movie objects or the like ended up in the
>> xwikilistitems table or any other table used in the query?
>>
>> Clemens
>>
>>> Hello,
>>>
>>> As I mentioned, I discovered that the queries that are hogging my DB are
>>> similar to:
>>> '102', 'xwiki', 'localhost:52614', 'xwiki', 'Query', '372', 'Creating sort
>>> index', 'select xwikidocum0_.XWD_FULLNAME as col_0_0_ from xwikidoc
>>> xwikidocum0_ cross join xwikiobjects baseobject1_ cross join xwikilists
>>> dbstringli2_ inner join xwikiproperties dbstringli2_1_ on
>>> dbstringli2_.XWL_ID=dbstringli2_1_.XWP_ID and
>>> dbstringli2_.XWL_NAME=dbstringli2_1_.XWP_NAME inner join xwikilistitems
>>> list3_ on dbstringli2_.XWL_ID=list3_.XWL_ID and
>>> dbstringli2_.XWL_NAME=list3_.XWL_NAME where (xwikidocum0_.XWD_HIDDEN<>1 or
>>> xwikidocum0_.XWD_HIDDEN is null) and
>>> baseobject1_.XWO_CLASSNAME=\'XWiki.TagClass\' and
>>> baseobject1_.XWO_NAME=xwikidocum0_.XWD_FULLNAME and
>>> baseobject1_.XWO_ID=dbstringli2_.XWL_ID and dbstringli2_.XWL_NAME=\'tags\'
>>> and lower(list3_.XWL_VALUE)=lower(\'shock-rock\') order by
>>> xwikidocum0_.XWD_FULLNAME'
>>>
>>> Does anyone know which component is responsible for this query? Is this
>>> kind of query executed for each new tag, to create a sort index?
>>>
>>> Thanks
>>>
>>>
>>> 2014-08-23 3:46 GMT-03:00 O.J. Sousa Rodrigues <[email protected]>:
>>>
>>>> Wouldn't this be a perfect case for a NoSQL DB like Neo4j?
>>>> On 22.08.2014 at 23:13, "Paul Libbrecht" <[email protected]> wrote:
>>>>
>>>>> Danilo,
>>>>>
>>>>> have you checked the MySQL process list?
>>>>> I'd suspect something is hogging it.
>>>>> For search, I'd recommend leveraging Solr… but with a number of
>>>>> customizations. There are some hooks in the solr-plugin, I believe.
>>>>>
>>>>> hope it helps.
>>>>>
>>>>> paul
>>>>>
>>>>>
>>>>> On 22 August 2014, at 22:54, Danilo Oliveira <[email protected]> wrote:
>>>>>
>>>>>> Hello Devs,
>>>>>>
>>>>>> I am developing an application based on XWiki that is mapping,
>>>>>> connecting, relating and graphically arranging movie information in
>>>>>> order to make it possible for users to explore their trailers.
>>>>>>
>>>>>> In the beginning, with a light data set (<5k movies), the application
>>>>>> was running well, but today I started to populate my database (MySQL)
>>>>>> and the application became unusable; the queries are taking more than
>>>>>> 5 minutes to complete. Currently it has more than 15k movies
>>>>>> (1 movie = 1 doc) and I need to upload 100k more.
>>>>>>
>>>>>> I have already checked the cache and performance pages [1][2] but I
>>>>>> don't know if they solve my problem; I think this is an architecture
>>>>>> challenge.
>>>>>>
>>>>>> My AS IS process is:
>>>>>> - the user inserts a movie;
>>>>>> - the application searches for the movie and its related films based
>>>>>>   on its characteristics (a lot of joins and other algorithms)
>>>>>>   (bottleneck);
>>>>>> - the application returns the results as a map.
>>>>>>
>>>>>> I am wondering if I could use custom mapping [3] to solve my problem,
>>>>>> given that the relationship information for each movie, at least for
>>>>>> now, doesn't need to change often. Each movie has X related movies,
>>>>>> sorted by similarity. So I could create some relationship algorithm
>>>>>> that runs on a schedule (once a week) and populates this new table.
>>>>>> I am thinking of using Python pandas DataFrames to talk directly with
>>>>>> MySQL and do the data analysis; any other suggestion?
>>>>>>
>>>>>> So I would create a custom mapping for my movie relationship class,
>>>>>> run the algorithm, and populate the new table. My TO BE process would
>>>>>> be:
>>>>>>
>>>>>> TO BE
>>>>>> - the user inserts movie info;
>>>>>> - a simple select on the custom table "MoviesRelated";
>>>>>> - the application returns the results.
>>>>>>
>>>>>> I would appreciate some opinions. Thank you very much.
>>>>>>
>>>>>> [1] http://platform.xwiki.org/xwiki/bin/view/AdminGuide/Performances
>>>>>> [2] http://extensions.xwiki.org/xwiki/bin/view/Extension/Cache+Module
>>>>>> [3] http://platform.xwiki.org/xwiki/bin/view/DevGuide/CustomMapping
>>>>>>
>>>>>> Danilo
>>>>>> --
>>>>>> Danilo Amaral de Oliveira
>>>>>> Computer Engineer
>>>>>> mobile (32) 9111 - 6867
>>>>>> _______________________________________________
>>>>>> devs mailing list
>>>>>> [email protected]
>>>>>> http://lists.xwiki.org/mailman/listinfo/devs
>>
>> Kind regards,
>> Clemens Klein-Robbenhaar
>>
>> --
>> Clemens Klein-Robbenhaar
>> Software Development
>> EsPresto AG
>> Breite Str. 30-31
>> 10178 Berlin/Germany
>> Tel: +49.(0)30.90 226.763
>> Fax: +49.(0)30.90 226.760
>> [email protected]
>>
>> HRB 77554 B - Berlin-Charlottenburg
>> Management board: Maya Biersack, Peter Biersack
>> Chairman of the supervisory board: Dipl.-Wirtsch.-Ing.
Winfried Weber
>> Certified according to ISO 9001:2008
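As a footnote to Danilo's TO BE design: the weekly precomputation job he sketches (filling a "MoviesRelated" table offline) could start from something like the pandas sketch below. The table and column names and the shared-genre similarity measure are made up for illustration; the real relatedness algorithm and the write-back to MySQL (e.g. via `DataFrame.to_sql`) would replace them.

```python
# Offline precomputation sketch for a "MoviesRelated" table, run on a
# schedule rather than per request.  Column names (movie_id, genres,
# related_id, score) are hypothetical, not XWiki's schema.
import pandas as pd

def build_related(movies: pd.DataFrame, top_n: int = 3) -> pd.DataFrame:
    """Return (movie_id, related_id, score) rows, best matches first.

    Similarity here is just the number of shared genres -- a stand-in
    for whatever algorithm really computes movie relatedness.
    """
    rows = []
    genre_sets = {m.movie_id: set(m.genres.split("|"))
                  for m in movies.itertuples()}
    for mid, genres in genre_sets.items():
        scores = [(other, len(genres & other_genres))
                  for other, other_genres in genre_sets.items()
                  if other != mid]
        scores.sort(key=lambda t: t[1], reverse=True)
        for other, score in scores[:top_n]:
            rows.append({"movie_id": mid, "related_id": other, "score": score})
    return pd.DataFrame(rows)

movies = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "genres": ["horror|thriller", "horror|comedy", "comedy|romance"],
})
related = build_related(movies, top_n=1)
# The result could then be written back with related.to_sql(...), and the
# web request becomes the simple select Danilo describes.
```

The point of the design is exactly what the TO BE list says: all the expensive joins move into this batch step, and the interactive path only reads the precomputed table.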

