Re: [SMW-devel] [PATCH] Support LIKE in queries
* Markus Krötzsch [EMAIL PROTECTED] [2008-01-02 08:37]:
> On Sunday, 30 December 2007, Thomas Bleher wrote:
>> * Markus Krötzsch [EMAIL PROTECTED] [2007-12-30 22:10]:
>>> OK, my conclusion now was to support the following syntax:
>>> [[property% *subs?r*]]
>>> where ? and * represent _ and % in SQL.
>>
>> I think this is fine generally, but now you cannot query for a literal * or ? anymore, AFAIK.
>
> I would not consider this to be a major issue, given that those characters are not too common in typical application strings, and given the fact that using ? still queries for some symbol in that place -- it seems to be very unlikely that two strings differ only in one position where the query string has a ?. So in most cases it will have the same hits anyway (yes, there are some cases that could be problematic [1] ;).

Agreed.

> Anyway, I will leave this issue at rest until any user actually complains about this limitation.

Here I have to respectfully disagree. It seems unwise to wait until someone complains, when there is already a patch resolving the issue. Why spend more time later on when the issue can just be fixed right now? OK, the regexes were not very readable, but the patch doesn't really make the code more complicated.

FWIW, the regexes were so ugly only because backslashes have to be escaped twice for PHP's preg_replace (so a single \ becomes \\\\). If we used ! as an escape character instead of \, the regexes would look like this (untested):

    $value = str_replace(array('%', '_'), array('!%', '!_'), $value);
    $value = preg_replace('/(?<!!)((?:!!)*)\*/', '$1%', $value); // if there's an even number of !, change * to %
    $value = preg_replace('/(?<!!)((?:!!)*)\?/', '$1_', $value); // ditto for ? and _
    $value = preg_replace('/(?<!!)((?:!!)*)!\*/', '$1*', $value); // if there's an odd number, * was escaped and should stay as is; but the last ! is removed
    $value = preg_replace('/(?<!!)((?:!!)*)!\?/', '$1?', $value); // ditto for ?

(?: ) is a subexpression for grouping, not capturing; (?<! ) is a zero-width negative look-behind (i.e. we make sure that the character before our match is not !).

Regards,
Thomas

[1] http://de.wikipedia.org/wiki/Die_drei_%3F%3F%3F

>> Not a huge deal, but before, a_b searched for a, followed by any char, followed by b, while a\_b searched for exactly a_b. Properly escaping everything gets messy rather quickly, as \ can also be escaped to query for a literal \, so you need translations like:
>>
>>     ?    = _
>>     \?   = ?
>>     \\?  = \\_
>>     \\\? = \\?
>>
>> The following regular expressions work fine for me, but unfortunately they are quite ugly:
>>
>>     $value = str_replace(array('%', '_'), array('\%', '\_'), $value); // escape % and _
>>     $value = preg_replace('/(?<!\\\\)((?:\\\\\\\\)*)\*/', '$1%', $value); // if there's an even number of \, change * to %
>>     $value = preg_replace('/(?<!\\\\)((?:\\\\\\\\)*)\?/', '$1_', $value); // ditto for ? and _
>>     $value = preg_replace('/(?<!\\\\)((?:\\\\\\\\)*)\\\\\*/', '$1*', $value); // if there's an odd number, * was escaped and should stay as is; but the last \ is removed
>>     $value = preg_replace('/(?<!\\\\)((?:\\\\\\\\)*)\\\\\?/', '$1?', $value); // ditto for ?
>>
>> I think these should be added to SMW, so all characters can be queried.
>>
>> Regards,
>> Thomas
>
> --
> Markus Krötzsch
> Institut AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe
> phone +49 (0)721 608 7362    fax +49 (0)721 608 5998
> [EMAIL PROTECTED]    www http://korrekt.org

___
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
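The escaping rules discussed in this message can also be written as one left-to-right pass instead of a chain of regexes. What follows is only a sketch in Python, not the PHP from the patch. The function name `wildcard_to_like` is hypothetical, and the choice of `!` as escape character assumes the SQL side declares it with `ESCAPE '!'`.

```python
def wildcard_to_like(pattern: str, escape: str = "!") -> str:
    """Translate a user pattern (* and ? wildcards) into a SQL LIKE pattern.

    Assumption: the LIKE clause is issued with ESCAPE '!' so that
    '!%', '!_' and '!!' match a literal %, _ and ! respectively.
    """
    like_special = ("%", "_", escape)
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # Escaped character: keep it literally, re-escaping it only
            # if it would otherwise be special inside LIKE.
            nxt = pattern[i + 1]
            out.append(escape + nxt if nxt in like_special else nxt)
            i += 2
            continue
        if ch == "*":
            out.append("%")          # any run of characters
        elif ch == "?":
            out.append("_")          # exactly one character
        elif ch in like_special:
            out.append(escape + ch)  # literal %, _ or a trailing lone !
        else:
            out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    for sample in ("*subs?r*", "a!?b", "a!!?b", "100%"):
        print(sample, "->", wildcard_to_like(sample))
```

A single pass sidesteps the question of whether an even or odd number of escape characters precedes a wildcard, because each escape consumes exactly the character after it.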
Re: [SMW-devel] Minor issues with inline errors
* Markus Krötzsch [EMAIL PROTECTED] [2007-12-12 21:07]:
> On Sunday, 2 December 2007, Thomas Bleher wrote:
>>     foreach($properties as $singleprop) {
>>         $dv = SMWFactbox::addProperty($singleprop,$value,$valueCaption);
>>     }
>>
>> $dv is overwritten here on each iteration of the loop. This looks fishy.
>
> Yes, but normally there is only one iteration anyway. What would you suggest instead?

Hmm, should nested properties be allowed here? FWIW, the regexp is

    $semanticLinkPattern = '/\[\[    # Beginning of the link
        (([^:][^]]*):[=:])+          # Property name (can be nested?)
        (                            # After that:
          (?:[^|\[\]]                # either normal text (without |, [ or ])
          |\[\[[^]]*\]\]             # or a [[link]]
          |\[[^]]*\]                 # or an [external link]
        )*)                          # all this zero or more times
        (\|([^]]*))?                 # Display text (like text in [[link|text]]), optional
        \]\]                         # End of link
        /x';

(I took the liberty of modifying it to make it more readable)

If nested properties should not be supported, all is fine, as $property is just ([^:][^]]*), i.e. without the trailing :: or :=. Then the preg_split and the for loop can be removed (OK, maybe the regexp could be made more strict, but that's another issue).

If nested properties should be supported, this code is buggy, but I do not know what the correct semantics would be anyway.

Regards,
Thomas
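To see what the pattern captures for a nested annotation, here is a small sketch that ports the regexp above to Python's `re.VERBOSE` syntax. It is an illustration only: the helper name `parse_annotation` is hypothetical, and the behaviour of SMW's actual PHP code may differ in details.

```python
import re

# Port of $semanticLinkPattern; group 2 is the property part, group 3 the
# value and group 5 the optional display text.
SEMANTIC_LINK = re.compile(r"""
    \[\[                    # beginning of the link
    (([^:][^]]*):[=:])+     # property name followed by :: or :=
    (                       # value:
      (?:[^|\[\]]           #   normal text (no pipe or square brackets)
        |\[\[[^]]*\]\]      #   or an internal wiki link
        |\[[^]]*\]          #   or an external link
      )*)                   # repeated zero or more times
    (\|([^]]*))?            # optional display text after a pipe
    \]\]                    # end of the link
""", re.VERBOSE)


def parse_annotation(text):
    """Return (property names, value, caption) for the first annotation, or None."""
    match = SEMANTIC_LINK.search(text)
    if match is None:
        return None
    # The property group is greedy, so a nested annotation such as
    # [[a::b::c]] arrives here as 'a::b' and still has to be split.
    properties = re.split(r":[=:]", match.group(2))
    return properties, match.group(3), match.group(5)


if __name__ == "__main__":
    print(parse_annotation("[[has capital::Berlin|the capital]]"))
    print(parse_annotation("[[a::b::c]]"))
```

In this sketch the nested case yields two property names for one value, which is the situation where the loop in the quoted code would overwrite `$dv`.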
Re: [SMW-devel] ask query format=template
* cnit [EMAIL PROTECTED] [2007-12-03 16:24]:
> May I suggest another kind of ask format? Sometimes it's desirable to get only a simple count of rows instead of the query result rows. So, if there are 5 rows in the query result, ask format=count would return the number 5. It may be useful for statistics, for further computations in templates and so on...

format=count already exists, since at least SMW 0.7.

> I really think that SMW requires better documentation on the new query formats, #ask and subqueries. Because the only documentation I've used is outdated:
> http://meta.wikimedia.org/wiki/Help:Substitution
> e.g. no new features, no subqueries here and so on..

??? This page mentions SMW, but it's hardly related to it. Have you looked at ontoworld.org or http://semantic-mediawiki.org/?

I agree that the wiki pages there need improvement, and hopefully that will happen once 1.0 is released (personally, I'm currently refraining from doing any work on the site, because many things have changed between 0.7 and 1.0 and it's not so easy to separate them; I hope 1.0 will be released soon, so the old information about 0.7 can be replaced).

But such documentation doesn't write itself. Maybe you can rework the wiki pages on semantic-mediawiki.org with the things you have already learned about SMW 1.0. That would surely be appreciated.

Regards,
Thomas
[SMW-devel] Minor issues with inline errors
Hi!

Stumbled across another issue today: If an annotation is incorrect, an error text is added, even if $smwgNamespacesWithSemanticLinks is set to false for this particular namespace.

Example: http://www.ppoe.at/leiter/wiwo/wiki/index.php/MediaWiki:Neuemethode-grundgeruest
(This page is read by a special page where all the funny values are replaced by real data)

I looked into the code (includes/SMW_Hooks.php) but found no really nice solution - basically, the code in smwfParserHook

    $text = preg_replace_callback($semanticLinkPattern, 'smwfParsePropertiesCallback', $text);

should pass either $parser or the value of smwfIsSemanticsProcessed($parser->getTitle()->getNamespace()) to smwfParsePropertiesCallback() so it can remove the error message if the namespace should have no semantic links. But as PHP supports neither callbacks with additional arguments nor closures, we are a bit stuck. Using $wgParser in the callback function is possible, though potentially buggy, as smwfParserHook may be called with another parser than the global one. (I have an extension that does this :-/ )

Suggestions welcome.

Two other questions:

    //extract annotations and create tooltip
    $properties = preg_split('/:[=|:]/', $property);

This also matches :|, because | is not special in character classes. I think what you want is '/:[=:]/'.

    foreach($properties as $singleprop) {
        $dv = SMWFactbox::addProperty($singleprop,$value,$valueCaption);
    }

$dv is overwritten here on each iteration of the loop. This looks fishy.

Regards,
Thomas
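The character-class point is easy to check in isolation. The sketch below uses Python's `re` module rather than PHP's preg functions, but character classes treat `|` the same way in both. The helper name `split_properties` is hypothetical.

```python
import re

# Inside [...] the pipe is an ordinary character, not an alternation,
# so the first pattern also accepts ':|' as a separator.
LOOSE = re.compile(r":[=|:]")
STRICT = re.compile(r":[=:]")


def split_properties(property_text, pattern=STRICT):
    """Split a property chain such as 'a::b' on the :: and := separators."""
    return pattern.split(property_text)


if __name__ == "__main__":
    print(split_properties("a:|b", LOOSE))    # splits on ':|'
    print(split_properties("a:|b", STRICT))   # leaves the text alone
    print(split_properties("a::b:=c"))
```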
[SMW-devel] Another small issue: decimal separator is locale dependent
I just discovered another small issue: I tried to adapt http://ontoworld.org/wiki/Type:Time to my wiki (see http://spiele.j-crew.de/wiki/Datentyp:Zeitdauer). First I copied the page verbatim, which led to various strange errors (division by zero, numbers sometimes being shown in a strange way, ...) until I found out that the numbers are parsed based on locale, so I had to replace every . with a ,.

Now I'm not sure if this is a bug, as locale awareness is generally a good thing, but it should be noted in the documentation, and maybe SMW can also flag the error (unrecognized number format or something like that).

Regards,
Thomas
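The kind of check suggested here could look roughly like the sketch below: a parser that accepts only the configured decimal separator and reports anything else instead of silently misreading it. This is a Python illustration under assumptions, not SMW's number parsing; `parse_number` and its `decimal_sep` parameter are hypothetical names.

```python
import re


def parse_number(text, decimal_sep=","):
    """Parse a decimal number written with the given separator.

    Raises ValueError for any other format, so that a value copied from a
    wiki with a different locale is flagged instead of being misparsed.
    """
    cleaned = text.strip()
    pattern = r"[+-]?\d+(?:" + re.escape(decimal_sep) + r"\d+)?"
    if re.fullmatch(pattern, cleaned) is None:
        raise ValueError(f"unrecognized number format: {text!r}")
    return float(cleaned.replace(decimal_sep, "."))


if __name__ == "__main__":
    print(parse_number("1,5"))           # German-style input
    print(parse_number("1.5", "."))      # English-style input
    try:
        parse_number("1.5")              # wrong separator for this locale
    except ValueError as err:
        print("error:", err)
```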
Re: [SMW-devel] [patch] 2nd try: support for negated queries
* Markus Krötzsch [EMAIL PROTECTED] [2007-11-23 16:14]:
> thanks for the work (and good to hear the SpieleWiki runs on the new version)! We agree and also would prefer the full negation patch. Our current roadmap is to finish 1.0 first, which mainly hinges upon the inclusion (or reimplementation) of the {{#ask}}-patch and some further fixes (service links, Type:Boolean).

I think that's okay. I can live with the patch being out-of-tree for a while (I had patched SMW 0.7 more heavily than I have 1.0RC2 now, so things are improving for me :)

> After that, we will go for new releases and new features. I would expect that your patch remains compatible for that time -- major changes of querying as in 0.7 -> 1.0 are not scheduled so soon. What might be revisited, however, is the concrete syntax used for expressing negations.

OK. Just one note, as I'm not sure it was clear from the mail I sent: The present patch allows <n> at every place where <q> is allowed. It acts the same (starts a subquery) except that it negates everything.

> If I can manage to review your patch before 1.0, then I will certainly add it right away -- the fact that you successfully worked through our Storage API already impresses me quite a lot :-)

Thanks :)

> And, given that you are one of the few people who ever actually read that code: we are also happy about any suggestions towards improving the storage performance.
>
> Btw another possibility to work around your problem for now could be to filter for the respective property values *on printout*. This is inefficient but OK as long as the result set is not too large. To do this, one would use a templated printout (format=template template=...) and include an {{#if }} into the template that checks the conditions to decide whether or not to create any output.
>
> Note that one can also use format=template for printing tables by setting intro={| and using a template that defines table rows in wiki pipe syntax. It is also possible to use a template call for intro, so as to get styling parameters set. Of course it's as ugly as any advanced formatting in MediaWiki ...

Ah :) I had used this, but with an extra outro parameter to close the table again. Didn't realize that MediaWiki's parser allows the closing tag to be omitted. One patch less :)

The only addition I need there is the following patch:

--- a/SMW_QP_Template.php
+++ b/SMW_QP_Template.php
@@ -25,6 +25,7 @@ class SMWTemplateResultPrinter extends SMWResultPrinter {
     }
     public function getHTML($res) {
+        global $wgParser;
         // handle factbox
         global $smwgStoreActive, $wgTitle;
@@ -41,6 +42,8 @@ class SMWTemplateResultPrinter extends SMWResultPrinter {
         $parser_options = new ParserOptions();
         $parser_options->setEditSection(false); // embedded sections should not have edit links
         $parser = new Parser();
+        $parser->mFunctionHooks = $wgParser->mFunctionHooks;
+        $parser->mFunctionSynonyms = $wgParser->mFunctionSynonyms;
         while ( $row = $res->getNext() ) {
             $wikitext = '';
             $firstcol = true;

It copies the parser functions from the main parser, so they can be used in the templates given to ask. Maybe this can be added to SMW.

Regards,
Thomas

> On Tuesday, 20 November 2007, Thomas Bleher wrote:
>> * Thomas Bleher [EMAIL PROTECTED] [2007-11-15 20:36]:
>>> * Thomas Bleher [EMAIL PROTECTED] [2007-11-14 08:48]:
>>>> 1) Implement negations in queries - then I could ask (NOT maxTeilnehmer <= 9) AND (NOT minTeilnehmer >= 11). I used this solution with SMW 0.7 (patch below), but looking at the 1.0RC2 code, I'm not quite sure how to implement it cleanly.
>>>
>>> I looked at the code again and tried to implement negation support. The patch below adds a <n> </n> pair which negates the query in between. It also adds a SMWNegation class (derived from SMWDescription), which encapsulates the negation.
>>
>> OK, I took another stab at this, and in my limited tests the code works correctly now :)
>> The code is online at http://pwiki.j-crew.de/wiki/Test, feel free to play around with it (it's just a test wiki).
>>
>> Technical details: To work around the problems with INNER JOINs and negations, all INNER JOINs of the form
>>     SELECT a INNER JOIN b ON a.c=b.d
>> are replaced by
>>     SELECT a,b WHERE a.c=b.d
>> This may not be the most elegant solution, but it is the only workable one I found. To make this work, addJoin now has an additional $where parameter.
>>
>> Beware: I still don't understand all the internals of query processing, so no guarantees for this code. I would really appreciate it if someone more knowledgeable than me would look over the code.
>>
>> Regards,
>> Thomas
>
> --
> Markus Krötzsch
> Institut AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe
> phone +49 (0)721 608 7362    fax +49 (0)721 608 5998
> [EMAIL
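The two SQL ideas in this thread, rewriting INNER JOIN ... ON as a comma join with WHERE and expressing negation as a correlated NOT EXISTS subquery, can be tried out on a toy database. The sketch below uses Python's sqlite3 module. The `page` and `attr` tables and the game titles are invented for illustration and do not reflect SMW's real schema.

```python
import sqlite3


def demo_connection():
    """In-memory database with a made-up schema (not SMW's tables)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE attr (page_id INTEGER, name TEXT, value INTEGER);
        INSERT INTO page VALUES (1, 'Quiz'), (2, 'Duel'), (3, 'Parade'), (4, 'Tag');
        INSERT INTO attr VALUES
            (1, 'minTeilnehmer', 4), (1, 'maxTeilnehmer', 12),
            (2, 'maxTeilnehmer', 2),
            (3, 'minTeilnehmer', 20);
    """)
    return conn


def titles(conn, sql, params=()):
    return [row[0] for row in conn.execute(sql, params)]


# The same join written both ways; the comma form keeps the join condition
# in WHERE, where it can sit next to other conditions of a subquery.
JOIN_SQL = """SELECT p.title FROM page p INNER JOIN attr a ON a.page_id = p.id
              WHERE a.name = 'maxTeilnehmer' ORDER BY p.title"""
COMMA_SQL = """SELECT p.title FROM page p, attr a
               WHERE a.page_id = p.id AND a.name = 'maxTeilnehmer' ORDER BY p.title"""

# Negation: keep a page unless a limit exists that rules out :n players.
# Pages without the attribute pass automatically.
PLAYABLE_SQL = """
    SELECT p.title FROM page p
    WHERE NOT EXISTS (SELECT 1 FROM attr a WHERE a.page_id = p.id
                      AND a.name = 'maxTeilnehmer' AND a.value < :n)
      AND NOT EXISTS (SELECT 1 FROM attr a WHERE a.page_id = p.id
                      AND a.name = 'minTeilnehmer' AND a.value > :n)
    ORDER BY p.title
"""


def playable_games(conn, players):
    return titles(conn, PLAYABLE_SQL, {"n": players})


if __name__ == "__main__":
    conn = demo_connection()
    print(titles(conn, JOIN_SQL), titles(conn, COMMA_SQL))
    print(playable_games(conn, 10))
```

Whether NOT EXISTS performs acceptably on large tables depends on the database and its indexes, so the sketch says nothing about the overhead question raised in the thread.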
[SMW-devel] Small Bug: Page saves fail when SMW is loaded but tables are missing
When SMW is loaded via require_once() and enableSemantics(), but the necessary database tables are not there, saving a wiki page can fail.

This hit me today: I had installed SMW for a client I work for, but Special:SMWAdmin did not have the necessary rights to create the tables. I left SMW there so the DB admins could add the rights, call the special page, and immediately afterwards revoke the DB rights again. Today I got the mail that page saves failed.

I know that this is a small issue, and maybe it's not worth fixing, but I think it would be nice if saving did not fail completely when the necessary tables are missing, but continued with a warning instead. After all, the information can be regenerated later.

Regards,
Thomas
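The behaviour proposed here, saving the page and only warning when the semantic tables are missing, might be sketched as follows. This is a Python/sqlite3 illustration under assumptions, not SMW code: the function `save_page` and the table names `page` and `smw_sketch_attributes` are hypothetical.

```python
import logging
import sqlite3

log = logging.getLogger("smw_sketch")


def save_page(conn, title, text, annotations):
    """Store the page text, then try to store its annotations.

    Returns True if the annotations were stored, False if the semantic
    table was unavailable; the page text is saved in both cases.
    """
    conn.execute(
        "INSERT OR REPLACE INTO page (title, text) VALUES (?, ?)", (title, text)
    )
    try:
        conn.executemany(
            "INSERT INTO smw_sketch_attributes (title, name, value) VALUES (?, ?, ?)",
            [(title, name, value) for name, value in annotations],
        )
        stored = True
    except sqlite3.OperationalError as err:
        # Raised for a missing table (and for other operational problems);
        # the data can be regenerated once the tables exist.
        log.warning("semantic data for %r not stored: %s", title, err)
        stored = False
    conn.commit()
    return stored


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page (title TEXT PRIMARY KEY, text TEXT)")
    print(save_page(conn, "Quiz", "some wikitext", [("maxTeilnehmer", "12")]))
```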
[SMW-devel] [broken patch] First stab at support for negated queries
* Thomas Bleher [EMAIL PROTECTED] [2007-11-14 08:48]:
> 1) Implement negations in queries - then I could ask (NOT maxTeilnehmer <= 9) AND (NOT minTeilnehmer >= 11). I used this solution with SMW 0.7 (patch below), but looking at the 1.0RC2 code, I'm not quite sure how to implement it cleanly.

I looked at the code again and tried to implement negation support. The patch below adds a <n> </n> pair which negates the query in between. It also adds a SMWNegation class (derived from SMWDescription), which encapsulates the negation.

The patch is

--- a/extensions/SemanticMediaWiki/includes/SMW_QueryProcessor.php
+++ b/extensions/SemanticMediaWiki/includes/SMW_QueryProcessor.php
@@ -296,6 +296,14 @@ class SMWQueryParser {
                     $conjunction = $this->addDescription($conjunction, $this->getSubqueryDescription($setsubNS, $label));
                     /// TODO: print requests from subqueries currently are ignored, should be moved down
                 break;
+                case '<n>': // negated subquery
+                    $this->pushDelimiter('</n>');
+                    $setsubsubNS = false;
+                    $sublabel = '';
+                    $conjunction = $this->addDescription($conjunction, new SMWNegation($this->getSubqueryDescription($setsubsubNS, $sublabel)));
+                    /// TODO: print requests from negations currently are ignored, should be moved down
+                case '</n>':
+                break;
                 case '||': case '': case '</q>': // finish disjunction and maybe subquery
                     if ($this->m_defaultns !== NULL) { // possibly add namespace restrictions
                         if ( $hasNamespaces && !$mustSetNS) {
@@ -738,7 +746,7 @@ class SMWQueryParser {
      */
     protected function readChunk($stoppattern = '', $consume=true) {
         if ($stoppattern == '') {
-            $stoppattern = '\[\[|\]\]|::|:=|<q>|<\/q>|^' . $this->m_categoryprefix . '|\|\||\|';
+            $stoppattern = '\[\[|\]\]|::|:=|<q>|<\/q>|<n>|<\/n>|^' . $this->m_categoryprefix . '|\|\||\|';
         }
         $chunks = preg_split('/[\s]*(' . $stoppattern . ')[\s]*/', $this->m_curstring, 2, PREG_SPLIT_DELIM_CAPTURE);
         if (count($chunks) == 1) { // no matches anymore, strip spaces and finish
--- a/extensions/SemanticMediaWiki/includes/storage/SMW_Description.php
+++ b/extensions/SemanticMediaWiki/includes/storage/SMW_Description.php
@@ -763,3 +763,31 @@ class SMWSomeProperty extends SMWDescription {
     }
 }
+/**
+ * Description of a negation of a description.
+ */
+class SMWNegation extends SMWDescription {
+    protected $m_description;
+
+    public function SMWNegation(SMWDescription $description) {
+        $this->m_description = $description;
+    }
+
+    public function getDescription() {
+        return $this->m_description;
+    }
+
+
+    public function getQueryString() {
+        return '&lt;n&gt;' . $this->m_description->getQueryString() . '&lt;/n&gt;';
+    }
+
+    public function isSingleton() {
+        return false;
+    }
+
+    public function getDepth() {
+        return $this->m_description->getDepth() + 1;
+    }
+
+}
--- a/extensions/SemanticMediaWiki/includes/storage/SMW_SQLStore.php
+++ b/extensions/SemanticMediaWiki/includes/storage/SMW_SQLStore.php
@@ -1687,6 +1687,18 @@ class SMWSQLStore extends SMWStore {
                     $subwhere = '';
                 }
             }
+        } elseif ($description instanceof SMWNegation) {
+            $intpagetable = $db->tableName('page');
+            $intfrom = $intpagetable;
+            $intwhere = '';
+            $intcurtables = array('PAGE' => $intfrom);
+            $this->createSQLQuery($description->getDescription(), $intfrom, $subwhere, $db, $intcurtables, $nary_pos);
+            if ($subwhere != '') {
+                if ($where != '') {
+                    $where .= ' AND ';
+                }
+                $where .= 'NOT EXISTS (SELECT 1 FROM '.$intfrom.' WHERE ' . $subwhere . ')';
+            }
         } elseif ($description instanceof SMWSomeProperty) {
             $id = SMWDataValueFactory::getPropertyObjectTypeID($description->getProperty());
             $sort = false;

Unfortunately, in the limited time I had, I couldn't get the code to output correct SQL, so I just hope that someone can pick up the pieces and rework it into something functioning. I'm also not really sure what
[SMW-devel] Allow queries like (known_value or empty)?
Hi!

First I'd like to thank you for the great software that SMW is! SMW 1.0 seems to be in pretty good form, too!

I tried to upgrade my server (which has some custom modifications to SMW) and hit a limitation, which I'm not quite sure how to overcome. Maybe you have some suggestions.

I'd like to query attributes in a way where the attribute either has a certain value or is empty.

My use-case: I have a wiki for games, with each game encoding the number of players possible in two attributes: minTeilnehmer and maxTeilnehmer. Both of these may be empty, if there is no minimum number of players, or if the number of players is unlimited. Now I'd like to answer the question: If I have 10 people, which games can I play? (Assuming that, if both attributes are available, minTeilnehmer <= maxTeilnehmer)

As far as I know, this question cannot be answered currently in SMW.

I see several solutions to this problem:

1) Implement negations in queries - then I could ask (NOT maxTeilnehmer <= 9) AND (NOT minTeilnehmer >= 11). I used this solution with SMW 0.7 (patch below), but looking at the 1.0RC2 code, I'm not quite sure how to implement it cleanly.

2) Add a query that a certain property is not defined for a given page; I tried [[minTeilnehmer::!+]], but that doesn't work. If that worked, the query could be formulated as
   ([[minTeilnehmer::<10]] || [[minTeilnehmer::!+]]) ([[maxTeilnehmer::>10]] || [[maxTeilnehmer::!+]])

3) Set minTeilnehmer to 0 or 1 (OK) and set maxTeilnehmer to some large value (ugly). If SMW supported infinity (∞) as a value, this would be much nicer. Another possibility would be to set maxTeilnehmer to 0 in that case, use a disjunctive query (0 OR >= 10) and hide the zero.

I'd prefer 1) over 2) over 3), but I'm not sure about the database overhead for large tables - the implementation below uses a NOT EXISTS (SELECT * FROM $table WHERE $condition) for the negation. I'm no database guru, so no idea how bad this really is.

Any suggestions?

Thomas

BTW: You can see an example of the data at http://spiele.j-crew.de/wiki/Kategorie:Quiz
The query interface is http://spiele.j-crew.de/wiki/Spezial:Spielesuche
An example query is http://spiele.j-crew.de/wiki/Spezial:Spielesuche?ort=Stadtspiel&teilnehmerzahl=10&leiterzahl=&gruppe=13-15J&suche=Suche
(Sorry, the server is quite slow right now)

PS: The code I used with SMW 0.7 is approximately:

--- a/includes/SMW_InlineQueries.php
+++ b/includes/SMW_InlineQueries.php
@@ -651,12 +669,19 @@ class SMWInlineQuery {
             }
         } elseif ( ($this->mConditionCount < $smwgIQMaxConditions) && ($this->mTableCount < $smwgIQMaxTables) ) { // conjunct is a real condition
             $sq_title = '';
+            $negated = false;
+            $table = '';
+            if (mb_substr($qparts[2],0,1) == '!') {
+                $negated = true;
+                $qparts[2] = mb_substr($qparts[2],1);
+            }
+
             if (mb_substr($qparts[2],0,1) == '+') { // sub-query or wildcard search
                 $subq_id = mb_substr($qparts[2],1);
                 if ( ('' != $subq_id) && (array_key_exists($subq_id,$this->mSubQueries)) ) {
                     $sq = $this->parseQuery($this->mSubQueries[$subq_id]);
                     if ( ('' != $sq->mConditions) && ($this->mConditionCount < $smwgIQMaxConditions) && ($this->mTableCount < $smwgIQMaxTables) ) {
-                        $result->mTables .= ',' . $sq->mTables;
+                        $table = $sq->mTables;
                         if ( '' != $result->mConditions ) $result->mConditions .= ' AND ';
                         $result->mConditions .= '(' . $sq->mConditions . ')';
                         $sq_title = $sq->mSelect[1];
@@ -676,7 +701,7 @@ class SMWInlineQuery {
             $curtable = 't' . $this->mRename++; // alias for the current table
             if ($cat_sep == $op ) { // condition on category membership
-                $result->mTables .= ',' . $this->dbr->tableName('categorylinks') . " AS $curtable";
+                $table = $this->dbr->tableName('categorylinks') . " AS $curtable";
                 $condition = "$pagetable.page_id=$curtable.cl_from";
                 // TODO: make subcat-inclusion more efficient
                 foreach ($values as $idx => $v) {
@@ -688,7 +713,7 @@ class SMWInlineQuery {
                 }
             }
         } elseif ('::'