[Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread H. Langos
Seems like (at least) the API of #pos in ParserFunctions is
different from the one in StringFunctions.

{{#pos: haystack|needle|offset}}

While the StringFunctions #pos in MediaWiki 1.14 returned an
empty string when the needle was not found, the ParserFunctions
implementation of #pos in svn now returns -1.

This is most unfortunate since current usage depends on this.
Example:

{{#if: {{#pos: abcd|b}} | found | not found }}

{{#if: {{#pos: abcd|x}} | found | not found }}

Now both of these examples will return found!
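
To make the difference concrete, here is a minimal sketch in Python that models the two behaviours described above. The function names are made up for illustration, and the #if model (a test passes when it is non-empty after trimming) is a simplification, not the actual extension code.

```python
def pos_stringfunctions(haystack: str, needle: str, offset: int = 0) -> str:
    """Model of the 1.14 StringFunctions #pos: empty string when not found."""
    index = haystack.find(needle, offset)
    return "" if index == -1 else str(index)


def pos_parserfunctions_svn(haystack: str, needle: str, offset: int = 0) -> str:
    """Model of the svn ParserFunctions #pos: "-1" when not found."""
    return str(haystack.find(needle, offset))


def wiki_if(test: str, then: str, otherwise: str) -> str:
    """Model of #if: the test passes when it is non-empty after trimming."""
    return then if test.strip() else otherwise


# Run both example calls against both modelled implementations.
for pos in (pos_stringfunctions, pos_parserfunctions_svn):
    for needle in ("b", "x"):
        result = wiki_if(pos("abcd", needle), "found", "not found")
        print(pos.__name__, needle, result)
```

With the old model the needle x gives "not found"; with the new model the string "-1" is non-empty, so the same call gives "found".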


Usage scenario:

I am trying to use #pos in template calls to implement a sort-of-database
functionality in MediaWiki.

I have a big template that contains data in named parameters.
Those parameters get passed along to a template that can select columns
by rendering some of those named parameters and ignoring others.

Now I want to implement row selection by passing along a parameter name
and a substring that should be in the value of that parameter in order
for the data to be rendered.

Something like this:

{{#if: {{#pos: {{{ {{{selectionattribute}}} }}} | {{{selectionvalue}}} }} | 
render_row | render_nothing }}

If I want this to work in different MediaWiki installations I need
to rely on the API of #pos.

Currently there seems to be no way to use #pos in a way that works
on both 1.14 and 1.15-svn.

cheers
-henrik


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread H. Langos
On Thu, Jun 04, 2009 at 03:55:38PM +0200, H. Langos wrote:
> Seems like (at least) the API of #pos in ParserFunctions is
> different from the one in StringFunctions.
>
> {{#pos: haystack|needle|offset}}
>
> While the StringFunctions #pos in MediaWiki 1.14 returned an
> empty string when the needle was not found, the ParserFunctions
> implementation of #pos in svn now returns -1.

I forgot to ask THE question. Is it a bug or is there some good reason 
to break backward compatibility?

And no, programming language cosmetics is not a good reason. :-)

If something has the same interface, it should have the same behaviour.
If the old semantics was too awful to bear, the new one should have been
called #strpos or #fpos (for forward-#pos; #rpos always had the
-1 return on not-found behaviour).


cheers
-henrik




Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Robert Rohde
On Thu, Jun 4, 2009 at 6:55 AM, H. Langos <henrik...@prak.org> wrote:
> Seems like (at least) the API of #pos in ParserFunctions is
> different from the one in StringFunctions.
>
> {{#pos: haystack|needle|offset}}
>
> While the StringFunctions #pos in MediaWiki 1.14 returned an
> empty string when the needle was not found, the ParserFunctions
> implementation of #pos in svn now returns -1.
<snip>

Prior to the merge, all of the StringFunctions functions were
reimplemented, principally for performance and security reasons.

The short but uninspired answer to your question is that in doing that
I didn't notice that #pos and #rpos had different default behavior.
Given the way that #if works, returning empty string is a reasonable
response to a string-not-found condition, and I am happy to change
that back.  I'll also recheck to make sure there aren't any other
unexpected behavioral changes.

Though they don't have to have the same behavior, I'd be inclined to
argue that #pos and #rpos really ought to have the same default
behavior on usability grounds, i.e. either both giving -1 or both
giving empty string when a match is not found.  Though since that does
create compatibility issues with existing StringFunctions users, I'll
defer to others about whether consistency would be a good enough
motivation in this case.


I should warn you though that there is an intentional behavioral
change regarding the handling of strip markers.  The pre-existing
StringFunctions codebase reacted to strip markers in a way that was
inefficient, hard for the end user to predict, and in specially
crafted cases created security issues.

The following example is illustrative of the change.

Consider the string ABC<nowiki>jkl</nowiki>DEF<nowiki>mno</nowiki>GHI

In the new implementation this is treated internally as ABCDEFGHI by
the string routines.  Hence its length is 9 and its first five
characters are ABCDE.

For complicated reasons the StringFunctions version says its length is
7 and the first five characters are ABCjklDEFmnoG.
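
The new behaviour can be modelled roughly as follows. This is only a sketch in Python: it assumes the example string is written with literal <nowiki> tags and simply drops those sections, whereas the real parser replaces them with opaque strip markers. The helper name is hypothetical.

```python
import re

# Assumption: <nowiki> sections stand in for strip markers in this model.
NOWIKI_SECTION = re.compile(r"<nowiki>.*?</nowiki>", re.DOTALL)


def visible_to_string_functions(text: str) -> str:
    """Return the text as the new string routines would see it."""
    return NOWIKI_SECTION.sub("", text)


example = "ABC<nowiki>jkl</nowiki>DEF<nowiki>mno</nowiki>GHI"
stripped = visible_to_string_functions(example)

print(stripped)       # ABCDEFGHI
print(len(stripped))  # 9
print(stripped[:5])   # ABCDE
```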

-Robert Rohde



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread H. Langos
On Thu, Jun 04, 2009 at 05:05:50PM +0100, Andrew Garrett wrote:
 
> On 04/06/2009, at 3:46 PM, H. Langos wrote:
>
> > On Thu, Jun 04, 2009 at 03:55:38PM +0200, H. Langos wrote:
> > > Seems like (at least) the API of #pos in ParserFunctions is
> > > different from the one in StringFunctions.
> > >
> > > {{#pos: haystack|needle|offset}}
> > >
> > > While the StringFunctions #pos in MediaWiki 1.14 returned an
> > > empty string when the needle was not found, the ParserFunctions
> > > implementation of #pos in svn now returns -1.
> >
> > I forgot to ask THE question. Is it a bug or is there some good reason
> > to break backward compatibility?
> >
> > And no, programming language cosmetics is not a good reason. :-)
> >
> > If something has the same interface, it should have the same behaviour.
> > If the old semantics was too awful to bear, the new one should have been
> > called #strpos or #fpos (for forward-#pos; #rpos always had the
> > -1 return on not-found behaviour).
>
> This should be left as a comment on the relevant revision in
> CodeReview. Note that it's likely irrelevant anyway, as, in all
> likelihood, the merge of String and Parser Functions will be reverted.

Sorry to bother you, but I am not a Wikimedia developer so I wouldn't know
where to start looking.

Could you point me to the right place/list/article? The svn revision with 
the String and Parser Functions merge was 50997.

cheers
-henrik




Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Robert Rohde
On Thu, Jun 4, 2009 at 9:05 AM, Andrew Garrett <agarr...@wikimedia.org> wrote:
>
> On 04/06/2009, at 3:46 PM, H. Langos wrote:
>
> > On Thu, Jun 04, 2009 at 03:55:38PM +0200, H. Langos wrote:
> > > Seems like (at least) the API of #pos in ParserFunctions is
> > > different from the one in StringFunctions.
> > >
> > > {{#pos: haystack|needle|offset}}
> > >
> > > While the StringFunctions #pos in MediaWiki 1.14 returned an
> > > empty string when the needle was not found, the ParserFunctions
> > > implementation of #pos in svn now returns -1.
> >
> > I forgot to ask THE question. Is it a bug or is there some good reason
> > to break backward compatibility?
> >
> > And no, programming language cosmetics is not a good reason. :-)
> >
> > If something has the same interface, it should have the same behaviour.
> > If the old semantics was too awful to bear, the new one should have been
> > called #strpos or #fpos (for forward-#pos; #rpos always had the
> > -1 return on not-found behaviour).
>
> This should be left as a comment on the relevant revision in
> CodeReview. Note that it's likely irrelevant anyway, as, in all
> likelihood, the merge of String and Parser Functions will be reverted.

Two devs, who shall remain nameless unless they choose to take credit
for it, explicitly encouraged the merge.  Personally, I've always
thought it made more sense to keep these as separate extensions but I
went along with what they encouraged me to do.

Regardless of whether it is one extension or two, I do strongly feel
that once a technically acceptable implementation of string functions
exists then it should be enabled on WMF sites.  (I agree though that
the previous StringFunctions was rightly excluded due to
implementation problems.)

-Robert Rohde



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Aryeh Gregor
On Thu, Jun 4, 2009 at 12:05 PM, Andrew Garrett <agarr...@wikimedia.org> wrote:
> Note that it's likely irrelevant anyway, as, in all
> likelihood, the merge of String and Parser Functions will be reverted.

Have Tim or Brion said this?
https://bugzilla.wikimedia.org/show_bug.cgi?id=6455#c36 is the only
clear statement I've seen by either of them that I can recall.



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Aryeh Gregor
On Thu, Jun 4, 2009 at 2:29 PM, Brian <brian.min...@colorado.edu> wrote:
> I was privy to a #mediawiki conversation between brion/tim where tim pointed
> out that at least one person plans to implement a Natural Language
> Processing parser for English using StringFunctions just as soon as they are
> enabled.
>
> It's pretty obvious that you can implement all sorts of crazy algorithms using
> StringFunctions. They need to be limited so that is not possible.

Note, though, that there are some that are already possible to some
extent.  You can use the core padright/padleft functions to emulate a
couple of the added functions.  E.g.:

http://en.wikipedia.org/w/index.php?title=Template:Str_len&action=edit
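
As a rough illustration of how padding alone can yield a string length, here is a sketch in Python. It assumes padleft accepts a multi-character pad string and cuts it to the missing width; the padleft and str_len functions below are simplified models written for this example, not the actual core function or the template linked above, which is more elaborate.

```python
def padleft(value: str, length: int, pad: str = "0") -> str:
    """Rough model of {{padleft:value|length|pad}} with a multi-character pad."""
    missing = length - len(value)
    if missing <= 0 or not pad:
        return value
    repeats = -(-missing // len(pad))  # ceiling division
    return (pad * repeats)[:missing] + value


def str_len(text: str, limit: int = 500) -> int:
    """Find the length using only padleft and equality comparisons.

    Padding an empty value to width n with the text itself yields exactly
    n characters, so the result equals the text only when n is its length.
    """
    if text == "":
        return 0
    for n in range(1, limit + 1):
        if padleft("", n, text) == text:
            return n
    raise ValueError("text is longer than the search limit")


print(padleft("xyz", 5, "abc"))  # abxyz
print(str_len("hello"))          # 5
```

The linear search is the point: without a real length function, every answer costs a chain of comparisons, which is part of why such templates are expensive.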

The most template-heavy pages already tend to run close to the
template limits, until they're cut down by users when they fail.  It's
not clear to me that allowing more functions would actually increase
overall load or template complexity significantly.  It might decrease
it by allowing simpler and more efficient implementations of things
that currently need to be worked around.  It can't really increase it
too much, theoretically -- that's what the template limits are for.

Werdna points out that Tim did say this morning in #mediawiki that
he'd probably revert the change.



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Robert Rohde
On Thu, Jun 4, 2009 at 11:29 AM, Brian <brian.min...@colorado.edu> wrote:
> I was privy to a #mediawiki conversation between brion/tim where tim pointed
> out that at least one person plans to implement a Natural Language
> Processing parser for English using StringFunctions just as soon as they are
> enabled.
>
> It's pretty obvious that you can implement all sorts of crazy algorithms using
> StringFunctions. They need to be limited so that is not possible.

If you are referring to the conversation I think you are, then my
impression was Tim was speaking hypothetically about the issue rather
than knowing someone that had this specific intent.

I'm fairly dubious about anyone actually trying natural language
processing to any serious degree.  Real natural language processing
needs huge lookup tables to identify part of speech and relationships
etc.  Technically possible I suppose, but not easy to do.

I'm even more dubious that full-fledged natural language processing --
in templates -- would find significant uses.  It is more efficient and
more practical to view templates as simple formatting macros rather
than as a system for real natural language interaction.  There are
very useful things that can be done with simple string algorithms,
such as detecting the "(bar)" when given a title like "Foo (bar)", but
I wouldn't expect anyone to be answering queries with them or anything
like that.
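
For a sense of scale, the title case above needs only a couple of substring operations. A minimal sketch in Python (the function name and the exact matching rule are assumptions made for illustration):

```python
from typing import Tuple


def split_disambiguator(title: str) -> Tuple[str, str]:
    """Return (base, qualifier); qualifier is "" without a trailing "(...)"."""
    title = title.strip()
    start = title.rfind(" (")
    # Only treat the parenthesis as a qualifier if it closes the title
    # and some base name precedes it.
    if title.endswith(")") and start > 0:
        return title[:start], title[start + 2:-1]
    return title, ""


print(split_disambiguator("Foo (bar)"))  # ('Foo', 'bar')
print(split_disambiguator("Foo"))        # ('Foo', '')
```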

When providing tools to content creators, flexibility is generally a
positive design feature.  We shouldn't go overboard with imposing
limits in advance of actual problems.

The current implementation is, however, artificially limited to strings
of 1000 characters or fewer, which does prevent huge manipulations.

-Robert Rohde



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Robert Rohde
On Thu, Jun 4, 2009 at 11:52 AM, Aryeh Gregor
<simetrical+wikil...@gmail.com> wrote:
> Note, though, that there are some that are already possible to some
> extent.  You can use the core padright/padleft functions to emulate a
> couple of the added functions.  E.g.:
>
> http://en.wikipedia.org/w/index.php?title=Template:Str_len&action=edit
<snip>

I would like to note for the record that Brion explicitly endorsed
the padleft hack to the degree that he re-enabled it after Werdna had
removed it. [1]

Maybe he'd change his mind after looking at how the string
manipulation templates are actually getting used (now in 20,000
enwiki pages and counting), but for the moment he seems to have
supported allowing some form of hacked-together string manipulation
system into MediaWiki.  To that end it makes more sense to have a real
string implementation rather than the ridiculous templates we have
now.

-Robert Rohde

[1] http://svn.wikimedia.org/viewvc/mediawiki?view=rev&revision=47411



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Andrew Garrett

On 04/06/2009, at 8:03 PM, Robert Rohde wrote:
> I would like to note for the record that Brion explicitly endorsed
> the padleft hack to the degree that he re-enabled it after Werdna had
> removed it. [1]
>
> Maybe he'd change his mind after looking at how the string
> manipulation templates are actually getting used (now in 20,000
> enwiki pages and counting), but for the moment he seems to have
> supported allowing some form of hacked-together string manipulation
> system into MediaWiki.  To that end it makes more sense to have a real
> string implementation rather than the ridiculous templates we have
> now.

I wouldn't read that into it. I think it's better characterised as  
reverting attempts to create an arms race over the hacks.

--
Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us



