Re: [Wikitech-l] Need a way to modify text before indexing (was SearchUpdate)

2015-10-14 Thread vitalif
FWIW, we do index the full text of (PDF and?) DjVu files on Commons (because it's stored in img_metadata). It's probably the biggest improvement CirrusSearch brought to Commons. We also index office documents via Tika (*.doc and similar). And I think it should not be a feature of the

[Wikitech-l] Need a way to modify text before indexing (was SearchUpdate)

2015-10-14 Thread vitalif
I've written about my problem ~2 years ago: http://wikitech-l.wikimedia.narkive.com/6G0YPmWQ/need-a-way-to-modify-text-before-indexing-was-searchupdate It seems I've lost the latest message, so I want to answer it now: With lsearchd and Elasticsearch, we absolutely wouldn't want to munge

[Wikitech-l] Need a way to modify text before indexing (was SearchUpdate)

2014-01-14 Thread vitalif
Hi! Change https://gerrit.wikimedia.org/r/#/c/79025/ that was merged into 1.22 breaks my TikaMW extension - I used that hook to extract text from binary files so that users can then search their contents. Maybe you can add some other hook for this purpose? See also
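
A minimal sketch of the extraction step described above, assuming a "tika" command-line tool is installed on the server. The hook name is hypothetical - the whole point of the thread is that the old SearchUpdate hook is gone - but wfLocalFile(), wfShellExec() and wfEscapeShellArg() are real MediaWiki functions of that era:

$wgHooks['SearchUpdateText'][] = function ( $title, &$text ) {
    // 'SearchUpdateText' is an illustrative name, not a real core hook.
    $file = wfLocalFile( $title );
    if ( $file && $file->exists() ) {
        // Shell out to Apache Tika to pull plain text out of the binary file.
        $cmd = 'tika --text ' . wfEscapeShellArg( $file->getLocalRefPath() );
        $extracted = wfShellExec( $cmd, $retval );
        if ( $retval === 0 ) {
            $text .= "\n" . $extracted;
        }
    }
    return true;
};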

Re: [Wikitech-l] On your python vs php talk

2013-07-28 Thread vitalif
It's not bad design. It's bad only in theory, and simply different from strongly-typed languages. I even like its inconsistent function names - many of them are similar to C, and in most cases they're very easy to remember, as opposed to some other languages, including Python (!!).

Re: [Wikitech-l] ???!!! ResourceLoader loading extension CSS DYNAMICALLY?!!

2013-06-07 Thread vitalif
Hi! Sorry for not answering via a normal Reply - it's because I get messages in digests. But I want to say thanks for the clarification and for the position=top advice - it works fine with position=top. Thanks :)

[Wikitech-l] ???!!! ResourceLoader loading extension CSS DYNAMICALLY?!!

2013-06-05 Thread vitalif
Hello! I've got a serious issue with ResourceLoader. WHY is it made to load extension styles DYNAMICALLY using JavaScript? It's a very bad idea - it leads to page style flickering during load, i.e. first the page is displayed using only the skin CSS, and then you see how
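
For reference, the position=top fix acknowledged in the follow-up above corresponds to a real module option of that era: a module registered with 'position' => 'top' had its styles emitted in the document head rather than loaded dynamically. A sketch with made-up extension and module names:

$wgResourceModules['ext.myExtension'] = array(
    'styles' => 'ext.myExtension.css',
    'localBasePath' => __DIR__,
    'remoteExtPath' => 'MyExtension',
    // Emit with the page head instead of via JS, avoiding the style flicker.
    'position' => 'top',
);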

[Wikitech-l] Publish-staying-in-editmode feature for WikiEditor

2013-05-24 Thread vitalif
Hello! I have implemented an idea for the WikiEditor extension: replace the step-by-step publish feature with another one - publishing while staying in edit mode, via AJAX. You can see a demo at http://wiki.4intra.net/ if you want. It works simply by sending an API save-article request while NOT closing the

Re: [Wikitech-l] Removing the Hooks class

2013-04-05 Thread vitalif
You can't cache program state and loaded code like that in PHP. We explicitly have to abuse the autoloader and develop other patterns to avoid loading unused portions of code, because if we don't, our initialization takes unreasonably long. Yeah, I understand that; the idea was to serialize globals

Re: [Wikitech-l] Removing the Hooks class

2013-04-04 Thread vitalif
Hey, I'm curious what the list thinks of deprecating and eventually removing the Hooks class. Some relevant info:

/**
 * Hooks class.
 *
 * Used to supersede $wgHooks, because globals are EVIL.
 *
 * @since 1.18
 */
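
For context, the two registration styles at stake - the classic global registry and the static class meant to supersede it. The hook and method names below are illustrative, not from the thread:

// The classic global registry:
$wgHooks['ArticleSaveComplete'][] = 'MyExtension::onArticleSaveComplete';

// The equivalent via the Hooks class under discussion:
Hooks::register( 'ArticleSaveComplete', 'MyExtension::onArticleSaveComplete' );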

Re: [Wikitech-l] WebRequest and PHP bug 31892 fixed 6 years ago

2013-03-15 Thread vitalif
fixing bug 32621 is a todo. The first attempt failed, and some tweaks are needed to use the PathRouter to fix that bug. PathRouter allows custom paths to be defined and expanded. NamespacePaths is an example of one thing you can do (say, giving Help: pages a /help/ path), but you could also apply

Re: [Wikitech-l] WebRequest and PHP bug 31892 fixed 6 years ago

2013-03-15 Thread vitalif
And what is the point of making pretty URLs in the case of MediaWiki? I think they're already pretty much pretty in MediaWiki :) /edit/$1 is slightly prettier than ?action=edit, but as I understand it, it doesn't affect anything, not even SEO. And I don't think /help/$1 is any better than /Help:$1 at
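
The /edit/$1 style discussed here maps onto the real core setting $wgActionPaths; a minimal LocalSettings.php sketch (the web server still needs a rewrite rule routing these paths to index.php):

// Pretty action paths: /edit/Some_Page instead of /index.php?title=Some_Page&action=edit
$wgActionPaths['edit'] = "$wgScriptPath/edit/$1";
$wgActionPaths['history'] = "$wgScriptPath/history/$1";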

[Wikitech-l] WebRequest and PHP bug 31892 fixed 6 years ago

2013-03-13 Thread vitalif
Hello! WebRequest::getPathInfo() still works around PHP bug 31892, which was fixed 6 years ago. I.e. WebRequest uses REQUEST_URI instead of PATH_INFO, which has not been mangled since PHP 5.2.4. Yeah, Apache still collapses multiple /// into a single /, but AFAIK it does that for REQUEST_URI as well as
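
A plain-PHP illustration of the two sources being compared (no MediaWiki code involved):

// Since PHP 5.2.4 (bug 31892), PATH_INFO arrives unmangled and could be read directly:
$path = isset( $_SERVER['PATH_INFO'] ) ? $_SERVER['PATH_INFO'] : '';

// The workaround under discussion reconstructs the path from REQUEST_URI instead:
$path = parse_url( $_SERVER['REQUEST_URI'], PHP_URL_PATH );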

Re: [Wikitech-l] WebRequest and PHP bug 31892 fixed 6 years ago

2013-03-13 Thread vitalif
I doubt Daniel would have introduced it if it was unnecessary or pointless. From memory, I believe it was to improve the handling of paths over a wide range of setups and environments (where it would sometimes fail). You would need to git blame the file and find the revision where it was

Re: [Wikitech-l] Seemingly proprietary Javascript

2013-03-05 Thread vitalif
I would just like to note that while it may be silly or useless to insert licenses into minified JavaScript, it is nonetheless *legally required* to do so, regardless of the technical aspect of it. My 2 points: during my own research into free licenses, I've decided that for JS, a good

Re: [Wikitech-l] WikiEditor caching (??)

2013-02-18 Thread vitalif
It's also annoying that while the toolbar (normal or advanced) loads, I can't type in the header (for section=new) or the edit area, at least on Firefox* - is this the same problem? (*) Might also be a recent regression: https://bugzilla.mozilla.org/show_bug.cgi?id=795232 Maybe... It's also

Re: [Wikitech-l] Creating WOFF files -- sub-setting larger fonts

2013-02-17 Thread vitalif
By the way, I've just tried to use ttf2woff from fontutils to convert the Ubuntu TTF font to WOFF format for use in one of my projects. The resulting WOFF produced by this utility is not usable in any Linux browser (I tried Firefox, Chrome, Opera). I don't know if it works on Windows. And at the

Re: [Wikitech-l] Creating WOFF files -- sub-setting larger fonts

2013-02-17 Thread vitalif
Fontforge has an option to export fonts to WOFF format. Thanks, Fontforge worked even better than the online converter - a usable WOFF, and the size is 50 KB instead of 54 KB :-) [1] http://code.google.com/p/sfntly/ [2] http://code.google.com/p/sfntly/wiki/MicroTypeExpress As I

Re: [Wikitech-l] WikiEditor caching (??)

2013-02-16 Thread vitalif
vita...@yourcmc.ru wrote 2013-02-14 21:38: Hello Wiki Developers! I have a question: I think it's slightly annoying that WikiEditor only shows up some moment after the edit page loads, and that the textarea gets moved down (because WikiEditor is only built dynamically via JS). Do you think

Re: [Wikitech-l] Corporate needs are different (RE: How can we help Corporations use MW?)

2013-02-15 Thread vitalif
There are so many extensions useful to the enterprise, but probably also many which are not useful at all or not maintained. If I wanted to start a corporate wiki right now, I would probably be very lost as to what to look at and how people do things, so it seemed like a good idea to list the

Re: [Wikitech-l] Corporate needs are different (RE: How can we help Corporations use MW?)

2013-02-14 Thread vitalif
I guess this would not directly solve any of the problems listed, but would it be helpful to bring back to life https://www.mediawiki.org/wiki/Enterprise_hub ? It was started by somebody a year or two ago but seems to have been abandoned at the draft stage. I am thinking that if everybody adds some

[Wikitech-l] WikiEditor caching (??)

2013-02-14 Thread vitalif
Hello Wiki Developers! I have a question: I think it's slightly annoying that WikiEditor only shows up some moment after the edit page loads, and that the textarea gets moved down (because WikiEditor is only built dynamically via JS). Do you think it's possible to cache the generated
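
A rough sketch of the idea being asked about - keep the generated toolbar HTML in the object cache so it can be served with the page instead of being rebuilt by JavaScript on every edit-page load. This is not how WikiEditor actually worked; buildToolbarHtml() is a hypothetical generator, while wfGetCache() and wfMemcKey() are real functions of that era:

$cache = wfGetCache( CACHE_ANYTHING );
$key = wfMemcKey( 'wikieditor', 'toolbar', $wgLanguageCode );
$html = $cache->get( $key );
if ( $html === false ) {
    $html = buildToolbarHtml(); // hypothetical generator for the toolbar markup
    $cache->set( $key, $html, 86400 ); // keep it for a day
}
$out->addHTML( $html ); // $out is the edit page's OutputPage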

Re: [Wikitech-l] Stable PHP API for MediaWiki ?

2013-02-12 Thread vitalif
I understand from your comments that keeping things stable and preserving compatibility HAS been a priority for core developers, at least since Daniel's email. Is this really the case? If this is the case, it makes me wonder why I hear some complaints about it. Mariya, but did you hear that

Re: [Wikitech-l] Stable PHP API for MediaWiki ?

2013-02-11 Thread vitalif
1) removal of the global $action
2) removal of Xml::hidden()
3) broken Output::add() (had to migrate to ResourceLoader)
4) various parser tag bugs
5) removal of MessageCache::addMessage()
6) removal of ts_makeSortable() (JavaScript)
7) breakage of our WikiEditor adaptation
8) MediaWiki:common.js no more
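
Item 2 from this list has a direct replacement that is easy to show - Xml::hidden() was dropped in favour of the same helper on the Html class ('wpToken' here is just an example field name):

// Old, removed form:
// $field = Xml::hidden( 'wpToken', $token );
// Replacement that callers had to migrate to:
$field = Html::hidden( 'wpToken', $token );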

Re: [Wikitech-l] Corporate needs are different (RE: How can we help Corporations use MW?)

2013-02-08 Thread vitalif
1. A desire for a department to have their own space on the wiki. In our organisation (CUSTIS, Russia) we easily solve this by creating one primary wiki + separate ones for different departments. It's just a normal wiki family with shared code - a very simple solution, without any extensions. The
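
A minimal sketch of the shared-code wiki family setup described above; the host names and database names are made up for illustration:

// LocalSettings.php shared by the whole wiki family - per-wiki settings
// are chosen by the host name the request came in on.
switch ( $_SERVER['SERVER_NAME'] ) {
    case 'wiki.example.com':
        $wgDBname = 'wiki_main';
        $wgSitename = 'Main Wiki';
        break;
    case 'sales.wiki.example.com':
        $wgDBname = 'wiki_sales';
        $wgSitename = 'Sales Department Wiki';
        break;
}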

Re: [Wikitech-l] Corporate needs are different (RE: How can we help Corporations use MW?)

2013-02-08 Thread vitalif
In practice, we have found this doesn't work well for us (with thousands of employees). Yeah, our company doesn't have thousands of employees :-) Each department winds up writing its own wiki page about the same topic (say, Topic X), and they're all different. So it means most of your

Re: [Wikitech-l] Why are we still using captchas on WMF sites?

2013-01-22 Thread vitalif
Per the previous comments in this post, anything over 1% precision should be regarded as failure, and our Fancy Captcha was at 25% a year ago. So yeah, approximately all, and our captcha is well known to actually suck. Maybe you could just use reCAPTCHA instead of FancyCaptcha?
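
For what it's worth, ConfirmEdit shipped a ReCaptcha module at the time; a LocalSettings.php sketch from memory (check the extension's docs - the keys come from the reCAPTCHA service and are elided here):

require_once "$IP/extensions/ConfirmEdit/ConfirmEdit.php";
require_once "$IP/extensions/ConfirmEdit/ReCaptcha.php";
$wgCaptchaClass = 'ReCaptcha';
$wgReCaptchaPublicKey = '...';  // elided
$wgReCaptchaPrivateKey = '...'; // elided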

Re: [Wikitech-l] Why are we still using captchas on WMF sites?

2013-01-22 Thread vitalif
The problem is that reCAPTCHA (a) used as a service, would pass private user data to a third party; (b) is closed source, so we can't just put up our own instance. Has anyone reimplemented it, or any of it? There's piles of stuff on Wikisource we could feed it, for example. OK, then we can take

Re: [Wikitech-l] Why are we still using captchas on WMF sites?

2013-01-22 Thread vitalif
Luke Welling WMF wrote 2013-01-22 21:59: Even ignoring openness and privacy, exactly the same problems are present with reCAPTCHA as with Fancy Captcha. It's often very hard or impossible for humans to read, and it is a big enough target to have been broken by various people. It's very good to

Re: [Wikitech-l] Why are we still using captchas on WMF sites?

2013-01-22 Thread vitalif
It's very good to discuss, but what are the other options to minimize spam? (maybe I know one: find XRumer authors and tear their arms off... :-))

Re: [Wikitech-l] MediaWiki Extension Bundles and Template Bundles

2013-01-14 Thread vitalif
On 01/14/2013 10:20 AM, Yuvi Panda wrote: Is there a sort of 'Extension Bundle' that gets you baseline stuff that people who are used to Wikipedia 'expect'? ParserFunctions and Cite come to mind, but I'm sure there are plenty of others. I don't know if this would be relevant to your question,

Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?

2012-12-26 Thread vitalif
Actually, registration is open to everyone now by simple form submission. So actually, any one developer could get any change they wanted merged. All they need to do is trivially register a second labs account. Okay, but the current situation is also a problem, because with it, reviewing and

[Wikitech-l] Fwd: Re: How to speed up the review in gerrit?

2012-12-22 Thread vitalif
Sorry, I've replied to Sumana directly instead of the mailing list, so now I'm duplicating it to the mailing list. Sumana Harihareswara wrote 2012-12-19 22:30: Try these tips: https://www.mediawiki.org/wiki/Git/Code_review/Getting_reviews Sumana, it's all very good, but: 1) I think it's not so

[Wikitech-l] How to speed up the review in gerrit?

2012-12-19 Thread vitalif
Hello! On 28 SEPTEMBER I pushed minor changes to Gerrit, to the Drafts extension. Since then I've corrected two of them (uploaded patch set 2), but after that, nobody did a review. As I understand it, Gerrit will abandon changes after a month of inactivity, and that will happen tomorrow...

Re: [Wikitech-l] How to speed up the review in gerrit?

2012-12-19 Thread vitalif
Matma Rex wrote 2012-12-19 15:01: You could add people as reviewers, or personally ask someone to review, preferably someone who worked on the extension in the past. Okay, I've just done it... So, do you mean all committers just add random reviewers when they see no reaction?

Re: [Wikitech-l] How to speed up the review in gerrit?

2012-12-19 Thread vitalif
Antoine Musso wrote 2012-12-19 16:19: On 19/12/12 11:57, vita...@yourcmc.ru wrote: Hello! On 28 SEPTEMBER I pushed minor changes to Gerrit, to the Drafts extension. Since then I've corrected two of them (uploaded patch set 2), but after that, nobody did a review. As I understand it, Gerrit

Re: [Wikitech-l] Question about 2-phase dump

2012-11-25 Thread vitalif
Page history structure isn't quite immutable; revisions may be added or deleted, pages may be renamed, etc etc. Shelling out to an external process means when that process dies due to a dead database connection etc, we can restart it cleanly. Brion, thanks for clarifying it. Also, I want

[Wikitech-l] Question about 2-phase dump

2012-11-21 Thread vitalif
Hello! While working on my improvements to MediaWiki ImportExport, I've discovered a feature that is totally new to me: the 2-phase backup dump. I.e. the first-pass dumper creates an XML file without page texts, and the second-pass dumper adds the page texts. I have several questions about it - what

Re: [Wikitech-l] Question about 2-phase dump

2012-11-21 Thread vitalif
Brion Vibber wrote 2012-11-21 23:20: While generating a full dump, we're holding the database connection open for a long, long time. Hours, days, or weeks in the case of English Wikipedia. There are two issues with this: * the DB server needs to maintain a consistent snapshot of data since