Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Max Semenik
On Tue, Sep 19, 2017 at 8:37 PM, Tim Starling 
wrote:

> According to people on the ops team who have worked with them
> recently, they stopped working on the open source product altogether.
> They stopped responding to bug reports.


This makes it sound as if your estimation of one year to migrate off HHVM
is too long and we need to run for it even sooner.


-- 
Best regards,
Max Semenik ([[User:MaxSem]])
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Tim Starling
On 20/09/17 12:19, C. Scott Ananian wrote:
> On Sep 19, 2017 9:45 PM, "Tim Starling"  wrote:
> 
> Facebook have been inconsistent with
> HHVM, and have made it clear that they don't intend to cater to our needs.
> 
> 
> I'm curious: is this conclusion based on your recent meeting with them, or
> on past behavior?  Their recent announcement had a lot of "we know we
> haven't been great, but we promise to change" stuff in it ("reinvest in
> open source") and I'm curious to know if they enumerated concrete steps
> they planned to take, or whether even in your most recent meeting with them
> they failed to show actual interest.

"Have been inconsistent" refers to their past behaviour. "Don't intend
to cater" refers to the meeting and announcement.

According to people on the ops team who have worked with them
recently, they stopped working on the open source product altogether.
They stopped responding to bug reports. By "reinvest in open source"
they are apologising for that and promising to start reading their bug
mail again. This was discussed in the meeting.

-- Tim Starling



Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread C. Scott Ananian
On Sep 19, 2017 9:45 PM, "Tim Starling"  wrote:

Facebook have been inconsistent with
HHVM, and have made it clear that they don't intend to cater to our needs.


I'm curious: is this conclusion based on your recent meeting with them, or
on past behavior?  Their recent announcement had a lot of "we know we
haven't been great, but we promise to change" stuff in it ("reinvest in
open source") and I'm curious to know if they enumerated concrete steps
they planned to take, or whether even in your most recent meeting with them
they failed to show actual interest.
  --scott

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Tim Starling
On 20/09/17 02:40, C. Scott Ananian wrote:
> For example, the top-line github stats are:
> hhvm: 504 contributors (24,192 commits)
> php-src: 496 contributors (104,566 commits)

There's a reason I've contributed loads of code to HHVM and hardly any
to PHP. Although the HHVM folks were sometimes tied up in internal
work and not available, when they were available, my interactions with
them were very pleasant. They were enthusiastic about my
contributions, and were happy to have long design discussions on IRC.
Code review sometimes had a lot of back and forth, but at least they
were positive and engaged. The people I dealt with in code review
usually had it as their job to accept community contributions. My bug
reports were respectfully treated.

It was a big contrast to my interactions with the PHP community, which
were so often negative. For example, there was Jani's toxic behaviour on
the bug tracker: closing bugs as "bogus" despite their being serious and
reproducible, usually because he didn't understand them technically.
Even with other maintainers, I had to fight several times to keep
serious bugs open. I had no illusions that they would ever be fixed, I
just wanted them to be open for my reference and for the benefit of
anyone hitting the same issue. I filed bugs as "documentation issues",
requesting that undesired behaviour be documented in the manual, since
they were more likely to stay open that way.

My interactions with Derick Rethans were quite unpleasant; he would
not even consider accepting the code I wrote to fix a DoS
vulnerability which was affecting us constantly. He wouldn't provide a
code review, he just rejected it on principle. Instead he wrote his
own version of it a couple of years later. He seemed to think that
every line of code in the date module should be attributable to him.

Design discussions are apparently concentrated on the internals
mailing list, where there is an incredible amount of negativity for
any new idea. Developers really need a lot of energy to keep answering
negative comments, over and over for a period of months, in order to
get their RFCs accepted. Some language features which eventually made
it into PHP, such as short arrays, were shot down many times on the
mailing list before they found a champion who was sufficiently brave
and influential.

Their code review practices were quite archaic; I don't know if
they've improved. Their coding style is also dated.

Stas was great, a bright spot in a dismal field, which was why I was
so keen to hire him.

So I'm not looking forward to returning to PHP. But at least we know
what we are getting ourselves into. Being community-driven means it
has inertia; change is slow. Facebook have been inconsistent with
HHVM, and have made it clear that they don't intend to cater to our needs.

-- Tim Starling



Re: [Wikitech-l] Mapping Hiragana and Katakana

2017-09-19 Thread Trey Jones
>
> Anyway, would it be a big deal to show the transliterated results with
> less weight in ranking?


Doing any special weighting would be more difficult, but they would already
be naturally ranked lower for not being exact matches. (You can see this at
work if you compare the results for *resume, resumé,* and *résumé* on
English Wikipedia, for example.)
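
A minimal sketch of the folding idea, in Python: decompose each string to
NFD and drop the combining marks, so that *resume*, *resumé* and *résumé*
share one folded key while the original strings remain distinct (which is
what lets an exact match rank above a folded one). This is an illustration
under stated assumptions, not the analysis chain the search backend actually
uses; `fold_accents` is a hypothetical helper.

```python
import unicodedata


def fold_accents(text):
    """Drop combining marks after canonical decomposition (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


queries = ["resume", "resumé", "résumé"]

# Map each original query to its folded form. The originals stay different,
# so a ranker can still prefer documents that match the unfolded string.
folded = {query: fold_accents(query) for query in queries}
```

All three queries end up with the same folded key, so each can retrieve the
others' documents, but only the exact spelling matches on the unfolded field.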

> Actually, add an option button in advanced search in any case, and just
> limit the discussion to whether it should be opt-in or opt-out.


There are longer term plans for revamping advanced search capabilities, so
if we want to go that route, it's doable, but it would definitely be on
hold for a while. Options that have been mentioned include a special case
keyword like "kana:オオカミ", or a more generic keyword like "phonetic:オオカミ"
that was smart enough to know what to do with kana, but might do something
different with other characters... but that's all at the vague ideation
stage right now.

Thanks!


Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation

On Tue, Sep 19, 2017 at 8:29 PM, mathieu stumpf guntz <
psychosl...@culture-libre.org> wrote:

>
>
> On 19/09/2017 at 23:47, Trey Jones wrote:
>
> We recently got a suggestion via Phabricator[1] to automatically map
> between hiragana and katakana when searching on English Wikipedia and other
> wiki projects. As an always-on feature, this isn't difficult to implement,
> but major commercial search engines (Google.jp, Bing, Yahoo Japan,
> DuckDuckGo, Goo) don't do that. They give different results when searching
> for hiragana/katakana forms (for example, オオカミ/おおかみ "wolf"). They also give
> different *numbers* of results, seeming to indicate that it's not just
> re-ordering the same results (say, so that results in the same script are
> ranked higher).[2] I want to know what they know that I don't!
>
> Does anyone have any thoughts on whether this would be useful (seems that
> it would) and whether it would cause any problems (it must, or otherwise
> all the other search engines would do it, right?).
>
> Well, maybe. Or not. Look at how DuckDuckGo continues to offer only a
> "country" option for filtering *languages*. Both might be complementary,
> but personally I'm generally more interested in the latter, all the more
> when I'm using a language that no country uses as an official language.
> :)
>
> Anyway, would it be a big deal to show the transliterated results with less
> weight in ranking? Actually, add an option button in advanced search in any
> case, and just limit the discussion to whether it should be opt-in or opt-out.
>
> Any idea why it might be different between a Japanese-language wiki and a
> non-Japanese-language wiki? We often are more aggressive in matching
> between characters that are not native to a given language--for example,
> accents on Latin characters are generally ignored on English-language
> wikis. So it might make sense to merge hiragana and katakana on
> English-language wikis but not Japanese-language wikis.
>
> Thanks very much for any suggestions or information!
> —Trey
>
>
> どういたしまして。 ("You're welcome.")
>
>
>
> [1] https://phabricator.wikimedia.org/T176197
> [2] Details of my tests at https://phabricator.wikimedia.org/T173650#3580309
>
> Trey Jones
> Sr. Software Engineer, Search Platform
> Wikimedia Foundation

Re: [Wikitech-l] Mapping Hiragana and Katakana

2017-09-19 Thread mathieu stumpf guntz



On 19/09/2017 at 23:47, Trey Jones wrote:

> We recently got a suggestion via Phabricator[1] to automatically map
> between hiragana and katakana when searching on English Wikipedia and other
> wiki projects. As an always-on feature, this isn't difficult to implement,
> but major commercial search engines (Google.jp, Bing, Yahoo Japan,
> DuckDuckGo, Goo) don't do that. They give different results when searching
> for hiragana/katakana forms (for example, オオカミ/おおかみ "wolf"). They also give
> different *numbers* of results, seeming to indicate that it's not just
> re-ordering the same results (say, so that results in the same script are
> ranked higher).[2] I want to know what they know that I don't!
>
> Does anyone have any thoughts on whether this would be useful (seems that
> it would) and whether it would cause any problems (it must, or otherwise
> all the other search engines would do it, right?).

Well, maybe. Or not. Look at how DuckDuckGo continues to offer only a
"country" option for filtering *languages*. Both might be complementary,
but personally I'm generally more interested in the latter, all the more
when I'm using a language that no country uses as an official language. :)

Anyway, would it be a big deal to show the transliterated results with less
weight in ranking? Actually, add an option button in advanced search in any
case, and just limit the discussion to whether it should be opt-in or opt-out.



> Any idea why it might be different between a Japanese-language wiki and a
> non-Japanese-language wiki? We often are more aggressive in matching
> between characters that are not native to a given language--for example,
> accents on Latin characters are generally ignored on English-language
> wikis. So it might make sense to merge hiragana and katakana on
> English-language wikis but not Japanese-language wikis.
>
> Thanks very much for any suggestions or information!
> —Trey


どういたしまして。 ("You're welcome.")


> [1] https://phabricator.wikimedia.org/T176197
> [2] Details of my tests at https://phabricator.wikimedia.org/T173650#3580309
>
> Trey Jones
> Sr. Software Engineer, Search Platform
> Wikimedia Foundation

Re: [Wikitech-l] Is it possible to edit scribunto data module content through template edit popup of visual editor?

2017-09-19 Thread mathieu stumpf guntz



On 19/09/2017 at 15:29, Brad Jorsch (Anomie) wrote:
> On Tue, Sep 19, 2017 at 2:48 AM, mathieu stumpf guntz <
> psychosl...@culture-libre.org> wrote:
>
>> But having the ability to write a limited number of bytes to a single data
>> module per script call, and possibly other safeguard limits, wouldn't be
>> that risky, would it?
>
> It would break T67258. I also
> think it's probably a very bad idea to be trying to have page parses make
> edits to the wiki.

Well, actually, depending on what you mean by "have page parses make
edits to the wiki", I'm not sure that what I'm looking for falls under
this umbrella.


What I would like is a way to do something like

local data = mw.loadData( 'Module:Name/data/entry' )
-- do some stuff with `data`
-- mw.saveData is the function proposed here; Scribunto does not provide it
mw.saveData( 'Module:Name/data/entry', data )

That's it.






>> If it's not, please provide me some feedback on the proposal to add such
>> a function, and if I should document such a proposal elsewhere, please let
>> me know.
>
> You're free to file a task in Phabricator, but it will be closed as
> Declined. There are too many potential issues there for far too little
> benefit.

If you do think it's a horrible, awkward, awfully disgusting idea, I
would be interested to know more technical details on what problems it
might lead to (not the global result of nightmarish hell on earth that
I'm obviously targeting).


> Your initial idea of somehow hooking into the editor (whether that's the
> wikitext editor or VE) with JavaScript to allow humans to make edits to the
> data module while editing another page was much better.

I didn't even think about JS, actually. For the removal of the updating
parameter from the wikitext, I had in mind some in-place template substitution.
Does JavaScript allow changing another page on the wiki, especially a
data module, at some point when the user edits/saves an article?








[Wikitech-l] Technical Debt SIG - Agenda

2017-09-19 Thread Jean-Rene Branaa
Hello All,

Over the past week there's been a significant increase in the number of
folks interested in participating in the upcoming Technical Debt SIG
sessions.  As a result, there's also been a fair amount of discussion on
the challenges and value of having a large number of participants in a
meeting.

Despite these potentially large numbers, I've decided to move forward with
the sessions.  However, we will be pivoting a bit on the intent of the
meeting.  I've also decided that we will offer up an IRC meeting following
the two Hangout/Bluejeans sessions for those that prefer that platform.

That being said, I think it's important to note that the Technical Debt SIG
is more than a meeting.  The plan is to provide many avenues of engagement
in an attempt to be as inclusive as possible.  What this means is that you
need not worry about "missing out" if you don't attend a SIG session.
You'll have access to the same information through other collaborative
channels such as Wiki pages, newsletters, Google docs, etc...

For this week's sessions, consider them a Kickoff for the Technical
Debt SIG and general information sharing about the Technical Debt program.
Again, all this information is or will be available via other channels as
well.  We encourage you to participate in a way that suits your style best.

Agenda -

- Purpose of the Tech Debt SIG
- Overview of Tech Debt program
- What to expect moving forward
- Q&A


Cheers,

JR

[Wikitech-l] Mapping Hiragana and Katakana

2017-09-19 Thread Trey Jones
We recently got a suggestion via Phabricator[1] to automatically map
between hiragana and katakana when searching on English Wikipedia and other
wiki projects. As an always-on feature, this isn't difficult to implement,
but major commercial search engines (Google.jp, Bing, Yahoo Japan,
DuckDuckGo, Goo) don't do that. They give different results when searching
for hiragana/katakana forms (for example, オオカミ/おおかみ "wolf"). They also give
different *numbers* of results, seeming to indicate that it's not just
re-ordering the same results (say, so that results in the same script are
ranked higher).[2] I want to know what they know that I don't!

Does anyone have any thoughts on whether this would be useful (seems that
it would) and whether it would cause any problems (it must, or otherwise
all the other search engines would do it, right?).

Any idea why it might be different between a Japanese-language wiki and a
non-Japanese-language wiki? We often are more aggressive in matching
between characters that are not native to a given language--for example,
accents on Latin characters are generally ignored on English-language
wikis. So it might make sense to merge hiragana and katakana on
English-language wikis but not Japanese-language wikis.
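
As a minimal sketch of what such a mapping could look like (in Python, for
illustration only): in Unicode, the main katakana letters sit exactly 0x60
code points above their hiragana counterparts, so a simple offset converts
one script to the other. The function name and the choice to normalize
toward hiragana are assumptions made for this example; a real analyzer would
also have to decide what to do with characters that have no counterpart,
such as the prolonged sound mark ー, which this sketch leaves unchanged.

```python
KATAKANA_FIRST = 0x30A1  # ァ
KATAKANA_LAST = 0x30F6   # ヶ
KANA_OFFSET = 0x60       # distance between a katakana letter and its hiragana twin


def katakana_to_hiragana(text):
    """Map katakana letters to hiragana; leave every other character as-is."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_FIRST <= ord(ch) <= KATAKANA_LAST
        else ch
        for ch in text
    )


# Both spellings of "wolf" normalize to the same key.
wolf_katakana = katakana_to_hiragana("オオカミ")
wolf_hiragana = katakana_to_hiragana("おおかみ")
```

Indexing and querying on the normalized form would make the two spellings
match each other; keeping the original form as well would still allow
same-script results to be ranked higher.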

Thanks very much for any suggestions or information!
—Trey

[1] https://phabricator.wikimedia.org/T176197
[2] Details of my tests at https://phabricator.wikimedia.org/T173650#3580309

Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Brad Jorsch (Anomie)
On Tue, Sep 19, 2017 at 2:41 PM, C. Scott Ananian 
wrote:

> You say "there's not much migration cost moving to PHP7" --
> well, it would be nice to assign someone to go through the details in a
> little more detail to double-check that.


What migration costs there were for PHP7 have probably already been paid,
since, as has already been noted, several developers are already running on
PHP 7 (myself included). There's nothing much open on the NewPHP
workboard,[1] and what is still open seems to be false positives (T120336,
T173850, T173849, T120694), mostly done already (T153505), deprecations of
stuff we only still have for backwards compatibility (T120333, T143788),
irrelevant (T174199), or tracking backports (T174262).

[1]: https://phabricator.wikimedia.org/project/board/346/


> The HHVM announcement specifically mentioned that they will maintain
> compatibility with composer and phpunit, so that seems to be a wash.
>

They specifically mentioned they'd maintain compatibility with *current
versions* of composer and phpunit *until replacements exist*. No specific
criteria for whether a replacement is good enough have been supplied.

They also imply that they may not support full use of those tools, versus
only features of the tools required for whatever use cases they decide to
support.


> [... much discussion of garbage collection ...] It may be a good
> opportunity to take a hard look at our
> Hooks system and figure out if its design is future-proof.
>

I note our hook system has nothing to do with garbage collection or
destructors. It does rely on references, since that's how PHP handles
output parameters.[1] And in particular, explicit references are needed to
handle output parameters combined with call_user_func_array().

Garbage collection and destructors do make a major difference to the use of
RAII patterns[2] such as ScopedCallback and our database classes.

[1]: https://en.wikipedia.org/wiki/Output_parameter
[2]: https://en.wikipedia.org/wiki/RAII
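
To make the RAII point concrete, here is a rough sketch in Python rather
than PHP. It ties cleanup to an explicit scope instead of a destructor,
which is the kind of rewrite destructor-based code would need if destructors
stopped running at a predictable time. The class borrows the ScopedCallback
name only as an analogy; it is not MediaWiki's implementation.

```python
class ScopedCallback:
    """Run a callback when the enclosing `with` block ends (illustrative analog)."""

    def __init__(self, callback):
        self._callback = callback

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs at scope exit whether or not an exception was raised,
        # independent of when the object itself is garbage collected.
        self._callback()
        return False  # do not swallow exceptions


events = []
with ScopedCallback(lambda: events.append("cleanup")):
    events.append("work")
```

With destructor-based RAII the cleanup is implicit in the object's lifetime;
here it is explicit in the block structure, so it does not depend on
reference counting.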


-- 
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Stas Malyshev
Hi!

> IMO the more interesting discussion to be had is how little we invest into
> the technology our whole platform is based on. You'd think the largest
> production user of PHP would pay at least one part-time PHP developer, or
> try to represent itself in standards and roadmap discussions, but we do
> not. Is that normal?

I think this is a very good point. There are other ways to support PHP
too. Like assisting in regular testing of upcoming versions. Helping
writing the docs. Contributing to RFC discussions (we have a large
codebase, heavily visited site, and a lot of experience dealing with it,
surely there could be a thing or two we could contribute). Triaging the
bugs (one of the most necessary, thankless, and under-appreciated jobs
in an open-source project). Probably more...
-- 
Stas Malyshev
smalys...@wikimedia.org


Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Stas Malyshev
Hi!

> So to be super clear: I'm just pointing out that there used to be issues
> here; sometimes the community's interests do not exactly align.  Consider
> me in the devil's advocate role again: I'd be interested to hear an
> insider's opinion (Stas?) on how security issues are handled these days and
> what the future outlook is,
> https://www.cvedetails.com/vulnerability-list/vendor_id-74/product_id-128/PHP-PHP.html

OK, so it's a big topic but I can do some quick survey and we can get
deeper if you want to. So:

Despite what you see there, there are not a lot of genuine security
issues in PHP. Unfortunately, CVE issuance process is such that it does
not involve any consultation with the vendor about classifying the issue
and assigning severity. You can just create CVE ID for anything and how
they assign severity is kinda mystery to me, but one thing is clear -
PHP core team is not consulted about it, at least not in any way I've
ever noticed. It doesn't mean those are all wrong, but some probably
are, and a lot are misclassified.

Now, for the substance of the issues. Many of them are unserialize()
issues. For this, without getting too far into the weeds, I can only
say this: there seems to be no way to make unserialize() robust against
untrusted data, and it has to do with how references, object
construction, object destruction (notice the similarity with what Hack
intends to drop? It is not a coincidence) and serialization support
work in PHP. There is too much internal structure exposed to make it
work with untrusted data now. Maybe if we redesigned the whole thing
from scratch, it _could_ be possible, but even then I kinda doubt it, at
least not without sacrificing some feature support. For now, the best
approach is to consider any use of unserialize() on untrusted data
inherently insecure. It's just too low-level to be sure no corner case
ever does anything strange.
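
The same advice has a direct analog in other languages. As a hedged sketch
in Python (whose own pickle documentation warns against unpickling untrusted
data), the safer pattern is to accept untrusted input only in a data-only
format such as JSON and then validate its shape, since a JSON parser builds
plain values and never constructs or destroys arbitrary objects.
`load_untrusted_settings` and the accepted shape are hypothetical, chosen
only for illustration.

```python
import json


def load_untrusted_settings(raw):
    """Parse untrusted text as plain data and validate its shape (hypothetical helper)."""
    value = json.loads(raw)  # yields only dicts, lists, strings, numbers, booleans, None
    if not isinstance(value, dict):
        raise ValueError("expected a JSON object")
    for key, item in value.items():
        # Accept flat scalar settings only; reject nested structures and null.
        if not isinstance(item, (str, int, float, bool)):
            raise ValueError(f"unsupported value for {key!r}")
    return value


settings = load_untrusted_settings('{"theme": "dark", "limit": 50}')
```

Anything that is not a flat object of scalars is rejected up front, rather
than relying on the deserializer to survive hostile input.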

Some of those are genuine security problems, which now mostly
concentrate in several extensions that have either been under-served or
have very wide exposed areas. Namely: wddx, an obscure format that suffers
from issues similar to unserialize() and in addition lacks
maintainership; phar, which wasn't really designed to deal with
untrusted scripts but seems to kinda be moving in that direction; and gd,
which mostly relies on libgd, so every issue there automatically
extends into PHP. The latter two would probably benefit from some good
fuzz testing, but nobody has taken it on yet. I suspect if hhvm uses the
same or derived code - and it likely does, I don't think they
reimplemented libgd from scratch? - many of those would also be present
in hhvm if hhvm supports these (no idea).

There are also some long-standing debates about how randomness is
handled (basically, there are many ways to get random numbers, and most of
them are not suitable for security-related randomness) and some DoS issues in
PHP hash tables - the latter are mostly resolved, but in a kinda temporary
way, so there's more work to be done.
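
The distinction between general-purpose and security-grade randomness is not
specific to PHP. As an illustrative sketch in Python (an analogy, not PHP's
API): the `random` module is a seeded, reproducible generator that its
documentation says should not be used for security purposes, while the
`secrets` module draws from the operating system's cryptographic source.
`new_session_token` is a hypothetical helper.

```python
import random
import secrets


def new_session_token():
    """Return an unpredictable token for security-sensitive use (hypothetical helper)."""
    return secrets.token_hex(16)  # 16 random bytes rendered as 32 hex characters


# A general-purpose generator is reproducible from its seed, which is useful
# for simulations and tests but is exactly what makes it unfit for secrets.
first = random.Random(1234)
replay = random.Random(1234)
same_sequence = [first.random() for _ in range(3)] == [replay.random() for _ in range(3)]

token = new_session_token()
```

Anyone who learns or guesses the seed can replay the first generator's
output; the token from `secrets` has no seed to recover.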

Some of these issues - like https://bugs.php.net/bug.php?id=74310 - are
definitely not security issues at all. Yes, you can create bad code that
hits some corner case and produces segfault. That shouldn't happen, but
this being complex C code which is 20 years old, it does. This has
absolutely nothing to do with security, and whoever decided to issue CVE
to it and assign 7.5 severity
(https://www.cvedetails.com/cve/CVE-2017-9119/) has done a very sloppy
job. As I said, this is done with zero communication with actual PHP
team as far as I know, which is very sad, but this is the state of
affairs. Some of those are, I am sorry to say, complete baloney, e.g.
https://www.cvedetails.com/cve/CVE-2017-8923/ says:

<>

This is nonsense - unless you run with no memory limit at all (nobody
sane does that) and specifically allow your code not only to accept
infinite untrusted data, but to have it fed specifically to certain
functions arranged into a specific code pattern, no remote attacker can
do it. There are many issues that make similar claims; none of them are
actual security issues.

Some are also assigned to PHP despite them being application issues,
e.g. https://www.cvedetails.com/cve/CVE-2017-9067/. To add insult to
injury, this is a 2017 CVE about a version that was EOLed in 2014. The
data quality seems to be very sad there. I tried to figure out how to
make it better a while ago but pretty much gave up on it because I
couldn't find anybody responsible or at least concerned about this sad
state of things.

OK, this came out super-long and kinda ranty (sorry!), so I will stop
for now, but if you have any questions about it, please feel free to
ping me, and I will be glad to discuss it.
-- 
Stas Malyshev
smalys...@wikimedia.org


Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Kevin Smith
On Tue, Sep 19, 2017 at 11:41 AM, C. Scott Ananian 
wrote:

> Think of this like
> the traditional [[devil's advocate]] process for canonizing a saint.  I'm
> just challenging us to take a hard look at our reasoning and make sure it
> is well-founded technically.
>

I for one appreciate your engagement here, even if it has not been
popular.

It is easy for a group to make a snap decision, or to end up in an echo
chamber, where decisions seem "obvious" to everyone who is participating.
Yes, sometimes there are easy or obvious decisions. But with the stakes so
high on this one, it seems prudent to think it through, enumerate the
arguments, and invite discussion.

If the decision is really so clearcut, the analysis and discussion should
be quick and painless. Hopefully it can also be civil and respectful,
especially toward those arguing the minority view (whether because they
believe it, or because they are just trying to help reach the best possible
outcome).

For what it's worth, I have added this topic to the TechCom internal
meeting agenda for this week. They can decide whether or not to schedule an
IRC meeting, or to take other steps.

Kevin

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Gergo Tisza
On Tue, Sep 19, 2017 at 9:40 AM, C. Scott Ananian 
wrote:

> Fair enough.  My point is just that we should stop and reflect that this is
> a major inflection point.  Language choices are sticky, so this decision
> will have significant long-term implications.  We should at least stop to
> evaluate PHP7 vs Hack and determine which is a better fit for our codebase,
> and do due diligence on both sides (count how many engineers, how many open
> source contributors, commit rates, etc).  HHVM has been flirting with a
> LLVM backend, and LLVM itself has quite a large and active community.  The
> PHP community has had issues with proper handling of security patches in
> the past.  I'm suggesting to proceed cautiously and have a proper
> discussion of all the factors involved instead of over-simplifying this to
> "community" vs "facebook".
>

PHP is an open-source language with mature tooling and major community
buy-in. Facebook has *promised* to turn Hack into an open-source language
with mature tooling and community buy-in; almost none of that exists
currently. Once it already does, a worthwhile discussion might be had about
switching to Hack. Right now it would be incredibly irresponsible.

Also, making PHP a viable language for third parties is the core business
model of Zend. Making Hack a viable language for third parties has
absolutely nothing to do with the business model of Facebook. At any point
they might decide it is a distraction they don't need. Comparing commit
numbers is not really meaningful without knowing what fraction of those
committers can disappear overnight if Facebook reconsiders its priorities.

IMO the more interesting discussion to be had is how little we invest into
the technology our whole platform is based on. You'd think the largest
production user of PHP would pay at least one part-time PHP developer, or
try to represent itself in standards and roadmap discussions, but we do
not. Is that normal?

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread C. Scott Ananian
On Tue, Sep 19, 2017 at 2:41 PM, C. Scott Ananian 
wrote:

> source".  You also mentioned PHP's long history of FLOSS without also
> mentioning their long history at sucking at security.
>

Whoops, I should have toned that down a bit before hitting send.  To be
clear, I'm mostly talking about the ~2007 time frame where there was a lot
of tension between the PHP core team and various folks who wanted to make
PHP more secure in different ways.  I don't actually know what the
present-day status is -- suhosin seems to be still around, but (for
instance) https://sektioneins.de/en/categories/php.html hasn't had any
particular complaints since 2015.

So to be super clear: I'm just pointing out that there used to be issues
here; sometimes the community's interests do not exactly align.  Consider
me in the devil's advocate role again: I'd be interested to hear an
insider's opinion (Stas?) on how security issues are handled these days and
what the future outlook is,
https://www.cvedetails.com/vulnerability-list/vendor_id-74/product_id-128/PHP-PHP.html
doesn't look as nice as
http://www.cvedetails.com/vulnerability-list/vendor_id-7758/product_id-35896/Facebook-Hhvm.html
but
maybe the latter is misleading; older vulnerabilities seem to be at
http://www.cvedetails.com/vulnerability-list/vendor_id-7758/product_id-30684/Facebook-Hiphop-Virtual-Machine.html
for instance.
  --scott

-- 
(http://cscott.net)

Re: [Wikitech-l] Can we drop revision hashes (rev_sha1)?

2017-09-19 Thread Gergo Tisza
On Tue, Sep 19, 2017 at 6:42 AM, Daniel Kinzler  wrote:

> That table will be tall, and the sha1 is the (on average) largest field.
> If we
> are going to use a different mechanism for tracking reverts soon, my hope
> was
> that we can do without it.
>

Can't you just split it into a separate table? Core would only need to
touch it on insert/update, so that should resolve the performance concerns.

Also, since content is supposed to be deduplicated (so two revisions with
the exact same content will have the same content_address), cannot that
replace content_sha1 for revert detection purposes? That wouldn't work over
large periods of time (when the original revision and the revert live in
different kinds of stores) but maybe that's an acceptable compromise.
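
For reference, hash-based revert detection amounts to something like the
following Python sketch: a revision counts as a revert if its content key
matches that of an earlier revision. The key here is a SHA-1 hex digest, but
a deduplicated content address would serve the same role as long as
identical content always maps to the same address. The function names and
the in-memory history are assumptions for illustration, not MediaWiki code,
and the digest encoding is not the one the database stores.

```python
import hashlib


def content_sha1(text):
    """Hex SHA-1 of the revision text (illustrative; not the stored encoding)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def find_reverts(revisions):
    """Map each reverting revision's index to the earliest revision it restores."""
    first_seen = {}
    reverts = {}
    for index, text in enumerate(revisions):
        key = content_sha1(text)
        if key in first_seen:
            reverts[index] = first_seen[key]
        else:
            first_seen[key] = index
    return reverts


history = ["A", "A + vandalism", "A"]
reverts = find_reverts(history)  # {2: 0}
```

Swapping `content_sha1` for a lookup of the content address leaves the logic
unchanged, which is why the caveat above matters: if the original and the
revert live in different stores, their keys will no longer compare equal.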

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread C. Scott Ananian
Chad: thanks for writing up all of those in one place.  I'm not actually
trying to argue *for* Hack here.  My argument is just that we should take
the time to lay out the arguments carefully on both sides, since this is a
decision that is likely to have long-lasting effects.  Think of this like
the traditional [[devil's advocate]] process for canonizing a saint.  I'm
just challenging us to take a hard look at our reasoning and make sure it
is well-founded technically.

In my ideal world we'd take your excellent point-by-point argument and try
to rigorously quantify each, just to satisfy ourselves that we've done due
diligence.  You say "there's not much migration cost moving to PHP7" --
well, it would be nice to assign someone to go through the specifics in a
little more detail to double-check that.  "Various benchmarks show HHVM and
PHP7 performance to be roughly on par" -- firmer numbers would be nice, as
you say.  I think Tim was already going to do this?  "A bunch of major
projects have already dropped HHVM" -- again, can we take the devil's
advocate position and ask the HHVM team who *hasn't* dropped HHVM ;) and
ask for their reasons?  You also raise some points about packaging and
portability that I could probably counter by pointing out that HHVM has an
ARM and PowerPC JIT now (
http://hhvm.com/blog/2017/03/09/how-the-cyber-elephant-got-his-arm.html),
while Zend's JIT-in-progress only supports x86.  We could also poke at the
HHVM folks and ask about their packaging plans, since they (according to
their announcement) are going to "reinvest in open source".  You also
mentioned PHP's long history of FLOSS without also mentioning their long
history of sucking at security.  And we could try to quantify how dependent
PHP is on Zend contributions vs how dependent HHVM is on Facebook
contributions.  We can get numbers on this; we don't have to assume things.

The HHVM announcement specifically mentioned that they will maintain
compatibility with composer and phpunit, so that seems to be a wash.

The references/destructors thing is actually really interesting as a
technical discussion.  The reason given for removing these is all about
performance, in particular the performance hit you get from ref-counting as
opposed to pure GC.  There's been a lot of really good GC work done in the
past two decades; see https://news.ycombinator.com/item?id=11759762
although I like to cite https://www.cs.princeton.edu/~appel/papers/45.ps as
foundational.  It may be a good opportunity to take a hard look at our
Hooks system and figure out if its design is future-proof.
  --scott

On Tue, Sep 19, 2017 at 2:04 PM, Chad  wrote:

> On Tue, Sep 19, 2017 at 10:50 AM C. Scott Ananian 
> wrote:
>
> > Chad: the more you argue that it's not even worth considering, the more
> I'm
> > going to push back and ask for actual numbers and facts. ;)
> >
> >
> As I said in my prior message: it's been considered, and discarded rather
> quickly. It doesn't need much further introspection than that. A couple of
> major points:
>
> * Brian W's point is correct: there's not much migration cost at all moving
> to PHP7. Moving between PHP versions has always been pretty easy for us in
> MW--we write very conservative code as it is.
> * Various (there's lots from lots of people) benchmarks routinely show HHVM
> and PHP7--especially 7.1+--performance to be roughly on par, depending on
> workloads. We can get some firmer numbers here.
> ** A bunch of major projects have already dropped HHVM for PHP7+. Etsy,
> Symfony, WordPress (they no longer test against it)
> * Swapping to HHVM/Hack would abandon many/most of our downstream users --
> the stats Stas pointed to way far back show world install base as well
> under 0.5% of all PHP runtimes.
> * Speaking of downstream users: HHVM has always been a second-class citizen
> on non-Linux OSes. It's always been wonky on OSX (I would know, I was the
> first person to ever get it built), and I don't think anyone's ever got it
> working on Windows outside of Cygwin--please correct me if I'm wrong?
> * It greatly complicates setup and development work for both WMF and
> volunteers.
> ** Did I mention almost nobody has up to date packages for this?
> *  PHP has a long-documented history of working as a proper FLOSS project.
> Facebook's track record here has been less than stellar--it comes in fits
> and starts and there's a lot of "throwing code over the wall"
> ** We have a core PHP contributor (Stas) on this list. Other than
> occasional patches we've shipped upstream to HHVM, I doubt anyone would
> claim themselves a core HHVM contributor around here [maybe Tim early on,
> heh]
> * Choices HHVM is going to make upstream (references, destructors) are a
> /huge/ issue for us. The former is basically a requirement for our Hooks
> system and the latter is used all over the dang place for profiling and
> other fun scope-based tricks.
> * Libraries we use -- composer, 

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Subramanya Sastry

On 09/19/2017 10:34 AM, Bryan Davis wrote:


For what it's worth, my opinion is that PHP is an actual FLOSS
software project with years of history and core contributions from
Zend who make their living with PHP. HHVM is a well funded internal
project from Facebook that has experimented with FLOSS but ultimately
is controlled by the internal needs of Facebook. For me the choice
here is obviously to back the community driven FLOSS project and help
them continue to thrive.


+1.

As a practical choice, given the wide usage of PHP on the web, it makes
a lot more sense to throw our weight behind Zend development, since it is
more likely to survive than HHVM, which is more subject to the whims and
fancies of FB's internal choices, which we are not privy to and cannot
influence in any meaningful way. I mean, how big a user, besides
MediaWiki serving the Wikipedia sites, does one have to be for FB to consider
other users' needs?


Subbu.


Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Stas Malyshev
Hi!

> Incidentally, how much work has been done on incorporating HHVM's
> improvements back into Zend?

Depends on which ones you're talking about. Syntax ones may or may not
find their way into PHP, but performance ones would probably be completely
different from HHVM - i.e. the resulting performance may or may not be
on par or better, but reusing most of the performance work on HHVM in
PHP would not be possible due to completely different engine internals.

So pretty much all that can be taken from HHVM into PHP would be "this
syntax looks like a good idea, let's reimplement it".
-- 
Stas Malyshev
smalys...@wikimedia.org


Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Paladox
Zend was testing against HHVM at one point, then decided they would stop
testing against it. See
https://www.phoronix.com/scan.php?page=news_item&px=PTS-PHP-7.1-Benchmarks

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Stas Malyshev
Hi!

> I agree the short timeline seems to push us toward reverting to Zend.  But
> it is worth having a meaningful discussion about the long-term outlook.
> Which VM is likely to be better supported in 15 years' time?  Which VM

15 years is a lot to predict. 15 years ago Facebook, Twitter, Reddit and
Linkedin did not exist and Slashdot, Livejournal, etc. were all the
rage. We don't even know if Facebook as such would exist in 15 years or
would have budget to support its own language.

> would we rather adopt and maintain indefinitely ourselves, if needed --
> since in a 15 yr timeframe it's entirely possible that (a) Facebook could
> abandon Hack/HHVM, or (b) the PHP Zend team could implode.  Maintaining

While (b) could happen, the PHP project is not very dependent on Zend for
its existence. Zend owns none of the infrastructure or processes, and
while a lot of performance work on PHP 7 was conducted by the Zend team (and
they are still working on improvements AFAIK), there are plenty of
community members who do not work for Zend and do not depend on Zend in
any way. Of course, it is possible that the whole community would
implode, but here we have many more stakeholders than in the Hack case,
where the stakeholder is mostly a single - albeit large and currently
very successful - company.

> speaking, it's not really a choice between "lock-in" and "no lock in" -- we
> have to choose to align our futures with either Zend Technologies Ltd or
> Facebook.  One of these is *much* better funded than the other.  It is

Again, I do not think this is the right statement to make. The control
of Zend Tech as a company over the future of PHP is much less than
Facebook's over Hack (which is pretty much absolute). PHP is guided
by the community, decisions are taken by using community processes in
which Zend does not have any special role, and the PHP project could
reasonably survive without Zend, even if with fewer resources. Most PHP
infrastructure - Composer, debugging, IDEs, profiling, code quality,
frameworks, etc. - is completely independent of Zend (which also has a
number of tools, but it is not the only provider). So I do not think it
is an adequate comparison.

I am not sure if Hack has an open-source community outside Facebook (if
anybody has pointers to that, please share - commit numbers certainly
don't tell much) - but it is pretty clear to me that Facebook is in
absolute control over this platform. This is not the case with Zend and PHP.

-- 
Stas Malyshev
smalys...@wikimedia.org


Re: [Wikitech-l] Can we drop revision hashes (rev_sha1)?

2017-09-19 Thread John Erling Blad
There are two important use cases: one where you want to identify previous
reverts, and one where you want to identify close matches. There are other
ways to do the first than to use a digest, but the digest opens the door to
alternative client-side algorithms. The latter would typically be done with
some form of locality-sensitive hashing. In both cases you don't want to
download the content of each revision; that is exactly why you want to use
some kind of hash. If the hashes could be requested somehow, perhaps as part
of the API, then that should be sufficient. Those hashes could be part of the
XML dump too, but if you have the XML dump and know the algorithm, then you
don't need the digest.
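To make the "close matches" case concrete, here is a rough sketch using
MinHash signatures, the building block of one common locality-sensitive
hashing scheme. Nothing here is an existing MediaWiki feature: the function
names, the 3-word shingle size, and the 64-hash signature length are all
assumptions chosen for illustration. A client would compare short signatures
instead of downloading full revision text:

```python
import hashlib
from typing import List, Set


def shingles(text: str, k: int = 3) -> Set[str]:
    """Return the set of k-word shingles of a text."""
    words = text.split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(text: str, num_hashes: int = 64) -> List[int]:
    """Keep one minimum per seeded hash function over the text's shingles."""
    items = shingles(text)
    signature = []
    for seed in range(num_hashes):
        salt = str(seed).encode("utf-8") + b":"
        signature.append(min(
            int.from_bytes(hashlib.sha1(salt + s.encode("utf-8")).digest()[:8], "big")
            for s in items
        ))
    return signature


def estimated_similarity(sig_a: List[int], sig_b: List[int]) -> float:
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)
```

Identical texts give a similarity of 1.0; texts with no shingles in common
give a value at or near 0. Where to draw the "close match" threshold is a
tuning decision this sketch does not settle.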

There is a specific use case where someone wants to verify the content. In
that case you don't want to identify a previous revert; you want to check
whether someone has tampered with the downloaded content. As you don't know
who might have tampered with the content, you should also question the
digest delivered by WMF, so the digest in the database isn't good enough
as it is right now. Instead of a SHA digest, each revision should be
properly signed, but then if you can't trust WMF, can you trust their
signature? Signatures for revisions should probably be delivered by some
external entity and not WMF itself.

On Fri, Sep 15, 2017 at 11:44 PM, Daniel Kinzler <
daniel.kinz...@wikimedia.de> wrote:

> A revert restores a previous revision. It covers all slots.
>
> The fact that reverts, watching, protecting, etc still works per page,
> while you
> can have multiple kinds of different content on the page, is indeed the
> point of
> MCR.
>
> Am 15.09.2017 um 22:23 schrieb C. Scott Ananian:
> > Alternatively, perhaps "hash" could be an optional part of an MCR chunk?
> > We could keep it for the wikitext, but drop the hash for the metadata,
> and
> > drop any support for a "combined" hash over wikitext + all-other-pieces.
> >
> > ...which begs the question about how reverts work in MCR.  Is it just the
> > wikitext which is reverted, or do categories and other metadata revert as
> > well?  And perhaps we can just mark these at revert time instead of
> trying
> > to reconstruct it after the fact?
> >  --scott
> >
> > On Fri, Sep 15, 2017 at 4:13 PM, Stas Malyshev 
> > wrote:
> >
> >> Hi!
> >>
> >> On 9/15/17 1:06 PM, Andrew Otto wrote:
> >>>> As a random idea - would it be possible to calculate the hashes
> >>> when data is transitioned from SQL to Hadoop storage?
> >>>
> >>> We take monthly snapshots of the entire history, so every month we’d
> >>> have to pull the content of every revision ever made :o
> >>
> >> Why? If you've already seen that revision in a previous snapshot, you'd
> >> already have its hash? Admittedly, I have no idea how the process works,
> >> so I am just talking out of general knowledge and may miss some things.
> >> Also of course you already have hashes from revs till this day and up to
> >> the day we decide to turn the hash off. Starting that day, it'd have to
> >> be generated, but I see no reason to generate one more than once?
> >> --
> >> Stas Malyshev
> >> smalys...@wikimedia.org
> >>
> >>
> >
> >
> >
>
>
> --
> Daniel Kinzler
> Principal Platform Engineer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.
>
>

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Chad
On Tue, Sep 19, 2017 at 10:50 AM C. Scott Ananian 
wrote:

> Chad: the more you argue that it's not even worth considering, the more I'm
> going to push back and ask for actual numbers and facts. ;)
>
>
As I said in my prior message: it's been considered, and discarded rather
quickly. It doesn't need much further introspection than that. A couple of
major points:

* Brian W's point is correct: there's not much migration cost at all moving
to PHP7. Moving between PHP versions has always been pretty easy for us in
MW--we write very conservative code as it is.
* Various (there's lots from lots of people) benchmarks routinely show HHVM
and PHP7--especially 7.1+--performance to be roughly on par, depending on
workloads. We can get some firmer numbers here.
** A bunch of major projects have already dropped HHVM for PHP7+. Etsy,
Symfony, WordPress (they no longer test against it)
* Swapping to HHVM/Hack would abandon many/most of our downstream users --
the stats Stas pointed to way far back show world install base as well
under 0.5% of all PHP runtimes.
* Speaking of downstream users: HHVM has always been a second-class citizen
on non-Linux OSes. It's always been wonky on OSX (I would know, I was the
first person to ever get it built), and I don't think anyone's ever got it
working on Windows outside of Cygwin--please correct me if I'm wrong?
* It greatly complicates setup and development work for both WMF and
volunteers.
** Did I mention almost nobody has up to date packages for this?
*  PHP has a long-documented history of working as a proper FLOSS project.
Facebook's track record here has been less than stellar--it comes in fits
and starts and there's a lot of "throwing code over the wall"
** We have a core PHP contributor (Stas) on this list. Other than
occasional patches we've shipped upstream to HHVM, I doubt anyone would
claim themselves a core HHVM contributor around here [maybe Tim early on,
heh]
* Choices HHVM is going to make upstream (references, destructors) are a
/huge/ issue for us. The former is basically a requirement for our Hooks
system and the latter is used all over the dang place for profiling and
other fun scope-based tricks.
* Libraries we use -- composer, phpunit, etc. -- cannot be depended on to
support Hack in the long term, and there's no tooling in the HHVM world for
this (promises notwithstanding).

-Chad

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Legoktm
Hi,

On 09/19/2017 10:50 AM, C. Scott Ananian wrote:
> Chad: the more you argue that it's not even worth considering, the more I'm
> going to push back and ask for actual numbers and facts. ;)

Migrating to Hack simply isn't feasible for most users of MediaWiki.
Most distros don't have packages for HHVM, and it's not straightforward
to build AIUI.

Once you've figured that out, then you're going to quickly realize that
you need a cron job to regularly restart HHVM because it'll
OOM/crash/etc. So either you have downtime or multiple HHVM processes.

On the MediaWiki development side, we'd be forced to fork all of the
external libraries we depend upon to support Hack. This includes basic
tooling like composer, phpunit, codesniffer, and so on.

There are probably more reasons that could be listed, but I think you
get the idea of why Chad considers it to be a non-starter.

-- Legoktm


Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Legoktm
Hi,

On 09/19/2017 10:27 AM, C. Scott Ananian wrote:
> To be super clear: MediaWiki is in *PHP5*.  The choices are:

Well, kind of. MediaWiki has full support for PHP 7 while still
maintaining PHP 5 support.

> 1) MediaWiki will always be in PHP5.
> 2) MediaWiki will eventually migrate to PHP7, or
> 3) MediaWiki will eventually migrate to Hack.
> 
> Is anyone arguing for #1?

I don't think anyone has argued for anything besides #2.

> So we've got two backwards-incompatible choices to make, eventually.

There's already an open RfC[1] about bumping the minimum PHP requirement
to 7.x.

[1] https://phabricator.wikimedia.org/T172165

-- Legoktm


Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread C. Scott Ananian
Chad: the more you argue that it's not even worth considering, the more I'm
going to push back and ask for actual numbers and facts. ;)

Alex: I admit 8 contributors is not significant, but 24,192 commits vs
104,566 commits is. The conclusion I was suggesting was "same number of
contributors over a much shorter time frame".

Re: github mirror broken: note that http://php.net/build-setup.php mentions
*only* the github mirror.  And the github mirror has been broken for almost
two months now.  There's no mention of the "real" repo.  That merits an
exclamation mark, I think.
  --scott

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Brian Wolff
On Tuesday, September 19, 2017, C. Scott Ananian 
wrote:
> On Tue, Sep 19, 2017 at 1:17 PM, Chad  wrote:
>
>> be clear: we never chose HHVM for Hack. We don't use Hack. The one
>> experiment I had at trying Hack never panned out. MediaWiki is in PHP, not
>> Hack.
>>
>
> To be super clear: MediaWiki is in *PHP5*.  The choices are:
>
> 1) MediaWiki will always be in PHP5.
> 2) MediaWiki will eventually migrate to PHP7, or
> 3) MediaWiki will eventually migrate to Hack.
>
> Is anyone arguing for #1?
>
> So we've got two backwards-incompatible choices to make, eventually.
>  --scott
>
> --
> (http://cscott.net)
>

No, MediaWiki is written in a subset of PHP 5 that is forward-compatible
with PHP 7. The backwards-incompatible changes are by and large things no
sane code would rely on
(http://php.net/manual/en/migration70.incompatible.php).
There are people who are using MediaWiki with PHP 7 right now without
complaint. Presumably we will eventually move to requiring PHP 7, just as
we once moved from PHP 4 to PHP 5, but that's really just a version
requirement increment.


Sure, there may be backwards-incompatible things that come up, but that's
nothing new. E.g. we used to have a class named Namespace. Then PHP 5.3
came along and made "namespace" a reserved word, so it's now MWNamespace.
All signs point to PHP 7 incompatibilities being of this form.
--
Brian

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Alex Monk
On 19 Sep 2017 5:40 pm, "C. Scott Ananian"  wrote:

I'm suggesting to proceed cautiously and have a proper
discussion of all the factors involved instead of over-simplifying this to
"community" vs "facebook".

For example, the top-line github stats are:
hhvm: 504 contributors (24,192 commits)
php-src: 496 contributors (104,566 commits)

HHVM seems to have a larger community of contributors despite a much
shorter active life.

By a difference of 8 contributors?

But note that the PHP github mirror has been broken since Jul 29 (!).

I'm not convinced an exclamation mark in brackets is required here.

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Chad
On Tue, Sep 19, 2017 at 10:28 AM C. Scott Ananian 
wrote:

> On Tue, Sep 19, 2017 at 1:17 PM, Chad  wrote:
>
> > be clear: we never chose HHVM for Hack. We don't use Hack. The one
> > experiment I had at trying Hack never panned out. MediaWiki is in PHP,
> not
> > Hack.
> >
>
> To be super clear: MediaWiki is in *PHP5*.  The choices are:
>
> 1) MediaWiki will always be in PHP5.
> 2) MediaWiki will eventually migrate to PHP7, or
> 3) MediaWiki will eventually migrate to Hack.
>
> Is anyone arguing for #1?
>
> So we've got two backwards-incompatible choices to make, eventually.
>

I see everyone saying #1 for now and #2 down the road. You're the only one
here who seems to think #3 is even worth considering. The rest of us
have--rightly--dismissed it already.

-Chad

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread C. Scott Ananian
On Tue, Sep 19, 2017 at 1:17 PM, Chad  wrote:

> be clear: we never chose HHVM for Hack. We don't use Hack. The one
> experiment I had at trying Hack never panned out. MediaWiki is in PHP, not
> Hack.
>

To be super clear: MediaWiki is in *PHP5*.  The choices are:

1) MediaWiki will always be in PHP5.
2) MediaWiki will eventually migrate to PHP7, or
3) MediaWiki will eventually migrate to Hack.

Is anyone arguing for #1?

So we've got two backwards-incompatible choices to make, eventually.
 --scott

-- 
(http://cscott.net)

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Chad
On Tue, Sep 19, 2017 at 9:40 AM C. Scott Ananian 
wrote:

> Fair enough.  My point is just that we should stop and reflect that this is
> a major inflection point.  Language choices are sticky, so this decision
> will have significant long-term implications.  We should at least stop to
> evaluate PHP7 vs Hack and determine which is a better fit for our codebase,
> and do due diligence on both sides (count how many engineers, how many open
> source contributors, commit rates, etc).  HHVM has been flirting with a
> LLVM backend, and LLVM itself has quite a large and active community.  The
> PHP community has had issues with proper handling of security patches in
> the past.  I'm suggesting to proceed cautiously and have a proper
> discussion of all the factors involved instead of over-simplifying this to
> "community" vs "facebook".
>
>
I'm not trying to simplify this into community vs. facebook. And let's also
be clear: we never chose HHVM for Hack. We don't use Hack. The one
experiment I had at trying Hack never panned out. MediaWiki is in PHP, not
Hack.

The *only* reason we're having a language discussion is because HHVM has
announced that they're abandoning PHP in favor of Hack. If someone had come
to the list last week and said "Hey let's abandon PHP for XYZLang" they
would've been rightly laughed off.

The debate here is between runtimes for PHP, and on the long enough
timescale there's only one option. PHP has a long-standing history of being
a viable runtime. HHVM does not.

I don't see this as an A/B choice at all, but rather a clear path forward.
So sure: let's have an RfC/TechComm meeting to work out the details, but
let's not pretend that option #2 is even remotely viable. It is not.

-Chad

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread C. Scott Ananian
On Tue, Sep 19, 2017 at 11:34 AM, Bryan Davis  wrote:

> On Tue, Sep 19, 2017 at 9:21 AM, C. Scott Ananian
>  wrote:
> > There are other big users of HHVM -- do we know what other members of the
> > larger community are doing?  We've heard that Phabricator intends to
> follow
> > PHP 7.  Etsy also shifted to HHVM, do we know what their plans are?
>
> Etsy 'experimented' with HHVM [0] and then eventually switched to PHP7
> as their primary runtime. The blog posts about this are a little
> scattered, but Rasmus spoke about it [1] and Etsy started the phan
> project [2].
>

I got confirmation on twitter:
https://twitter.com/jazzdan/status/910162545805336576


> For what it's worth, my opinion is that PHP is an actual FLOSS
> software project with years of history and core contributions from
> Zend who make their living with PHP. HHVM is a well funded internal
> project from Facebook that has experimented with FLOSS but ultimately
> is controlled by the internal needs of Facebook. For me the choice
> here is obviously to back the community driven FLOSS project and help
> them continue to thrive.
>

Fair enough.  My point is just that we should stop and reflect that this is
a major inflection point.  Language choices are sticky, so this decision
will have significant long-term implications.  We should at least stop to
evaluate PHP7 vs Hack and determine which is a better fit for our codebase,
and do due diligence on both sides (count how many engineers, how many open
source contributors, commit rates, etc).  HHVM has been flirting with a
LLVM backend, and LLVM itself has quite a large and active community.  The
PHP community has had issues with proper handling of security patches in
the past.  I'm suggesting to proceed cautiously and have a proper
discussion of all the factors involved instead of over-simplifying this to
"community" vs "facebook".

For example, the top-line github stats are:
hhvm: 504 contributors (24,192 commits)
php-src: 496 contributors (104,566 commits)

HHVM seems to have a larger community of contributors despite a much
shorter active life.  But note that the PHP github mirror has been broken
since Jul 29 (!).  In the past 6 days I count 8 distinct contributors to
php-src, and 10 distinct contributors in the past *two days* to hhvm (one
of whom contributed an OCaml frontend(!)).  These are just hand-wavy
figures; ideally we should try to determine how many of the recent
contributors to each project are employed by Facebook and/or Zend.

I think there's room for a reasonable debate.
 --scott

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Victoria Coleman

> On Sep 19, 2017, at 6:56 AM, Gilles Dubuc  wrote:
> 
> Should we have a TechComm-driven meeting about this ASAP?
> 
> Like others, I don't expect that there will be disagreement about the way
> to go, but there is a lot to discuss about what needs to be done,
> resourcing, etc.
> 
> It would be nice to have Ori around for it, to pick his brains about any
> undocumented or little-known knowledge about the HHVM migration that could
> bite us when migrating to PHP 7.x if we don't know about it.
> 
> On Tue, Sep 19, 2017 at 9:07 AM, Moritz Muehlenhoff <
> mmuhlenh...@wikimedia.org> wrote:
> 
>> On Tue, Sep 19, 2017 at 10:13:47AM +1000, Tim Starling wrote:
>>> On 19/09/17 06:58, Max Semenik wrote:
>>>> Today, the HHVM developers made an announcement[1] that they have plans of
>>>> ceasing to maintain 100% PHP7 compatibility and concentrating on Hack
>>>> instead.
>>> 
>>> The HHVM team did tell us privately that they were planning on
>>> changing their strategy, basically as you describe it above. The
>>> surprising things for me in this announcement were:
>>> 
>>> * The plan to also drop PHP 5 compatibility, on a short timeline (1
>> year).
>>> * Rather than "drifting away" from PHP, their top priority plans
>>> include removing core language features like references and destructors.
>>> 
>>>> While this does not mean that we need to take an action immediately,
>>>> eventually we will have to decide something.
>>> 
>>> Actually, I think a year is a pretty short time for ops to switch to
>>> PHP 7. I think we need to decide on this pretty much immediately.
>> 
>> The next step would be the upgrade of the mw* fleet to Debian stretch
>> while still using HHVM 3.18 (to minimise disruption since we've stabilised
>> 3.18 in its current build). That work is tracked at T174431. 3.18 will
>> be supported by upstream for at least another six months (and if the
>> migration drags
>> further I can roll custom 3.18 security backports from later LTS releases)
>> 
>> Debian stretch ships PHP7, so that'd be a good stepping stone to migrate
>> back to Zend.
>> 
>> Cheers,
>>Moritz
>> 


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread C. Scott Ananian
On Tue, Sep 19, 2017 at 11:33 AM, bawolff  wrote:

> I disagree. I don't think it's useful or possible to try to forecast
> technical trends 15 years out. 15 years from now, it is just as likely
> that Facebook will be as relevant as MySpace is today, as it is that
> Facebook will go full cyberpunk dystopia on us and rule the world. I
> don't think we can realistically predict anything 15 years out.
>
>
That's why I argued we should consider the code quality of both, and
determine which we'd feel most comfortable supporting on our own in the
case our crystal balls are wrong and we end up holding the bag for the
runtime.  (This is just one factor to consider; my argument is that we
should carefully enumerate the various factors before making an evaluation,
not just dismiss any of the options out of hand.)
 --scott

-- 
(http://cscott.net)
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Bryan Davis
On Tue, Sep 19, 2017 at 9:21 AM, C. Scott Ananian
 wrote:
> There are other big users of HHVM -- do we know what other members of the
> larger community are doing?  We've heard that Phabricator intends to follow
> PHP 7.  Etsy also shifted to HHVM, do we know what their plans are?

Etsy 'experimented' with HHVM [0] and then eventually switched to PHP7
as their primary runtime. The blog posts about this are a little
scattered, but Rasmus spoke about it [1] and Etsy started the phan
project [2].

For what it's worth, my opinion is that PHP is an actual FLOSS
software project with years of history and core contributions from
Zend who make their living with PHP. HHVM is a well funded internal
project from Facebook that has experimented with FLOSS but ultimately
is controlled by the internal needs of Facebook. For me the choice
here is obviously to back the community driven FLOSS project and help
them continue to thrive.

[0]: https://codeascraft.com/2015/04/06/experimenting-with-hhvm-at-etsy/
[1]: https://codeascraft.com/speakers/rasmus-lerdorf-deploying-php-7/
[2]: https://github.com/phan/phan

Bryan
-- 
Bryan Davis  Wikimedia Foundation
[[m:User:BDavis_(WMF)]] Manager, Cloud Services  Boise, ID USA
irc: bd808  v:415.839.6885 x6855

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Cormac Parle
*Continue* to have the best performance? A quick search for benchmarks suggests 
that 7.1 outperforms HHVM more often than not.  

Betting our future on Facebook because they have the most money right now seems 
unwise. Which of the two has the larger developer community?


> On 19 Sep 2017, at 16:21, C. Scott Ananian  wrote:
> 
> On Mon, Sep 18, 2017 at 9:51 PM, Chad  wrote:
> 
>> I see zero reason for us to go through all the formalities, unless we want
>> to really. I have yet to see anyone (on list, or on IRC anywhere at all
>> today) where anyone suggested (2) was a good idea at all. It's a
>> horrifically bad idea.
>> 
> 
> Technically, I did outline the arguments for (2), earlier on this thread.
> It was a bit allegorical, though:
> https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2018/Writing_Tips/Examples#Example_of_a_.22medium_level.22_position_statement
> This should be seriously considered, not just dismissed.
> 
> I agree the short timeline seems to push us toward reverting to Zend.  But
> it is worth having a meaningful discussion about the long-term outlook.
> Which VM is likely to be better supported in 15 years' time?  Which VM
> would we rather adopt and maintain indefinitely ourselves, if needed --
> since in a 15 yr timeframe it's entirely possible that (a) Facebook could
> abandon Hack/HHVM, or (b) the PHP Zend team could implode.  Maintaining
> control over our core runtime is important; I think we should at least
> discuss long-term contingencies if either goes down. Obviously, our future
> was most stable when we (briefly!) had a choice between two strong
> runtimes... but that opportunity seems to be vanishing.  Practically
> speaking, it's not really a choice between "lock-in" and "no lock in" -- we
> have to choose to align our futures with either Zend Technologies Ltd or
> Facebook.  One of these is *much* better funded than the other.  It is
> likely that the project with the most funding will continue to have the
> better performance.
> 
> There are other big users of HHVM -- do we know what other members of the
> larger community are doing?  We've heard that Phabricator intends to follow
> PHP 7.  Etsy also shifted to HHVM, do we know what their plans are?
>  --scott
> 
> -- 
> (http://cscott.net)


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread bawolff
On Tue, Sep 19, 2017 at 3:21 PM, C. Scott Ananian
 wrote:
> On Mon, Sep 18, 2017 at 9:51 PM, Chad  wrote:
>
>> I see zero reason for us to go through all the formalities, unless we want
>> to really. I have yet to see anyone (on list, or on IRC anywhere at all
>> today) where anyone suggested (2) was a good idea at all. It's a
>> horrifically bad idea.
>>
>
> Technically, I did outline the arguments for (2), earlier on this thread.
> It was a bit allegorical, though:
> https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2018/Writing_Tips/Examples#Example_of_a_.22medium_level.22_position_statement
> This should be seriously considered, not just dismissed.
>
> I agree the short timeline seems to push us toward reverting to Zend.  But
> it is worth having a meaningful discussion about the long-term outlook.
> Which VM is likely to be better supported in 15 years' time?  Which VM
> would we rather adopt and maintain indefinitely ourselves, if needed --
> since in a 15 yr timeframe it's entirely possible that (a) Facebook could
> abandon Hack/HHVM, or (b) the PHP Zend team could implode.  Maintaining
> control over our core runtime is important; I think we should at least
> discuss long-term contingencies if either goes down. Obviously, our future
> was most stable when we (briefly!) had a choice between two strong
> runtimes... but that opportunity seems to be vanishing.  Practically
> speaking, it's not really a choice between "lock-in" and "no lock in" -- we
> have to choose to align our futures with either Zend Technologies Ltd or
> Facebook.  One of these is *much* better funded than the other.  It is
> likely that the project with the most funding will continue to have the
> better performance.
>
> There are other big users of HHVM -- do we know what other members of the
> larger community are doing?  We've heard that Phabricator intends to follow
> PHP 7.  Etsy also shifted to HHVM, do we know what their plans are?
>   --scott
>
> --
> (http://cscott.net)

I disagree. I don't think it's useful or possible to try to forecast
technical trends 15 years out. 15 years from now, it is just as likely
that Facebook will be as relevant as MySpace is today, as it is that
Facebook will go full cyberpunk dystopia on us and rule the world. I
don't think we can realistically predict anything 15 years out.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread James Hare

> On Sep 19, 2017, at 8:21 AM, C. Scott Ananian  wrote:
> 
>> On Mon, Sep 18, 2017 at 9:51 PM, Chad  wrote:
>> 
>> I see zero reason for us to go through all the formalities, unless we want
>> to really. I have yet to see anyone (on list, or on IRC anywhere at all
>> today) where anyone suggested (2) was a good idea at all. It's a
>> horrifically bad idea.
>> 
> 
> Technically, I did outline the arguments for (2), earlier on this thread.
> It was a bit allegorical, though:
> https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2018/Writing_Tips/Examples#Example_of_a_.22medium_level.22_position_statement
> This should be seriously considered, not just dismissed.
> 
> I agree the short timeline seems to push us toward reverting to Zend.  But
> it is worth having a meaningful discussion about the long-term outlook.
> Which VM is likely to be better supported in 15 years' time?  Which VM
> would we rather adopt and maintain indefinitely ourselves, if needed --
> since in a 15 yr timeframe it's entirely possible that (a) Facebook could
> abandon Hack/HHVM, or (b) the PHP Zend team could implode.  Maintaining
> control over our core runtime is important; I think we should at least
> discuss long-term contingencies if either goes down. Obviously, our future
> was most stable when we (briefly!) had a choice between two strong
> runtimes... but that opportunity seems to be vanishing.  Practically
> speaking, it's not really a choice between "lock-in" and "no lock in" -- we
> have to choose to align our futures with either Zend Technologies Ltd or
> Facebook.  One of these is *much* better funded than the other.  It is
> likely that the project with the most funding will continue to have the
> better performance.
> 
> There are other big users of HHVM -- do we know what other members of the
> larger community are doing?  We've heard that Phabricator intends to follow
> PHP 7.  Etsy also shifted to HHVM, do we know what their plans are?
>  --scott

This is what I want to know as well: who is the community of HHVM users outside 
of us? What are their own internal reactions? Keep in mind that businesses like 
Etsy don't have to support third party downstream users as we do; "I guess 
we'll use Hack now" is a more plausible response than it would be for us.

But if others are indeed like us and are balking at the plan to drop PHP, it 
would be worth looking into how we can pool resources to "save" HHVM, or create 
a runtime that maintains the performance improvements of HHVM while maintaining 
PHP support.

Incidentally, how much work has been done on incorporating HHVM's improvements 
back into Zend?

> 
> -- 
> (http://cscott.net)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread C. Scott Ananian
On Mon, Sep 18, 2017 at 9:51 PM, Chad  wrote:

> I see zero reason for us to go through all the formalities, unless we want
> to really. I have yet to see anyone (on list, or on IRC anywhere at all
> today) where anyone suggested (2) was a good idea at all. It's a
> horrifically bad idea.
>

Technically, I did outline the arguments for (2), earlier on this thread.
It was a bit allegorical, though:
https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2018/Writing_Tips/Examples#Example_of_a_.22medium_level.22_position_statement
This should be seriously considered, not just dismissed.

I agree the short timeline seems to push us toward reverting to Zend.  But
it is worth having a meaningful discussion about the long-term outlook.
Which VM is likely to be better supported in 15 years' time?  Which VM
would we rather adopt and maintain indefinitely ourselves, if needed --
since in a 15 yr timeframe it's entirely possible that (a) Facebook could
abandon Hack/HHVM, or (b) the PHP Zend team could implode.  Maintaining
control over our core runtime is important; I think we should at least
discuss long-term contingencies if either goes down. Obviously, our future
was most stable when we (briefly!) had a choice between two strong
runtimes... but that opportunity seems to be vanishing.  Practically
speaking, it's not really a choice between "lock-in" and "no lock in" -- we
have to choose to align our futures with either Zend Technologies Ltd or
Facebook.  One of these is *much* better funded than the other.  It is
likely that the project with the most funding will continue to have the
better performance.

There are other big users of HHVM -- do we know what other members of the
larger community are doing?  We've heard that Phabricator intends to follow
PHP 7.  Etsy also shifted to HHVM, do we know what their plans are?
  --scott

-- 
(http://cscott.net)
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Can we drop revision hashes (rev_sha1)?

2017-09-19 Thread Daniel Kinzler
Am 19.09.2017 um 10:15 schrieb Jaime Crespo:
> I am not a mediawiki developer, but shouldn't sha1 be moved instead of
> deleted/not deleted? Moved to the content table- so it is kept
> unaltered.
The background of my original mail is indeed the question of whether we need the
sha1 field in the content table. The current draft of the DB schema includes
it.

That table will be tall, and the sha1 is (on average) the largest field. If we
are going to use a different mechanism for tracking reverts soon, my hope was
that we can do without it.

In any case, my impression is that if we want to keep using hashes to detect
reverts, we need to keep rev_sha1 - and to maintain it, we ALSO need
content_sha1.
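
For a sense of the field size being discussed, here is a minimal Python sketch
of how such a hash value can be produced. It assumes the stored form is the
SHA-1 of the text written in base 36 and left-padded to 31 characters, which
is my understanding of the existing rev_sha1 column; please check that against
the actual schema. The function name sha1_base36 is made up for illustration.

```python
import hashlib

BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def sha1_base36(text: str, width: int = 31) -> str:
    """Return the SHA-1 of `text` as a zero-padded base-36 string.

    Assumption: 31 base-36 digits are enough for a 160-bit digest,
    since 36**31 is larger than 2**160.
    """
    value = int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")
    digits = []
    while value:
        value, remainder = divmod(value, 36)
        digits.append(BASE36_DIGITS[remainder])
    # Digits were collected least-significant first, so reverse them.
    return "".join(reversed(digits)).rjust(width, "0")


if __name__ == "__main__":
    print(sha1_base36("Hello, wiki"))
```

Under that assumption every row carries a fixed 31-character string, whether
the hash lives on the revision or on the content row.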

-- 
Daniel Kinzler
Principal Platform Engineer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Is it possible to edit scribunto data module content through template edit popup of visual editor?

2017-09-19 Thread Brad Jorsch (Anomie)
On Tue, Sep 19, 2017 at 2:48 AM, mathieu stumpf guntz <
psychosl...@culture-libre.org> wrote:

> But having the ability to write a limited amount of bytes in a single data
> module per script call, and possibly other safeguard limits, wouldn't be
> that risky, would it?
>

It would break T67258. I also think it's probably a very bad idea to be
trying to have page parses make edits to the wiki.


> If it's not, please provide me some feed back on the proposal to add such
> a function, and if I should document such a proposal elsewhere, please let
> me know.
>

You're free to file a task in Phabricator, but it will be closed as
Declined. There are too many potential issues there for far too little
benefit.

Your initial idea of somehow hooking into the editor (whether that's the
wikitext editor or VE) with JavaScript to allow humans to make edits to the
data module while editing another page was much better.


-- 
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Victoria Coleman
Sounds like a good idea. +Daniel for scheduling.

Best,

Victoria


> On Sep 19, 2017, at 6:56 AM, Gilles Dubuc  wrote:
> 
> Should we have a TechComm-driven meeting about this ASAP?
> 
> Like others, I don't expect that there will be disagreement about the way
> to go, but there is a lot to discuss about what needs to be done,
> resourcing, etc.
> 
> It would be nice to have Ori around for it, to pick his brains about any
> undocumented or little-known knowledge about the HHVM migration that could
> bite us when migrating to PHP 7.x if we don't know about it.
> 
> On Tue, Sep 19, 2017 at 9:07 AM, Moritz Muehlenhoff <
> mmuhlenh...@wikimedia.org> wrote:
> 
>> On Tue, Sep 19, 2017 at 10:13:47AM +1000, Tim Starling wrote:
>>> On 19/09/17 06:58, Max Semenik wrote:
>>>> Today, the HHVM developers made an announcement[1] that they have plans of
>>>> ceasing to maintain 100% PHP7 compatibility and concentrating on Hack
>>>> instead.
>>> 
>>> The HHVM team did tell us privately that they were planning on
>>> changing their strategy, basically as you describe it above. The
>>> surprising things for me in this announcement were:
>>> 
>>> * The plan to also drop PHP 5 compatibility, on a short timeline (1 year).
>>> * Rather than "drifting away" from PHP, their top priority plans
>>> include removing core language features like references and destructors.
>>> 
>>>> While this does not mean that we need to take an action immediately,
>>>> eventually we will have to decide something.
>>> 
>>> Actually, I think a year is a pretty short time for ops to switch to
>>> PHP 7. I think we need to decide on this pretty much immediately.
>> 
>> The next step would be the upgrade of the mw* fleet to Debian stretch
>> while still using HHVM 3.18 (to minimise disruption since we've stabilised
>> 3.18 in its current build). That work is tracked at T174431. 3.18 will
>> be supported by upstream for at least another six months (and if the
>> migration drags further I can roll custom 3.18 security backports from
>> later LTS releases).
>> 
>> Debian stretch ships PHP7, so that'd be a good stepping stone to migrate
>> back to Zend.
>> 
>> Cheers,
>>Moritz
>> 


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Tomorrow: Weekly Technical Advice IRC Meeting

2017-09-19 Thread Michael Schönitzer
Sorry for cross-posting!

Reminder: Technical Advice IRC meeting again **tomorrow 3-4 pm UTC** on
#wikimedia-tech.

The Technical Advice IRC meeting is open for all volunteer developers,
topics and questions. This can be anything from "how to get started" over
"who would be the best contact for X" to specific questions on your project.

If you know already what you would like to discuss or ask, please add your
topic to the next meeting:
https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting

This meeting is an offer by WMDE’s tech team. Hosts of this meeting are:
@addshore & @Tobi_WMDE_SW.


Hope to see you there!
Michi (for WMDE’s tech team)

-- 
Michael F. Schönitzer



Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Tel. (030) 219 158 26-0
http://wikimedia.de

Imagine a world in which every single human being can freely share in the
sum of all knowledge. Help us make it happen!
http://spenden.wikimedia.de/

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognised as a non-profit by the
Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Gilles Dubuc
Should we have a TechComm-driven meeting about this ASAP?

Like others, I don't expect that there will be disagreement about the way
to go, but there is a lot to discuss about what needs to be done,
resourcing, etc.

It would be nice to have Ori around for it, to pick his brains about any
undocumented or little-known knowledge about the HHVM migration that could
bite us when migrating to PHP 7.x if we don't know about it.

On Tue, Sep 19, 2017 at 9:07 AM, Moritz Muehlenhoff <
mmuhlenh...@wikimedia.org> wrote:

> On Tue, Sep 19, 2017 at 10:13:47AM +1000, Tim Starling wrote:
> > On 19/09/17 06:58, Max Semenik wrote:
> > > Today, the HHVM developers made an announcement[1] that they have plans of
> > > ceasing to maintain 100% PHP7 compatibility and concentrating on Hack
> > > instead.
> >
> > The HHVM team did tell us privately that they were planning on
> > changing their strategy, basically as you describe it above. The
> > surprising things for me in this announcement were:
> >
> > * The plan to also drop PHP 5 compatibility, on a short timeline (1 year).
> > * Rather than "drifting away" from PHP, their top priority plans
> > include removing core language features like references and destructors.
> >
> > > While this does not mean that we need to take an action immediately,
> > > eventually we will have to decide something.
> >
> > Actually, I think a year is a pretty short time for ops to switch to
> > PHP 7. I think we need to decide on this pretty much immediately.
>
> The next step would be the upgrade of the mw* fleet to Debian stretch
> while still using HHVM 3.18 (to minimise disruption since we've stabilised
> 3.18 in its current build). That work is tracked at T174431. 3.18 will
> be supported by upstream for at least another six months (and if the
> migration drags further I can roll custom 3.18 security backports from
> later LTS releases).
>
> Debian stretch ships PHP7, so that'd be a good stepping stone to migrate
> back to Zend.
>
> Cheers,
> Moritz
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Wikidata revisions mail

2017-09-19 Thread יגאל חיטרון
Hello.
Since yesterday, I have been getting a lot of emails about nonexistent
revisions in ruwiki articles. I do not think I changed anything relevant in
my preferences. A little investigation showed me the cause: these are
edits to Wikidata items that are connected to the items of the articles on
my watchlist. I know this is a problem in Special:Watchlist, which is why I do
not turn on Wikidata edits in my watchlist. But this is the first time I have
seen them in emails. How can I turn off these emails, and only these, please? I
couldn't find a fitting option in preferences. At least it is only on ruwiki,
where I hardly work and have about a dozen pages on my watchlist,
so I get these mails about once an hour. If it happened on my home wiki, with
thousands of pages on my watchlist, it could be an email every second. What can
I do?
Thank you,
Igal (User:Ikhitron)
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Can we drop revision hashes (rev_sha1)?

2017-09-19 Thread Jaime Crespo
I am not a mediawiki developer, but shouldn't sha1 be moved instead of
deleted/not deleted? Moved to the content table- so it is kept
unaltered.

That way it can be used for all the goals that have been discussed
(detect reversions, XML dumps, etc.) and they are not altered, just
moved away (being more compatible). And it is not like structure
compatibility is going to be kept, as many fields are going to be
"moved" there, so code using the tables directly has to change anyway;
but if the actual content is not altered, the sha field can be kept
unaltered with the same value as before. It would also allow detecting
a "partial reversion", meaning that the MediaWiki text is set to the same
as a previous one, which is what I assume it is used for now. However,
now there will be other content that can be reverted individually.

I do not know what exactly MCR is going to be used for, but suppose (silly
idea) the main article text and the categories are 2 different contents of an
article. If user A edits both, and user B reverts the text only, that
would give a different revision sha1 value; however, most use cases here
would want to detect the reversion by checking the sha of the text
only (aka content). Equally, for backwards compatibility, storing it
on content would avoid having to recalculate it for all already
existing values, literally reducing it to a "trivial" code change
while keeping all old data valid. Keeping the field as is, on
revision, will mean all historical data and old dumps are invalid.
Full revision reversions, if needed, can be checked by checking each
individual content sha or the linked content ids.
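
The text-plus-categories scenario above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: a revision is
modelled as a dict mapping a content role ("main", "categories") to its text,
and revision_sha1 below is a hypothetical way of combining per-content hashes.
Neither the names nor the combination scheme come from the MCR design.

```python
import hashlib
from typing import Dict, List, Optional


def content_sha1(text: str) -> str:
    """Hash a single piece of content (one slot of a revision)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def revision_sha1(slots: Dict[str, str]) -> str:
    """Hypothetical whole-revision hash: a hash over the per-content hashes."""
    combined = "".join(
        f"{role}:{content_sha1(slots[role])}\n" for role in sorted(slots)
    )
    return hashlib.sha1(combined.encode("utf-8")).hexdigest()


def find_reverted_to(
    history: List[Dict[str, str]], index: int, role: str
) -> Optional[int]:
    """Return the index of the latest earlier revision whose `role` content
    hashes the same as in history[index], or None if there is none.

    The direct parent is skipped: matching it would mean the content was
    simply left untouched, not reverted.
    """
    target = content_sha1(history[index][role])
    for earlier in range(index - 2, -1, -1):
        if content_sha1(history[earlier][role]) == target:
            return earlier
    return None


if __name__ == "__main__":
    history = [
        {"main": "Original text", "categories": "Cats"},
        {"main": "Changed text", "categories": "Cats|Dogs"},   # user A edits both
        {"main": "Original text", "categories": "Cats|Dogs"},  # user B reverts text
    ]
    print(find_reverted_to(history, 2, "main"))
    print(revision_sha1(history[2]) == revision_sha1(history[0]))
```

With per-content hashes the text revert by user B is found, while a single
hash over the whole revision would not match any earlier revision.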

If, on the other side, revision should be kept completely backwards
compatible, some helper views can be created on the cloud
wikireplicas, but other than that, MCR would not be possible.

If at a later time, text with the same hash is detected (and content
double checked), content could be normalized by assigning the same id
to the same content?
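
A rough Python sketch of that normalization idea, assuming an in-memory store
standing in for the content table. ContentStore and its methods are invented
for illustration and are not an existing MediaWiki interface. The hash is only
used to find candidates; the text itself is compared before an id is reused,
which is the "content double checked" step.

```python
import hashlib
from typing import Dict, List


class ContentStore:
    """Toy content table that reuses an id when identical text is stored."""

    def __init__(self) -> None:
        self._texts: List[str] = []               # content id -> text
        self._by_hash: Dict[str, List[int]] = {}  # sha1 -> candidate ids

    def add(self, text: str) -> int:
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        # Same hash is only a hint; compare the text to rule out collisions.
        for content_id in self._by_hash.get(digest, []):
            if self._texts[content_id] == text:
                return content_id
        content_id = len(self._texts)
        self._texts.append(text)
        self._by_hash.setdefault(digest, []).append(content_id)
        return content_id

    def get(self, content_id: int) -> str:
        return self._texts[content_id]


if __name__ == "__main__":
    store = ContentStore()
    first = store.add("Some article text")
    second = store.add("Some other text")
    third = store.add("Some article text")  # same text as the first call
    print(first, second, third)
```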

On Mon, Sep 18, 2017 at 8:25 PM, Danny B.  wrote:
>
> -- Original e-mail --
> From: Dan Andreescu
> To: Wikimedia developers
> Date: 18. 9. 2017 16:26:18
> Subject: Re: [Wikitech-l] Can we drop revision hashes (rev_sha1)?
> "So, as things stand, rev_sha1 in the database is used for:
>
> 1. the XML dumps process and all the researchers depending on the XML dumps
> (probably just for revert detection)
> 2. revert detection for libraries like python-mwreverts [1]
> 3. revert detection in mediawiki history reconstruction processes in Hadoop
> (Wikistats 2.0)
> 4. revert detection in Wikistats 1.0
> 5. revert detection for tools that run on labs, like Wikimetrics
> ?. I think Aaron also uses rev_sha1 in ORES, but I can't seem to find the
> latest code for that service
>
> If you think about this list above as a flow of data, you'll see that
> rev_sha1 is replicated to xml, labs databases, hadoop, ML models, etc. So
> removing it and adding it back downstream from the main mediawiki database
> somewhere, like in XML, cuts off the other places that need it. That means
> it must be available either in the mediawiki database or in some other
> central database which all those other consumers can pull from.
> "
>
>
>
> I use rev_sha1 on replicas to check the consistency of modules, templates or
> other pages (typically help) which should be the same between projects (either
> within one language or even cross-language, if the page is not language
> dependent). In other words, to detect possible changes in them and sync
> them.
>
> Also, I haven't noticed it mentioned in the thread: Flow also notifies users
> of reverts, but I don't know whether it uses rev_sha1 or not. So I'm
> mentioning it just in case.
>
> Kind regards
>
> Danny B.
>
>



-- 
Jaime Crespo


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HHVM vs. Zend divergence

2017-09-19 Thread Moritz Muehlenhoff
On Tue, Sep 19, 2017 at 10:13:47AM +1000, Tim Starling wrote:
> On 19/09/17 06:58, Max Semenik wrote:
> > Today, the HHVM developers made an announcement[1] that they have plans of
> > ceasing to maintain 100% PHP7 compatibility and concentrating on Hack
> > instead.
> 
> The HHVM team did tell us privately that they were planning on
> changing their strategy, basically as you describe it above. The
> surprising things for me in this announcement were:
> 
> * The plan to also drop PHP 5 compatibility, on a short timeline (1 year).
> * Rather than "drifting away" from PHP, their top priority plans
> include removing core language features like references and destructors.
> 
> > While this does not mean that we need to take an action immediately,
> > eventually we will have to decide something. 
> 
> Actually, I think a year is a pretty short time for ops to switch to
> PHP 7. I think we need to decide on this pretty much immediately.

The next step would be the upgrade of the mw* fleet to Debian stretch
while still using HHVM 3.18 (to minimise disruption since we've stabilised
3.18 in its current build). That work is tracked at T174431. 3.18 will
be supported by upstream for at least another six months (and if the migration
drags further I can roll custom 3.18 security backports from later LTS releases).

Debian stretch ships PHP7, so that'd be a good stepping stone to migrate
back to Zend.

Cheers,
Moritz

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Is it possible to edit scribunto data module content through template edit popup of visual editor?

2017-09-19 Thread mathieu stumpf guntz



On 19/09/2017 at 00:38, mathieu stumpf guntz wrote:
Well, I have investigated a bit, and so far the only reference where I 
found a description of saving/retrieving data from a Scribunto module 
is SemanticScribunto.
Hmm, it was a bit late when I wrote that, so of course *loading* data from 
a Scribunto module is no big deal. The real issue in my way is 
the ability to *save* data. As far as I know, a module currently has no 
way to save data. No doubt it's a good thing that modules can't make 
arbitrary changes to any page on the wiki. But having the ability to write a 
limited number of bytes to a single data module per script call, with 
possibly other safeguard limits, wouldn't be that risky, would it?


So if this is already possible, then please let me know.

If it's not, please give me some feedback on the proposal to add 
such a function, and if I should document such a proposal elsewhere, 
please let me know.


Kind regards,
mathieu




https://github.com/SemanticMediaWiki/SemanticScribunto
https://upload.wikimedia.org/wikipedia/mediawiki/7/75/EMWCon_Spring_2017_-_Introducing_SemanticScribunto_Extension.pdf 



It's built on Semantic MediaWiki, which I don't know much about. 
All of that seems interesting, although far larger than what I was 
looking for, at least to begin with. What I would like is a way to 
quickly prototype and drop some trials. This extension seems fine, but 
it requires installing an extension, so it would be more 
difficult to put it on an existing wiki to get feedback on 
prototypes. Maybe the path of least resistance would be to install a 
MediaWiki instance on Toolforge…



On 17/09/2017 at 14:05, mathieu stumpf guntz wrote:

Hello, fellow developers,

I think the subject summarizes it all, so here are more details on 
what I'm trying to do and what I'm looking for.


# Context

You might skip this section if you are not interested in contextual 
verbiage. If you would like to react to anything stated in this 
section, please change the email subject to reflect that.


So I'm currently pondering ways to improve the factorization of 
knowledge stored in Wiktionary.


I'm taking a multi-pronged experimental approach there. On the one hand, I 
just began a Wikiversity project (in French) to establish a specification 
of how a DBMS should be structured to be useful for Wiktionaries. It mainly 
emerged from my view that the current data model proposed for Wikidata 
for Wiktionary does not fit the needs of Wiktionary contributors. I did 
make some alternative proposals, and tried to gather initial feedback 
from the French Wiktionary on this model too, which led me to create the 
Wikiversity research project, because I was told that I needed to 
"specify extensively the needs before you model".


Now, on the other hand, I'm also trying to factorize some data within 
Wiktionary with the currently available tools. One driving topic for 
that is fixing the gender gap, and more broadly the inflection-form gap. 
That is, a feminine form will generally be summarized with a laconic 
"feminine form of *some-term*", rather than being treated as an entry of 
its own. That's all the more problematic in cases where a word only 
shares a subset of the relevant definitions depending on which gender 
(or inflection form) it applies to.


# What I'm trying to do

I am trying to factorize data which pertains to several 
inflection forms. This way each form can use it to build a 
stand-alone article about a term. The current approach tends to 
gather everything under a single lemma, although some statements 
will only pertain to some specific forms.


So far I have experimented with transclusion of subpages to share 
definitions, examples and so on between inflection forms. Well, from 
a reading point of view it works. But from an editing point of 
view, it's anything but fine.


What I think would be interesting is to store this data in a Scribunto 
data module (at least for now), and enable users to change it while 
editing a lexical entry article. That might be, when using the 
visual editor, through something like a template popup. Wikitext editors 
will probably be skilled enough to edit the relevant module, but for 
the sake of convenience, it might be interesting to allow giving a 
parameter to the template, which would at publishing time modify the 
data module and remove the parameter from the