Re: [Wikimedia-l] [Wikimedia Announcements] June 2017 agenda of the Board of Trustees

2017-06-16 Thread Samuel Klein
This is great; thank you, Stephen.  It is good to see additional transfers
to the endowment on the agenda.

Regards, SJ

On Wed, Jun 14, 2017 at 1:24 PM, Stephen LaPorte 
wrote:

> Hi all,
>
> The agenda for the next Wikimedia Foundation Board of Trustees meeting on
> June 16, 2017 is now available on Meta Wiki:
> https://meta.wikimedia.org/wiki/Wikimedia_Foundation_board_agenda_2017-06
>
> Best,
> Stephen
>
> --
> Stephen LaPorte
> Senior Legal Counsel
> Wikimedia Foundation
>
> *NOTICE: As an attorney for the Wikimedia Foundation, for legal and
> ethical reasons, I cannot give legal advice to, or serve as a lawyer for,
> community members, volunteers, or staff members in their personal capacity.
> For more on what this means, please see our legal disclaimer.*
>
> ___
> Please note: all replies sent to this mailing list will be immediately
> directed to Wikimedia-l, the public mailing list of the Wikimedia
> community. For more information about Wikimedia-l:
> https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
> ___
> WikimediaAnnounce-l mailing list
> wikimediaannounc...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikimediaannounce-l
>
>


-- 
Samuel Klein  @metasj   w:user:sj  +1 617 529 4266
___
Wikimedia-l mailing list, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 


Re: [Wikimedia-l] [Wikimedia Announcements] June 2017 agenda of the Board of Trustees

2017-06-16 Thread Pine W
I think that given how far in advance WMF Board meetings are planned, it
should be possible to know the agenda and have the materials available for
publication 2 weeks in advance.

If a last-minute item comes up, it can easily be added to the agenda.
Similarly, if for some reason an agenda item is dropped or tabled for a
future meeting, that can be done as well.

It is not my goal to box the Board into an unreasonably tight set of
constraints. I think it should be possible to achieve the benefits of
advance notification while maintaining a little flexibility to accommodate
changes. Naturally, those changes should be publicized when they become
known.

Pine


Re: [Wikimedia-l] James Heilman joins the Board Governance Committee as a volunteer and advisory member

2017-06-16 Thread MZMcBride
Nataliia Tymkiv wrote:
>The BGC believes that in case James is approved by the Board as a Board
>member [3] it would also be a good onboarding opportunity for him.
>
>[...]
>
>[3] https://wikimediafoundation.org/wiki/Bylaws#ARTICLE_III_-_MEMBERSHIP

In case? Is there doubt regarding his upcoming appointment?

https://meta.wikimedia.org/wiki/Wikimedia_Foundation_elections/2017/Results

MZMcBride





Re: [Wikimedia-l] [Wikimedia Announcements] June 2017 agenda of the Board of Trustees

2017-06-16 Thread Lodewijk
First of all: kudos to Stephen for getting the agenda more or less
consistently out before the meeting - that's already an improvement
compared to the past.

Of course any advance time would be helpful - but only if the board would
actually appreciate input. I believe the meeting schedule is regular enough
that community members could already suggest topics (not sure if the chair
or secretary would be the best point of entry for that), so there would
only be an added benefit if supporting documents were also shared in
advance. I can imagine, however, that this would come at a very significant
cost: it would probably affect a whole lot of deadlines and make last-minute
updates harder.

Lodewijk

On Fri, Jun 16, 2017 at 9:05 PM, Pine W  wrote:

> Hi Stephen,
>
> Can board agendas, as well as slides and docs which are not security or
> privacy sensitive, be published 2 weeks in advance of meetings, please?
> This will allow community members to provide comments and ask questions
> ahead of board meetings that the board can take into consideration when the
> meeting occurs.
>
> Thanks,
> Pine


Re: [Wikimedia-l] [Wikimedia Announcements] June 2017 agenda of the Board of Trustees

2017-06-16 Thread Rogol Domedonfors
Pine,

I think the first step in this direction is for the Board to decide whether
or not they wish to engage with the Community in this way – whether they
have the time, energy and bandwidth to handle such communications, and
whether they see the reward as commensurate with the investment.  So far it
seems that their view has been negative on both counts, but that has never
been an explicit decision, and perhaps times have changed.  I don't think
we can ask Stephen as an employee to make such changes to Board practice
without their consent.  It is probably better for all concerned, however
disappointing it may be, for the Board to be explicit and frank about their
appetite for this engagement rather than raising expectations in the
Community that they are unwilling or unable to meet.

However, a statement from the Board members would be valuable here.

"Rogol"

On Fri, Jun 16, 2017 at 8:05 PM, Pine W  wrote:

> Hi Stephen,
>
> Can board agendas, as well as slides and docs which are not security or
> privacy sensitive, be published 2 weeks in advance of meetings, please?
> This will allow community members to provide comments and ask questions
> ahead of board meetings that the board can take into consideration when the
> meeting occurs.
>
> Thanks,
> Pine


Re: [Wikimedia-l] [Wikimedia Announcements] June 2017 agenda of the Board of Trustees

2017-06-16 Thread Pine W
Hi Stephen,

Can board agendas, as well as slides and docs which are not security or
privacy sensitive, be published 2 weeks in advance of meetings, please?
This will allow community members to provide comments and ask questions
ahead of board meetings that the board can take into consideration when the
meeting occurs.

Thanks,
Pine



Re: [Wikimedia-l] Let's set up a Tor onion service for Wikipedia

2017-06-16 Thread Faidon Liambotis
hi Alec,

On Wed, Jun 14, 2017 at 04:12:49PM +0100, Alec Muffett wrote:
> I'd love to know more about the security issues in particular.  Do please
> tell?

I don't recall finding a specific vulnerability, but last time I had a
look at EOTK a while ago, it generated an nginx config that performed a
series of steps to manipulate HTTP headers and body (HTML & Javascript)
using (hard to audit) regexps. This is not a great security practice
IMHO, as it can result in all kinds of unexpected output, especially
with user-controlled untrusted input. It's the kind of thing that
runs the risk of generating XSS, header CRLF injection vulnerabilities,
etc.

More broadly, using regexps to manipulate content means that you either
replace mentions of "upload.wikimedia.org" blindly, even
legitimate/non-href ones like a mention of it in the article text, or
you attempt to parse the syntax of HTML and Javascript with regexps,
including quotes, escape sequences, comments etc. Neither is the right
thing to do.
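
To illustrate the difference, here is a minimal Python sketch. It is not how EOTK or nginx works; the onion hostname and the sample page are made up for the example. It contrasts a blind regexp replacement with a rewrite that only touches href attributes, using the standard library's html.parser. As written it drops comments and doctypes and does not preserve self-closing tags, so treat it as a demonstration of the idea rather than a usable filter.

```python
import html
import re
from html.parser import HTMLParser

# Hypothetical hostnames, for illustration only.
CLEARNET_HOST = "upload.wikimedia.org"
ONION_HOST = "uploadexampleonion.onion"

PAGE = (
    "<p>Media files are served from upload.wikimedia.org.</p>"
    '<a href="https://upload.wikimedia.org/a.png">image</a>'
)


def blind_rewrite(markup: str) -> str:
    """Replace every mention of the host, including plain article text."""
    return re.sub(re.escape(CLEARNET_HOST), ONION_HOST, markup)


class HrefRewriter(HTMLParser):
    """Rewrite the host inside href attributes only; copy other tokens through."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # attribute without a value, e.g. "disabled"
                parts.append(name)
                continue
            if name == "href":
                value = value.replace("//" + CLEARNET_HOST, "//" + ONION_HOST)
            # The parser hands us unescaped values, so escape them again.
            parts.append(f'{name}="{html.escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def parser_rewrite(markup: str) -> str:
    parser = HrefRewriter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

With the sample page, blind_rewrite() also changes the sentence in the paragraph text, while parser_rewrite() leaves the text alone and changes only the link target.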

EOTK as I understand it also pre-generates an nginx config with a very
specific site-specific configuration, such as CSP, TLS ciphers etc.
These may be secure, but they are the kind of settings we pay close
attention to and manage ourselves, so delegating them to a tool like
EOTK is not something we can do. That said, it may be possible to use
EOTK to bootstrap our configuration and then remove the bits that we
manually manage and care about, so I don't think this is by itself
hindering our usage of it.

> I would love to know more about what you see as the inhibitors - especially
> so that I can go fix them for the internet-community-at-large - however
> this decision is one for the Wikipedia community to take.
> 
> I'll still happily help if you decide "yes", but WMF should make and own
> the decision.

Note that there is a distinction between "the [e.g. English] Wikipedia
community" and the WMF. We are all part of the same movement but the
various wiki communities have decision-making capabilities of their own,
especially when it comes to matters such as who's allowed to edit what,
when and how. Allowing edits over Tor is not the kind of decision the
Foundation can unilaterally make, while setting up the Onion service
would be something that the Foundation would do, since it would just be
part of our infrastructure and thus our mandate.

Granted, serving the site over an Onion service is orthogonal to being
able to edit it, so it's something we may eventually do anyway, even if
the situation around editing remains the same. It does limit its scope
and usefulness though, and is thus a factor that contributes to our
prioritization (or lack thereof).

Best,
Faidon
--
Faidon Liambotis
Principal Engineer, Technical Operations
Wikimedia Foundation



Re: [Wikimedia-l] Let's set up a Tor onion service for Wikipedia

2017-06-16 Thread Alec Muffett
On 14 June 2017 at 16:08, Brad Jorsch (Anomie) 
wrote:

> That part reminds me a bit of https://phabricator.wikimedia.org/T156847,
> which is about outputting different addresses in links for the mobile site
> versus the desktop site. The same solution might work for both onion links
> and mobile site links.


This is basically what we did at Facebook; architecture and other tips are
published at https://storify.com/AlecMuffett/tor-tips

The only real "gotcha" with such an approach is to only "onionify" links
which are in the process of being rendered to go to the user's browser; if
your software stack also makes site-internal fetches (eg: for database
access) in order to render a page, then onionifying *those* will result in
badness.

The other nice thing to bear in mind is that onionification is generally
best done with 1:1 mappings between onion addresses and DNS domains, and
that consistency is beneficial; in other words:

foo.com <-> .onion
bar.com <-> .onion

...and even if you are rendering a page for foo.com/, you'll get a
nicer experience by also fixing-up the bar.com/ HREFs, should you
happen to generate any.
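
As a rough sketch of such a 1:1 mapping (in Python, with placeholder onion names that are not real addresses, and not taken from the Facebook or EOTK code), the rewrite of outgoing links could look like the following. It matches hostnames exactly, leaves unmapped and relative URLs alone, and is meant to be applied only to links being rendered for the user's browser, not to site-internal fetches.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical 1:1 mapping between DNS domains and onion addresses.
# The onion names below are placeholders, not real addresses.
ONION_MAP = {
    "foo.com": "fooexampleaddress.onion",
    "bar.com": "barexampleaddress.onion",
}


def onionify(url: str) -> str:
    """Return the onion equivalent of url if its host is mapped, else url unchanged."""
    parts = urlsplit(url)
    host = parts.hostname
    onion = ONION_MAP.get(host) if host else None
    if onion is None:
        return url  # relative link or unmapped domain: leave as-is
    # Keep an explicit port if there was one (any userinfo is dropped).
    netloc = onion if parts.port is None else f"{onion}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def onionify_outgoing_links(hrefs):
    """Apply the same mapping to every link on a page, whichever domain served it."""
    return [onionify(href) for href in hrefs]
```

Because every mapped domain goes through the same table, a page rendered for foo.com also gets its bar.com links fixed up.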

This is one point where EOTK wins-out, because it operates after-the-fact
of content generation & site caching, so has a marginally easier time; the
demo EOTK config for a Wikipedia onion currently performs 11 simultaneous
mappings, as documented at:
https://github.com/alecmuffett/eotk/blob/master/demo.d/wikipedia.tconf

- alec

-- 
http://dropsafe.crypticide.com/aboutalecm


Re: [Wikimedia-l] Let's set up a Tor onion service for Wikipedia

2017-06-16 Thread Alec Muffett
On 14 June 2017 at 15:57, Faidon Liambotis  wrote:

> The EOTK stuff is interesting but not really an option for us -- it
> relies on an edge (nginx) server performing content manipulation blindly,
> which is a bad idea for many reasons, security amongst them.
>

Hi again Faidon!

I'd love to know more about the security issues in particular.  Do please
tell?


> However, it hasn't been a priority for me or my team for these reasons:
> - As long as communities feel so-and-so about Tor overall, and e.g.
>   block edits from Tor users, it's hard to justify us in the Foundation
>   investing more time into it,


Concur.

I would love to know more about what you see as the inhibitors - especially
so that I can go fix them for the internet-community-at-large - however
this decision is one for the Wikipedia community to take.

I'll still happily help if you decide "yes", but WMF should make and own
the decision.

-a

ps: reminder, I'd love to know more about the security issues. :-)


-- 
http://dropsafe.crypticide.com/aboutalecm


Re: [Wikimedia-l] Update on Discovery Projects

2017-06-16 Thread Trey Jones
Hi Jim,

Determining the intent of a particular search is indeed very difficult, and
it is not really feasible even to attempt it at the scale needed for machine
learning (unless you have an immense budget like some for-profit search
engine companies).

For our machine learning training data, we use click models suggested by
academic research. These models allow us to score the results for a given
query based on which results users actually clicked on (and didn't click
on). The results aren't perfect, but they are good, and they can be
automatically generated for millions of training examples taken from real
user queries and clicks.

These scores serve as a proxy for user intent, without needing to actually
understand it. As an example, if 35% of people click on the first result
for a particular query, and 60% on the second result, the click scores
would indicate that the order should be swapped, even without knowing the
intent of the query or the content of the results.
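
Here is a deliberately simplified Python sketch of that example. It uses a raw click-through rate per result, which is cruder than the click models from the academic literature (those also account for things like position), and the function names and data are invented for the illustration.

```python
from collections import Counter


def click_rates(shown, clicked_per_session):
    """Fraction of sessions in which each shown result was clicked.

    clicked_per_session holds one entry per session: the result that was
    clicked, or None if the user clicked nothing.
    """
    total = len(clicked_per_session)
    counts = Counter(c for c in clicked_per_session if c is not None)
    return {result: counts[result] / total for result in shown}


def rerank(shown, rates):
    """Order results by click rate, highest first (ties keep their original order)."""
    return sorted(shown, key=lambda result: rates[result], reverse=True)


# 100 hypothetical sessions for one query: 35 clicks on the first result,
# 60 on the second, and 5 sessions with no click at all.
shown = ["first result", "second result"]
sessions = ["first result"] * 35 + ["second result"] * 60 + [None] * 5

rates = click_rates(shown, sessions)
new_order = rerank(shown, rates)
```

Here rates comes out as 0.35 and 0.6, so new_order puts the second result first, without the code knowing anything about the query or the pages.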

Swapping the top two results isn't really a big win, but the hope is that
by identifying features of the query (e.g., number of words), of the
articles (e.g., popularity), and of the relationship between them (e.g.,
number of words in common between the query and the article title) we will
learn something that is more generally true. If we do, then we may move a
result for a different query from, say, position 8 (where few people ever
click) to position 3 (where there is at least a chance of a click).
Iterating the whole process will allow us to detect that the result newly
in position 3 is actually a really popular result so we should adjust the
model to boost it even more, or that it's not that great and we should
adjust the model to put something better in the #3 slot. Of course, all of
the "adjusting" of the model happens automatically during training.
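
To make the three kinds of features concrete, here is a small Python sketch. The feature names, the whitespace tokenization, and the popularity number are assumptions for the example and are not the features or code we actually use.

```python
def extract_features(query: str, title: str, popularity: float) -> dict:
    """Build a tiny feature dictionary for one (query, article) pair.

    popularity is whatever popularity measure is available for the article;
    the value passed in below is made up.
    """
    query_words = query.lower().split()
    title_words = set(title.lower().split())
    return {
        # Feature of the query: number of words.
        "query_word_count": len(query_words),
        # Feature of the article: popularity.
        "article_popularity": popularity,
        # Feature of the relationship: distinct query words found in the title.
        "title_overlap": sum(1 for w in set(query_words) if w in title_words),
    }


features = extract_features("economic rent", "Economic rent", 1200.0)
```

A learning-to-rank model is then trained on many such feature dictionaries paired with click-based scores, rather than on the queries and articles themselves.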

Through this iterative process of modeling, training, evaluation, and
deployment, we are attempting to take into account the relationship between
the user's intent and the search results—inferred from the user's
behavior—to improve the search results.

Cheers,
—Trey

Trey Jones
Software Engineer, Discovery
Wikimedia Foundation


Re: [Wikimedia-l] Update on Discovery Projects

2017-06-16 Thread James Salsman
Hi Trey,

Thanks for your very detailed reply. I have a followup question.

How do you determine search intents? For example, if you see someone
searching for "rents" how do you know whether they are looking for
economic or property rents when evaluating the quality of the search
results? If you're training machine learning models from "5, 50, or
500" examples, you need to have labels on each of those examples
indicating whether the results are good or not.

Do you interview searchers after the fact? Ask people to search and
record the terms they search on? What kind of infrastructure do you
have to make sure you're getting correct intents robust enough to
score the example results? Maybe surveys occurring on some small
fraction of results asking users to describe in greater detail exactly
what they were trying to find?

Best regards,
Jim


On Thu, Jun 15, 2017 at 10:40 PM, Deborah Tankersley
 wrote:
> James Salsman wrote:
>
>> How will the Foundation's approach to machine learning of search
>> results ranking guard against overfitting?
>
>
> Overfitting, for those who aren't familiar with the term, describes the
> situation where a machine learning model inappropriately learns very
> specific details about its training set that don't generalize to the real
> world. From the point of view of training, the model seems to be getting
> better and better, while real-world performance is actually decreasing. As
> a somewhat silly example, a model could learn that queries that have
> exactly 38 words in them are 100% about baseball—because there is only one
> example of a query in the training set that is 38 words long, and it is
> about baseball. For more on overfitting, see Wikipedia.[1]
>
> We employ the usual safeguards against overfitting. Certain parameters that
> control how a specific type of model is built can discourage overfitting.
> For example, not allowing a decision inside the model to be made on too
> little data—so rather than 1 or 2 examples to base a decision on, the model
> can be told it needs to see 5, or 50, or 500.
>
> We also have separate training and testing data sets. So we build a model
> on one set of data, then evaluate the model on another set. The estimate of
> model performance from the training set will always be at least a bit
> optimistic, but the testing set—which is large enough to be representative
> and which does not overlap with the training set—gives a more realistic
> estimate. We choose the model that performs the best on the testing set.
> Overfitted models will do worse on the testing set, and we won't use them.
>
> We have other methods of validating our models as well.
>
> We have a set of machines and software that we collectively call Relevance
> Forge (a.k.a. RelForge) that we can use to run large sets of queries
> against different versions of the same index. We can compare the before and
> after results, both automatically and manually. RelForge lets us easily
> gauge the *impact* of a change. For example, a 1% net improvement could
> come from making 1% of queries a bit better, or from making 49% a bit worse
> and 50% a bit better. So, we can easily see whether 1% or 99% of results
> change. If we see a 2% improvement but a 99% impact, something weird is
> happening, and we'd investigate more deeply.
>
> We also have many definitions of "results change" that we can evaluate: #1
> result changes, top 3 results change (ordered or unordered), number of
> results changes, number of queries getting zero results changes. And for
> each of these we can manually inspect a random selection of affected
> queries to decide whether the results are generally better or not.
>
> We also run A/B tests, where we let a small sample of users get the
> proposed change, while a similar number get the standard results. We do
> statistical analyses on user engagement with results and various other
> click metrics that let us compare the control and experimental conditions.
> For more on how we test search changes in general, see Testing Search on
> mediawiki.org.[2]
>
> In both of these cases—RelForge testing and A/B testing in
> production—overfitted models would perform poorly, and that would become
> apparent.
>
>> For example, if most searches on "rent" do not pertain to "rent
>> seeking", then how will the machine learning approach to search
>> results for "rent" guard against never presenting any results on "rent
>> seeking"?
>
>
> Your wording has left me a bit confused, and I'm not sure whether your
> concern is (a) that a query of "rent" should never return "rent seeking",
> and so the machine learning model should never present it, or (b) that we
> should guard against building a model that *never* presents results on
> "rent seeking" for a query of "rent". I'll briefly address each.
>
> Case (a): "rent" should *never* return "rent seeking"
>
> It's not clear to me that returning "rent seeking" for a query of "rent" is
> necessarily a case of overfitting per se, but in gen

Re: [Wikimedia-l] James Heilman joins the Board Governance Committee as a volunteer and advisory member

2017-06-16 Thread Biyanto Rebin
Congratulations, James!

On 15 June 2017 at 20:21, "James Heilman"  wrote:

> Thanks :-) Looking forwards to working with everyone.
>
> J
>
> On Thu, Jun 15, 2017 at 5:29 AM, Tito Dutta  wrote:
>
> > That's great. All the best James.
> >
> > On 15 June 2017 at 17:55, Nataliia Tymkiv  wrote:
> >
> > > Dear all,
> > >
> > >
> > >
> > > I wanted to inform you that starting today James Heilman joins the Board
> > > Governance Committee (BGC) as a volunteer and advisory member [1]. That
> > > means that he will be working with the BGC as a non-voting member, together
> > > with Ira B. Matetsky, Gayle Karen Young, Tim Moritz Hector, Ido Ivry and
> > > Kat Walsh [2].
> > >
> > >
> > >
> > > The BGC believes that in case James is approved by the Board as a Board
> > > member [3] it would also be a good onboarding opportunity for him.
> > >
> > >
> > >
> > > === James Heilman ===
> > > James Heilman is “an active contributor to WikiProject Medicine, is a
> > > volunteer Wikipedia administrator, was the president of Wikimedia Canada
> > > between 2010 and 2013, and founded and was formerly the president of Wiki
> > > Project Med Foundation. He is also the founder of WikiProject Medicine's
> > > Medicine Translation Task Force. In June 2015, he was elected to the
> > > Wikimedia Foundation Board of Trustees, a position which he held until he
> > > was removed on December 28, 2015” [4]. He is among the top three candidates
> > > for the community-selected seats, chosen by the community to be recommended
> > > to the Board for approval, according to the results of the 2017 Wikimedia
> > > Foundation Board of Trustees selection process [5].
> > >
> > >
> > >
> > > If you have any questions or concerns, please feel free to ask me.
> > >
> > >
> > >
> > > [1] https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Board_Governance_Committee_Charter#Volunteer_and_Advisory_Members
> > > [2] https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Board_Governance_Committee#Composition_2016-2017
> > >
> > > [3] https://wikimediafoundation.org/wiki/Bylaws#ARTICLE_III_-_MEMBERSHIP
> > >
> > > [4] https://en.wikipedia.org/wiki/James_Heilman
> > >
> > > [5] https://blog.wikimedia.org/2017/05/20/board-of-trustees-elections-2017/
> > >
> > > Best regards,
> > > antanana / Nataliia Tymkiv
> > >
> > > *NOTICE: You may have received this message outside of your normal working
> > > hours/days, as I usually can work more as a volunteer during the weekend. You
> > > should not feel obligated to answer it during your days off. Thank you in
> > > advance!*
> >
>
>
>
> --
> James Heilman
> MD, CCFP-EM, Wikipedian
>
> The Wikipedia Open Textbook of Medicine


Re: [Wikimedia-l] James Heilman joins the Board Governance Committee as a volunteer and advisory member

2017-06-16 Thread Jean-Philippe Béland
James,

That's great! Thank you very much for your involvement.

Jean-Philippe Béland
Vice President, Wikimedia Canada
Lead Organizer, Wiki Loves Earth 2017 in Canada
User:Amqui


On Fri, Jun 16, 2017 at 1:41 AM, Rogol Domedonfors 
wrote:

> James
>
> Do you have a position or preliminary views you would like to share with
> the community about Board Governance?  Is there anything specific you will
> be seeking to look into, or change, or start, or stop?  Are there areas for
> improvement or is everything fine?  In particular, do you think that
> Board-Community relations need any attention from the Governance Committee?
>
> "Rogol"
>
> On Fri, Jun 16, 2017 at 3:58 AM, Anna Stillwell 
> wrote:
>
> > Thank you so much, James. I'm so glad you are here.
> > /a
> >
> > On Thu, Jun 15, 2017 at 4:10 PM, Pine W  wrote:
> >
> > > Thank you, Nataliia and James.
> > >
> > > This appointment continues a trend of decisions and steps from the BGC
> > > since Nataliia took the committee chair role that I think are good.
> > >
> > > Pine


