Re: [Wikimedia-l] LsJbot and geonames

2015-09-06 Thread Ricordisamoa
Proper data-based stubs are being worked on: 
https://phabricator.wikimedia.org/project/profile/1416/

Lsjbot, you have no chance to survive make your time.

On 06/09/2015 02:40, Anders Wennersten wrote:
Geonames [1] is a database which holds around 9 M entries of 
geographically related items from all over the world.


Lsjbot is now generating articles from a subset of it, after several 
months of extensive research on its quality, Wikidata relations and 
notability issues. While the quality in some regions is substandard 
(and these will not be generated), it was seen as very good in most 
areas. In the discussion I was intrigued to learn that identical 
Arabic names should be transcribed differently depending on their 
geographic location. And I was fascinated by the question of the 
notability of wells in the Bahrain desert (which in the end were 
excluded, mostly because we knew too little of that reality).


In this run Lsjbot has extended its functionality even further than 
when it generated articles for species. It looks for relevant 
geographical items close to the actual one: a lake close by, a 
mountain, the nearest major town, etc.
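
As a rough illustration of the "nearby items" lookup described above, here is a minimal Python sketch. It is not Lsjbot's actual code, which is not shown in this thread. It assumes each gazetteer entry carries a name, a feature kind and WGS84 coordinates, and it picks the closest entry of each kind by great-circle (haversine) distance. The function names, the sample places and their coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_by_kind(place, candidates):
    """Return {kind: (name, distance_km)} holding the closest candidate of each kind.

    place      -- (lat, lon) of the item the article is about
    candidates -- iterable of (name, kind, lat, lon) tuples
    """
    best = {}
    for name, kind, lat, lon in candidates:
        d = haversine_km(place[0], place[1], lat, lon)
        if kind not in best or d < best[kind][1]:
            best[kind] = (name, d)
    return best


# Hypothetical sample data: made-up names and coordinates, not Geonames records.
village = (41.50, 21.50)
features = [
    ("Lake A", "lake", 41.52, 21.55),
    ("Lake B", "lake", 42.10, 22.40),
    ("Mount C", "mountain", 41.45, 21.40),
    ("Town D", "town", 41.60, 21.70),
]

nearby = nearest_by_kind(village, features)
for kind, (name, dist) in nearby.items():
    print(f"nearest {kind}: {name} ({dist:.1f} km)")
```

A linear scan like this is only practical for small candidate sets. For millions of entries one would presumably pre-filter by bounding box or use a spatial index, but that is a design assumption and not something stated in the thread.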


Macedonia can be taken as one example. Lsjbot generated over 1 
articles (and 5000 disambiguation pages), making it an order of 
magnitude more than what exists in enwp. Also for a well-defined type 
like villages, almost 50% as many have been generated than existing in 
enwp. One example [2] shows what has been generated (and note the 
reuse of a relevant figure existing in frwp). Please compare the 
corresponding articles in other languages in this case, many having 
less information than the bot-generated one.


The generation is still at an early stage [3] but has already got the 
article count for svwp to pass 2 M today. But it will take many more 
months before it is completed, and perhaps more M marks will be passed 
before it is through. If you want to give feedback you are welcome to 
enter it at [4].


Anders
(with all credit for Lsjbot to be given to Sverker, its owner; I 
am just one of the many supporters of him and his bot on svwp)


[1]
http://www.geonames.org/about.html

[2]
https://sv.wikipedia.org/wiki/Polaki_%28ort_i_Makedonien%29

[3]
https://sv.wikipedia.org/wiki/Kategori:Robotskapade_geografiartiklar

[4]
https://sv.wikipedia.org/wiki/Anv%C3%A4ndardiskussion:Lsjbot/Projekt_alla_platser 






___
Wikimedia-l mailing list, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines

Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 






[Wikimedia-l] Move on to another subject.... (was: Re: [Wiki Loves Monuments] Wiki Loves Monuments in Italy largely blocked by WMF fundraising)

2015-09-06 Thread Richard Ames
I think it is time to move on to another subject, so let's consider this
one closed.

Regards, Richard (one of your moderators)


Re: [Wikimedia-l] LsJbot and geonames

2015-09-06 Thread Emilio J . Rodríguez-Posada
Congratulations on the stub creation, they are good (and better than those
handmade stubs in other languages).

About the Wikidata placeholder project, it sounds very interesting.



Re: [Wikimedia-l] LsJbot and geonames

2015-09-06 Thread Gerard Meijssen
Hoi,
PLEASE reconsider. A Wikidata based solution is not superior because it
started from Wikidata.

PLEASE consider collaboration. It will be so much more powerful when LSJBOT
and people at Wikidata collaborate. It will get things right the first
time. It does not have to be perfect from the start as long as it gets
better over time. As long as we always work on improving the data.

PLEASE consider text generation based on Wikidata. They are the scripts
LSJBOT uses, they can help us improve the text when more or better
information becomes available.
Thanks,
 GerardM



Re: [Wikimedia-l] [[wikimedia-I] Wiki Loves Monuments] Wiki Loves Monuments in Italy largely blocked by WMF fundraising

2015-09-06 Thread WereSpielChequers
Craig has a good point, but this is a lovely example of two projects planning 
on very different timescales. The most important countries for a future Wiki 
Loves Monuments campaign are some of the very countries that fundraising could 
have had this September, the ones where there has not yet been a WLM but where 
a group of volunteers will emerge in the next few months. How can one predict 
where they will be?


WereSpielChequers


> Date: Sun, 6 Sep 2015 13:39:25 +1000
> From: Craig Franklin
> To: Wikimedia Mailing List
> Subject: Re: [Wikimedia-l] [Wiki Loves Monuments] Wiki Loves Monuments
> in Italy largely blocked by WMF fundraising
>
> Firstly, I'm delighted to see that a mutually acceptable compromise has
> been reached here.  Well done everyone in coming together with the best
> interests of the entire movement in mind.
> 
> If I can make a suggestion though, I'd suggest that the fundraising team
> and the community, particularly the WLM crew, get together *now* and try to
> work out how those campaigns are going to be coordinated so that this
> doesn't happen again next year, while there are still good vibes in the
> air.  Something we're all prone to as a movement is procrastinating
> on these sorts of issues, but if there is a bit of forward planning there's
> no reason that everyone can't have their cake and eat it too.
> 
> Cheers,
> Craig
> 
> 



Re: [Wikimedia-l] [Wiki Loves Monuments] Wiki Loves Monuments in Italy largely blocked by WMF fundraising

2015-09-06 Thread Peter Southwood
Lila, As you have appended your message to this thread, I assume that there is 
a non-zero probability that you are referring to one or more of the  
contributors to the thread. I do not consider any of the mails included to be 
polarizing rhetoric, so could you inform us whether you do, and if so, which, 
or whether there was another reason why that specific comment was appended here.
Cheers,
Peter

-Original Message-
From: wikimedia-l-boun...@lists.wikimedia.org 
[mailto:wikimedia-l-boun...@lists.wikimedia.org] On Behalf Of Lila Tretikov
Sent: Saturday, 05 September 2015 7:16 AM
To: Wikimedia Mailing List
Subject: Re: [Wikimedia-l] [Wiki Loves Monuments] Wiki Loves Monuments in Italy 
largely blocked by WMF fundraising

First, thanks to all of those who worked in good faith, with patience and care 
for each other to solve this problem. I appreciate the level of compromise and 
empathy that was required from teams at WMIL and WMF. Thank you!

Second, I want to highlight that this is *our* issue: we are a community and 
we need to think about our *one mission* to engage every human with knowledge, 
before our individual goals. Let's please remember that before we detract and 
distract with polarizing rhetoric (you know who you are on this list). Bring up 
issues, suggest solutions. But please, in good faith and with care for each 
other.

Thanks all,
Lila


On Fri, Sep 4, 2015 at 8:26 AM, Peter Southwood < peter.southw...@telkomsa.net> 
wrote:

> I was referring to the fundraising targets, which have been cited as a 
> cause of the dispute. WMIT/WLM have explained at length their reasons 
> for needing banners in September. I am in no position to comment on 
> whether their analysis is correct or not. Fundraising has not been so 
> forthcoming in response to queries.
> Cheers,
> Peter
>
> -Original Message-
> From: wikimedia-l-boun...@lists.wikimedia.org [mailto:
> wikimedia-l-boun...@lists.wikimedia.org] On Behalf Of Pine W
> Sent: Friday, 04 September 2015 1:45 PM
> To: Wikimedia Mailing List
> Subject: Re: [Wikimedia-l] [Wiki Loves Monuments] Wiki Loves Monuments 
> in Italy largely blocked by WMF fundraising
>
> I guess I'm not clear on whether you're asking about the Fundraising 
> targets or the WLM/WMIT targets, or both. Can you clarify?
>
> My understanding from this email chain is that there will be a 
> deconfliction of banner space via better scheduling next year. I think 
> that someone suggested setting up a calendar to track banner use, 
> which might also be helpful.
>
> I think I'll step out of this conversation for the moment, and let the 
> stakeholders take it from here.
>
> Pine
>
>
> On Fri, Sep 4, 2015 at 4:25 AM, Peter Southwood < 
> peter.southw...@telkomsa.net> wrote:
>
> > One of the basic tenets of health and safety is that if you have a 
> > near miss incident, it should be analysed the same way that a fatal 
> > incident would be investigated. Not to apportion blame, even if it 
> > is due, but so that the same situation can be avoided in the future.
> > Organisations that fail to do this are doomed to repeat their 
> > mistakes, not necessarily by the same people, who may well have 
> > learned, but often by other departments, where the people did not 
> > get the opportunity to learn by the mistake.
> > Refusal to answer reasonable and legitimate questions by 
> > stakeholders often leads to accusations of conspiracy and bad faith 
> > and can end in the local demagogues, of which we have an adequate 
> > supply, inciting the torch and pitchfork brigade. Things may go downhill at 
> > this point.
> > Cheers,
> > Peter
> >
> > -Original Message-
> > From: wikimedia-l-boun...@lists.wikimedia.org [mailto:
> > wikimedia-l-boun...@lists.wikimedia.org] On Behalf Of Pine W
> > Sent: Friday, 04 September 2015 8:43 AM
> > To: Wikimedia Mailing List
> > Subject: Re: [Wikimedia-l] [Wiki Loves Monuments] Wiki Loves 
> > Monuments in Italy largely blocked by WMF fundraising
> >
> > Yes, I think it is the case that Fundraising and other organizations 
> > (like the WLM coordinators, Wikimedia Italia, and Community 
> > Resources / FDC) were working from different playbooks. But now that 
> > Fundraising has agreed to change their plans, I think we should give 
> > them some breathing room, especially because they say that banner 
> > scheduling will be coordinated next year.
> >
> >
> >
> > Pine
> >
> >
> > On Thu, Sep 3, 2015 at 11:24 PM, Peter Southwood < 
> > peter.southw...@telkomsa.net> wrote:
> >
> > > Who set the targets that will now not be met, how were they 
> > > decided, and when were they set? I must assume that WLM annual 
> > > project was not taken into consideration by these planners.
> > > Cheers,
> > > P
> > >
> > > -Original Message-
> > > From: wikimedia-l-boun...@lists.wikimedia.org [mailto:
> > > wikimedia-l-boun...@lists.wikimedia.org] On Behalf Of Pine W
> > > Sent: Thursday, 03 September 2015 10:51 PM
> > > To: Wikimedia 

Re: [Wikimedia-l] LsJbot and geonames

2015-09-06 Thread Gerard Meijssen
Hoi,
As always I have been a big fan of the wonderful work that has been done.
My reaction was very much a response to what I perceived as a negative reaction
from Ricordisamoa. Telling you to stop and become part of Wikidata is a bit off.
Asking for collaboration and work towards a common goal, a goal that you
very much want to share as I perceive it in your reply is most wonderful
and most welcome.

When your data is at a quality level where you create stubs, it is very
much at the level where we should have it in Wikidata. Obviously it is for
the Swedish community to have the stubs or experiment with cached articles
based on Wikidata data. Obviously, we are at a point where we can create
the stubs and where caching concepts is technically feasible but not
something we have done so far.

What does it take to have such an experiment?
Thanks,
 GerardM


Re: [Wikimedia-l] LsJbot and geonames

2015-09-06 Thread Steinsplitter Wiki
Hoi,

"Article Placeholders are automatically generated content pages in 
Wikipedia or other mediawiki projects displaying data from Wikidata."   
Seriously? RobotWiki? Do we really want this? Quality, not quantity. 


Re: [Wikimedia-l] LsJbot and geonames

2015-09-06 Thread Anders Wennersten
At svwp we work closely with Wikidata and see it as the natural base for 
our article substance. And we closely follow Phabricator and are eager 
to implement it as soon as it is feasible. And Lsjbot is in no way 
counteractive to these. It will be easy to exchange Lsjbot articles for 
Phabricator-generated ones when the time is right.


But I believe you miss the point of what Lsjbot is doing now. The 
extensive research done on the data in Geonames is one of the crucial 
efforts. And in a way this whole generation project is research on the 
viability of using this data in full in all language versions. If it 
is still seen as viable we could extend our article coverage of 
geographical entities by a factor of 10 in all versions. And this 
research is a must, independently of which technique is used to 
generate the articles.


The other crucial effort is the extended intelligence built into the 
generation of facts in the articles. Finding nearby physical objects 
with clever algorithms is an intellectual effort of the highest dignity. 
When bot generation was first introduced, it was more or less a mapping 
of items from input to items in output (in articles). We now see how 
more info is created from info that exists only implicitly in the input, 
combined with external (map) data.


I cannot stress enough how much I am impressed by Sverker's 
outstanding intellectual effort and his creativity in implementing and 
running software that is of great help in reaching our common vision, 
"free knowledge for all".


 Anders





Re: [Wikimedia-l] LsJbot and geonames

2015-09-06 Thread Ricordisamoa

Hoi,
Wouldn't it have been a better use of Sverker's brainpower to implement
a long-term solution that doesn't require saving articles to the wiki?


On 06/09/2015 11:23, Anders Wennersten wrote:
At svwp we work closely with Wikidata and see it as the natural base
for our article substance. We also follow Phabricator closely and are
eager to implement it as soon as that is feasible.
Lsjbot in no way works against either of these. It will be easy to
exchange Lsjbot articles for Phabricator-generated ones when the time is
right.


But I believe you miss the point of what Lsjbot is doing now. The
extensive research done on the data in Geonames is one of the crucial
efforts. In a way, this whole generation project is research into the
viability of using this data in full in all language versions. If it
is still seen as viable, we could extend our article coverage of
geographical entities by a factor of 10 in all versions. And this
research is a must, regardless of which technique is used to
generate the articles.


The other crucial effort is the extended intelligence built into the
generation of facts in the articles. Finding nearby physical
objects with clever algorithms is an intellectual effort of the highest
order. When bot generation was first introduced, it was more or less
a mapping of items in the input to items in the output (the articles). We now
see how more information is created from information that exists only
implicitly in the input, combined with external (map) data.
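
As a rough illustration of a fact that exists only implicitly in the input: two coordinate pairs do not state a direction, but one can be derived from them. The sketch below is an assumption-laden example, not Lsjbot's method; the function names, the eight-point compass, the sentence template, and the sample places are all invented here. It computes the initial great-circle bearing and turns it into a short statement.

```python
import math

COMPASS = ["north", "northeast", "east", "southeast",
           "south", "southwest", "west", "northwest"]


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


def compass_direction(bearing_deg):
    """Map a bearing to one of eight compass points (45-degree sectors)."""
    return COMPASS[int((bearing_deg + 22.5) // 45) % 8]


def describe_neighbour(place, neighbour, distance_km):
    """Build a sentence stating where the neighbour lies relative to the place."""
    bearing = initial_bearing_deg(place["lat"], place["lon"],
                                  neighbour["lat"], neighbour["lon"])
    direction = compass_direction(bearing)
    return (f"{neighbour['name']} lies {distance_km:.0f} km "
            f"{direction} of {place['name']}.")


# Invented sample data (hypothetical names, coordinates and distance).
village = {"name": "Village A", "lat": 41.0, "lon": 21.0}
town = {"name": "Town T", "lat": 40.0, "lon": 21.0}
print(describe_neighbour(village, town, 111.2))
```

Neither record says "south", yet the sentence can state it; the same idea extends to other derived facts once external map data is joined in.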


I cannot stress enough how impressed I am by Sverker's
outstanding intellectual effort and his creativity in implementing and
running software that is of great help in reaching our common vision,
"free knowledge for all".


 Anders




On 2015-09-06 at 08:50, Gerard Meijssen wrote:

Hoi,
PLEASE reconsider. A Wikidata based solution is not superior because it
started from Wikidata.

PLEASE consider collaboration. It will be so much more powerful when LSJBOT
and people at Wikidata collaborate. It will get things right the first
time. It does not have to be perfect from the start as long as it gets
better over time. As long as we always work on improving the data.

PLEASE consider text generation based on Wikidata. They are the scripts
LSJBOT uses, they can help us improve the text when more or better
information becomes available.
Thanks,
  GerardM

On 6 September 2015 at 08:25, Ricordisamoa wrote:


Proper data-based stubs are being worked on:
https://phabricator.wikimedia.org/project/profile/1416/
Lsjbot, you have no chance to survive make your time.



Re: [Wikimedia-l] LsJbot and geonames

2015-09-06 Thread Emilio J . Rodríguez-Posada
2015-09-06 13:22 GMT+02:00 Steinsplitter Wiki :

> Hoi,
>
> "Article Placeholders are automatically generated content pages in
> Wikipedia or other mediawiki projects displaying data from Wikidata."
>  Seriously? RobotWiki? Do we really want this? Quality, not quantity.
>
>
Yeah. I REALLY want this.


> > From: gerard.meijs...@gmail.com
> > Date: Sun, 6 Sep 2015 11:35:31 +0200
> > To: wikimedia-l@lists.wikimedia.org
> > Subject: Re: [Wikimedia-l] LsJbot and geonames
> >
> > Hoi,
> > As always I have been a big fan of the wonderful work that has been done.
> > My reaction was very much to what I perceived as a negative reaction from
> > Ricordisamoa. Telling you to stop and become part of Wikidata is a bit off.
> > Asking for collaboration and work towards a common goal, a goal that you
> > very much want to share as I perceive it in your reply, is most wonderful
> > and most welcome.
> >
> > When your data is at a quality level where you create stubs, it is very
> > much at the level where we should have it in Wikidata. Obviously it is for
> > the Swedish community to have the stubs or experiment with cached articles
> > based on Wikidata data. Obviously, we are at a point where we can create
> > the stubs and where caching concepts is technically feasible but not
> > something we have done so far.
> >
> > What does it take to have such an experiment?
> > Thanks,
> >  GerardM
> >

Re: [Wikimedia-l] [Wiki Loves Monuments] Wiki Loves Monuments in Italy largely blocked by WMF fundraising

2015-09-06 Thread Romaine Wiki
I think everyone here worked in good faith, and everyone started with
patience in this situation. Suggesting otherwise suggests a lack of empathy.
But we should not close our eyes when the community is being played in an
unfair way, and by that I do NOT mean the blocking banner, but how the
interaction went. This is not the first time that the fundraising
team has shown us rude behaviour.

There was no polarizing rhetoric; the rhetoric in question was used only
after the polarisation had already happened, and only when the limits of
care, patience and reasonableness had long been passed. It is nice to call
here in public for us to bring up issues and suggest solutions, but we have
done so.

I find it disturbing that WMF does not recognise the bad behaviour
of *some* of its own staff, and sticks its head in the sand.

If you say that it is *our* issue, a different attitude should be used: the
community has not been treated as a stakeholder, while we are one.
As long as the community is not actually seen as a stakeholder,
highlighting *one mission* is just empty words.

It is said (by WMF staff) that we should come to a better planning of
CentralNotice banners. We are open to that, as we already called for this
in 2013, and we are waiting.

Greetings,
Romaine





2015-09-05 7:15 GMT+02:00 Lila Tretikov :

> First, thanks to all of those who worked in good faith, with patience and
> care for each other to solve this problem. I appreciate the level of
> compromise and empathy that was required from teams at WMIL and WMF. Thank
> you!
>
> Second, I want to highlight that this is *our* issue; we are a community
> and we need to think about our *one mission* to engage every human with
> knowledge, before our individual goals. Let's please remember that before
> we detract and distract with polarizing rhetoric (you know who you are on
> this list). Bring up issues, suggest solutions. But please, in good faith
> and with care for each other.
>
> Thanks all,
> Lila
>
>
> On Fri, Sep 4, 2015 at 8:26 AM, Peter Southwood <
> peter.southw...@telkomsa.net> wrote:
>
> > I was referring to the fundraising targets, which have been cited as a
> > cause of the dispute. WMIT/WLM have explained at length their reasons for
> > needing banners in September. I am in no position to comment on whether
> > their analysis is correct or not. Fundraising has not been so forthcoming
> > in response to queries.
> > Cheers,
> > Peter
> >
> > -Original Message-
> > From: wikimedia-l-boun...@lists.wikimedia.org [mailto:
> > wikimedia-l-boun...@lists.wikimedia.org] On Behalf Of Pine W
> > Sent: Friday, 04 September 2015 1:45 PM
> > To: Wikimedia Mailing List
> > Subject: Re: [Wikimedia-l] [Wiki Loves Monuments] Wiki Loves Monuments in
> > Italy largely blocked by WMF fundraising
> >
> > I guess I'm not clear on whether you're asking about the Fundraising
> > targets or the WLM/WMIT targets, or both. Can you clarify?
> >
> > My understanding from this email chain is that there will be a
> > deconfliction of banner space via better scheduling next year. I think that
> > someone suggested setting up a calendar to track banner use, which might
> > also be helpful.
> >
> > I think I'll step out of this conversation for the moment, and let the
> > stakeholders take it from here.
> >
> > Pine
> >
> >
> > On Fri, Sep 4, 2015 at 4:25 AM, Peter Southwood <
> > peter.southw...@telkomsa.net> wrote:
> >
> > > One of the basic tenets of health and safety, is that if you have a
> > > near miss incident, it should be analysed the same way that a fatal
> > > incident would be investigated. Not to apportion blame, even if it is
> > > due, but so that the same situation can be avoided in the future.
> > > Organisations that fail to do this are doomed to repeat their
> > > mistakes, not necessarily by the same people, who may well have
> > > learned, but often by other departments, where the people did not get
> > > the opportunity to learn by the mistake.
> > > Refusal to answer reasonable and legitimate questions by stakeholders
> > > often leads to accusations of conspiracy and bad faith and can end in
> > > the local demagogues, of which we have an adequate supply, inciting
> > > the torch and pitchfork brigade. Things may go downhill at this point.
> > > Cheers,
> > > Peter
> > >
> > > -Original Message-
> > > From: wikimedia-l-boun...@lists.wikimedia.org [mailto:
> > > wikimedia-l-boun...@lists.wikimedia.org] On Behalf Of Pine W
> > > Sent: Friday, 04 September 2015 8:43 AM
> > > To: Wikimedia Mailing List
> > > Subject: Re: [Wikimedia-l] [Wiki Loves Monuments] Wiki Loves Monuments
> > > in Italy largely blocked by WMF fundraising
> > >
> > > Yes, I think it is the case that Fundraising and other organizations
> > > (like the WLM coordinators, Wikimedia Italia, and Community Resources
> > > / FDC) were working from different playbooks. But now that Fundraising
> > > has agreed to change their plans, I