Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-15 Thread Victor Venema
To also add some positive feedback from researchers: you are entirely 
welcome to translate my research into humanly readable text. It would 
have to be extraordinarily badly done before anyone would confuse the 
readable text with a scientific article, and I have no fear that people 
would think a scientist had written the readable version. I would see 
the situation as similar to a translation into another language: 
unproblematic and useful.

From my side there are no problems with using Wikipedia. Several 
studies have shown that Wikipedia is as accurate as traditional 
encyclopaedias. I wrote most of the Wikipedia page pertaining to my 
field of study, and I think it is reasonably good.

I have installed a browser add-on that lets me select a word and 
directly open Wikipedia on that term. Very useful. Similarly, it may be 
useful to make your translation engine as independent of the search 
engine as possible, so that it can also be used in other contexts.
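
A lookup like that takes only a few lines; as an illustration, here is a 
minimal Python sketch against Wikipedia's public REST summary endpoint 
(an example of the idea only, not the add-on's actual code, and the 
function name is mine):

    import requests  # pip install requests

    def wikipedia_summary(term, lang="en"):
        """Fetch a short plain-text summary of a term from Wikipedia's REST API."""
        url = ("https://" + lang + ".wikipedia.org/api/rest_v1/page/summary/"
               + term.replace(" ", "_"))
        resp = requests.get(url, headers={"Accept": "application/json"}, timeout=10)
        resp.raise_for_status()
        return resp.json().get("extract", "")

    # The selected text can come from anywhere -- a browser, a PDF reader,
    # a search engine -- the lookup itself does not depend on any of them.
    print(wikipedia_summary("Malaise trap"))

Because such a component takes a plain string and returns plain text, it 
stays independent of whatever application hosts it.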

The features you describe can also be useful for scientists reading 
scientific articles, especially non-native speakers and people doing 
interdisciplinary work. For them, showing simpler terms and pictures 
would also be very helpful. So the translation engine could also be a 
good add-on for a browser or a PDF reader.

My main worry would be that the project will not reach its societal 
aims. Already there is more information on vaccinations and climate 
change in readable language on the net than any person will ever read.

People choose not to read it because they do not want to change their 
opinion, especially when doing so would bring them into conflict with 
their social peers. The AI-translated articles may be more readable than 
the original scientific articles, but they would still be horrible 
scientific articles. I would expect even fewer people to read them.

Transparency done right can help the scientific community. But I am more 
sceptical that it can bridge the gap between science and the public. The 
BBC Reith lecture on trust makes a strong case, imho, that transparency 
does not reduce, but actually fuels, a culture of suspicion.
http://www.bbc.co.uk/radio4/reith2002/lecture1.shtml

-- 
<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>
Victor Venema
Chair WMO TT-HOM & ISTI-POST

WMO, Commission for Climatology, Task Team on Homogenization
http://tinyurl.com/TT-HOM
ISTI Parallel Observations Science Team
http://tinyurl.com/ISTI-POST
Grassroots scientific publishing
http://grassrootspublishing.wordpress.com/

Meteorological Institute
University of Bonn
Auf dem Huegel 20
53121 Bonn
Germany

E-mail: victor.ven...@uni-bonn.de
http://www2.meteo.uni-bonn.de/victor
http://variable-variability.blogspot.com
Twitter: @variabilityblog
Tel: +49 (0)228 73 5185
Fax: +49 (0)228 73 5188

There is no need to answer my mails in your free time.
<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>
___
GOAL mailing list
GOAL@eprints.org
http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal


Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-15 Thread Jason Priem
> ...arbitrary simplicity level. Many readers can
> *almost* understand a scholarly article already, and helping them out is
> easier. We're raising scaffolding, not building houses.
>
> We are working in the zone of proximal development (ZPD) [1], the zone
> where a person can succeed, *if they have help*. We're trying to expand the
> ZPD as wide as we can, for as many people as we can.
>
> So again, we'll start with the easier, low-risk parts of the
> problem...we'll focus on readers who are already pretty well-prepared and
> well-motivated, folks like citizen scientists, community college
> instructors, and well-informed patients. We'll start with people whose ZPD
> already *almost* reaches a given article. We're confident we can
> substantially improve the reading experience of these folks, using
> completely off-the-shelf technology.
>
> Then we iterate, building on the lessons users teach us. Over time, we'll
> learn to build more and more robust scaffolding, *based on real
> experience with real readers*. Engaging closely with users shows us the
> leverage points where we can deploy AI tech most effectively, as well as
> the few places where we need to be actually pushing the tech forward.
>
> This incremental, iterative strategy both reduces risk and supports very
> fast development. We think that over two years, it'll get us to the point
> where we reach audiences that would never before have considered reading
> the scholarly literature on their own.
>
> But time, as always, will tell. ⏳😃
>
> [1]  https://en.wikipedia.org/wiki/Zone_of_proximal_development


Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-14 Thread Jason Priem
> ...research benefits in
> addition to the possible dissemination/public exposure gains.
>

Couldn't agree more. We've been an open source shop since we got started at
a hackathon seven years ago...that certainly won't be changing for this
project. You can find our code at https://github.com/Impactstory


>
> - While I believe Heather's interpretation of moral rights is incorrect*,
> I think you should carefully think about the legal ramifications of a
> project like this. At the very least a clear demarcation between the
> original work and the derived explanations should be immediately obvious to
> users of the application, and an opt-out/takedown mechanism be implemented.
> Presumably a way to include and exclude the different parts of the
> Explanation Engine could be interesting too, for example by being able to
> turn on or off the
>

Agreed. Good points.


>
> These caveats aside, I look forward to seeing the results of your work,
> and would like to offer my own CC-licensed work as guinea pig for the
> Engine if you need an actual author to give feedback on the quality of the
> annotations and summaries that are produced by your concept machine.
>

Thanks very much Henrik, we may take you up on that!

Best,
Jason


>
>
> Regards,
> Henrik Karlstrøm
>
> * The Berne Convention does NOT unilaterally grant authors the right to
> object to modifications of a work that "affects their reputation" - that
> would make, for example, literary criticism or academic debate very hard. For
> the Right to Integrity to be violated, it is not enough that the author
> feels that there is a violation; the author must be able to demonstrate
> prejudice on the part of the modifiers, and demonstrate it "on an objective
> standard that is based on public or expert opinion in order to establish
> that the author's opinion was reasonable" [2]. Of course, this only matters
> in jurisdictions that recognize moral rights. The US, for example, does
> not...
>
> [1] Bornmann, L. Scientometrics (2018). https://doi.org/10.1007/s11192-018-2855-y
> [2] https://en.wikipedia.org/wiki/Prise_de_parole_Inc_v_Gu%C3%A9rin,_%C3%A9diteur_Lt%C3%A9e

Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-14 Thread Henrik Karlstrøm
Hello all,

I'd like to offer a few comments in this discussion. First, I'd like to say 
that I think this project is very exciting and very much in line with the 
spirit of Open Access in striving to make the wealth of the world's knowledge 
available. Making it available also means making it understandable. I am sure 
that this type of system will one day be important in the dissemination of 
expert knowledge.

Having said that, there are some important questions that I think would have to 
be answered for this project to reap these benefits:

- First of all, I very much agree with the subject line of Heather's original 
email. While this is hardly "self-driving car, real-time processing of multiple 
source inputs in life-and-death situations"-hard, automated parsing of semantic 
content is plenty difficult, and an area where a lot of work remains to be 
done. To take just one example, a new paper investigating the AI-assisted 
subject classification methods of the Dimensions app finds serious problems 
with the reliability and validity of the procedure, even going so far as to say 
that "most of the papers seem misclassified" [1]. The Explanation Engine will 
need to make some serious headway in automated semantic analysis to achieve its 
goals.

- It is with this in mind that it is so important that the underlying code from 
this project is open source and transparent, so that the improvements or 
failures that arise from the project can be adopted and adapted by the wider 
community. In this way the project can have research benefits in addition to 
the possible dissemination/public exposure gains.

- While I believe Heather's interpretation of moral rights is incorrect*, I 
think you should think carefully about the legal ramifications of a project 
like this. At the very least, a clear demarcation between the original work and 
the derived explanations should be immediately obvious to users of the 
application, and an opt-out/takedown mechanism should be implemented. Presumably 
a way to include and exclude the different parts of the Explanation Engine 
could be interesting too, for example by being able to turn on or off the 

These caveats aside, I look forward to seeing the results of your work, and 
would like to offer my own CC-licensed work as guinea pig for the Engine if you 
need an actual author to give feedback on the quality of the annotations and 
summaries that are produced by your concept machine.


Regards,
Henrik Karlstrøm

* The Berne Convention does NOT unilaterally grant authors the right to 
object to modifications of a work that "affects their reputation" - that would 
make, for example, literary criticism or academic debate very hard. For the 
Right to Integrity to be violated, it is not enough that the author feels that 
there is a violation; the author must be able to demonstrate prejudice on the 
part of the modifiers, and demonstrate it "on an objective standard that is 
based on public or expert opinion in order to establish that the author's 
opinion was reasonable" [2]. Of course, this only matters in jurisdictions that 
recognize moral rights. The US, for example, does not...

[1] Bornmann, L. Scientometrics (2018). 
https://doi.org/10.1007/s11192-018-2855-y
[2] 
https://en.wikipedia.org/wiki/Prise_de_parole_Inc_v_Gu%C3%A9rin,_%C3%A9diteur_Lt%C3%A9e



Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-13 Thread Heather Morrison
It is easy to cherry-pick examples where this might work and not be 
problematic. This is useful as an analytic exercise to demonstrate the 
potential. However, it is important to consider and assess negative as well as 
positive possible consequences.

With respect to violation of authors' moral rights: under Berne 6bis 
(http://www.wipo.int/treaties/en/text.jsp?file_id=283698), authors have the 
right to object to certain modifications of their work that may impact the 
author's reputation, even after transfer of all economic rights. Reputation is 
critical to an academic career.

Has anyone conducted research to find out whether academic authors consider 
Wikipedia annotations to be an acceptable modification of their work?

As an academic author, after many years of using CC licenses permitting 
modifications, and after careful consideration, I have stopped doing this. 
Your work reinforces, for me, the wisdom of this decision. I do not wish my 
work to be annotated or automatically summarized by your project. I suspect 
that other academic authors will share this perspective. This may include 
authors who have chosen liberal licenses without realizing that they have 
inadvertently granted permission for such experiments.

CC licenses with the attribution element include author moral rights and 
remedies for violation of such rights.

My advice is to limit this experiment to willing participants. For the 
avoidance of doubt: I object to your group annotating or automatically 
summarizing my work. 

Thank you for the offer to contribute to your project. These posts to GOAL are 
my contribution. 

best,

Heather Morrison 



Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-13 Thread Jason Priem
Thanks Heather for your continued comments! Good stuff in there. Some
responses below:



> HM: Q1: to clarify, we are talking about peer-reviewed journal articles,
> right? You are planning to annotate journal articles that are written and
> vetted by experts using definitions that are developed by anyone who
> chooses to participate in Wikipedia / Wikidata, i.e. annotating works that
> are carefully vetted by experts using the contributions of non-experts?
>

Correct. An example may be useful here:

The article "More than 75 percent decline over 27 years in total flying
insect biomass in protected areas" was published in 2017 by PLOS ONE [1],
and appeared in hundreds of news stories and thousands of tweets [2]. It's
open access which is great. But if you try to read the article, you run
into sentences like this:

"Here, we used a standardized protocol to measure total insect biomass
using Malaise traps, deployed over 27 years in 63 nature protection areas
in Germany (96 unique location-year combinations) to infer on the status
and trend of local entomofauna."

Even as a somewhat well-educated person, I sure don't know what a Malaise
trap is, or what entomofauna is. The more I trip over words and concepts
like this, the less I want to read the article. I feel like it's just...not
for me.

But Wiktionary can tell me entomofauna means "insect fauna," [3] and
Wikipedia can show me a picture of a Malaise trap (it looks like a tent,
turns out) [4].

We're going to bring those kinds of descriptions and definitions right next
to the text, so it will feel a bit more like this article IS for me. This
isn't going to make the article magically easy to understand, but we think
it will help open a door that makes engaging with the literature a bit more
inviting. Our early tests with this are very promising.

That said, we're certainly going to be iterating on it a lot, and we're not
actually attached to any particular implementation details. The goal is to
help laypeople access the literature, and do it responsibly. If this turns
out to be impossible with this approach, then we'll move on to another one.

For us, the key to the Explanation Engine idea is to be modular and
flexible, using multiple layered techniques, in order to reduce risk and
increase speed.


[1] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0185809
[2] https://www.altmetric.com/details/27610705
[3] https://en.wiktionary.org/wiki/entomofauna
[4] https://en.wikipedia.org/wiki/Malaise_trap
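
(Such lookups are also easy to script. A minimal sketch, assuming the 
Wikimedia REST API's Wiktionary definition endpoint and its documented 
response shape; purely illustrative, and the helper name is made up:)

    import re
    import requests  # pip install requests

    def wiktionary_definitions(term):
        """Fetch English definitions of a term from Wiktionary's REST API."""
        url = "https://en.wiktionary.org/api/rest_v1/page/definition/" + term
        resp = requests.get(url, headers={"Accept": "application/json"}, timeout=10)
        resp.raise_for_status()
        defs = []
        for sense in resp.json().get("en", []):  # English entries only
            for d in sense.get("definitions", []):
                # Definitions come back as HTML snippets; strip the tags.
                defs.append(re.sub(r"<[^>]+>", "", d.get("definition", "")))
        return defs

    print(wiktionary_definitions("entomofauna"))  # e.g. ['insect fauna']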




> Q2: who made the decision that this is safe, and how was this decision
> made?
>

Hm, perhaps I should've been more careful in my original statement.
Apologies. There's certainly no formal Decision here...I'm just suggesting
that we think the risk of spreading misinformation is relatively low with
this approach.  That's why we'll start there. But the proof will need to be
in the pudding, of course. We'll need to implement this, test it, and so on.

Maybe I'm wrong and this is actually a horrible, dangerous idea.

If so, we'll find out, and take it from there. Thanks for letting us know
you are concerned it's not safe. We'll take that seriously, and so we'll make
sure we are evaluating this feature carefully. If you're interested in
helping with that, we'd love to have your input as well...drop me a line
off-list and we can talk about how to work together on it.


>
> If the author has not given permission, this is a violation of the
> author's moral rights under copyright. This includes all CC licensed works
> except CC-0.
>

I'm not sure I see how this would be true? We are not modifying the text or
failing to give credit to the original author, but rather creating a
commentary on it...quite like one might do if discussing the paper in a
journal club.

> I am not opposed to your project, just the assumption that a two-year
> project is sufficient to create a real-world system to translate all
> scholarly knowledge for the lay reader.
>

Makes sense. You may be right...could be a quixotic errand. We will do our
best, and hopefully whatever we come up with will be a step in the right
direction, at least. I think something like this could make the world a
better place, and maybe if we aren't able to achieve it we can at least
help give some ideas to the people who ultimately do.


> A cautious and iterative approach is wise; however this is not feasible in
> the context of a two-year grant. May I suggest a small pilot project? Try
> this with a few articles in an area where at least one member of your team
> has a doctorate. Take the time to evaluate the summaries. If they look okay
> to your team, plan a larger evaluation project involving other experts and
> the lay readers you are aiming to engage (because what an expert thinks a
> summary says may not be the same as how a non-expert would interpret the
> same summary).
>

I think this sounds great! Your plan is very much what we have in mind to
do. And then we will continue from there on the "ca

Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-13 Thread Bosman, J.M. (Jeroen)
Dear Heather, all,

Just a few comments below in your post.

Jeroen Bosman

 Original message 
From: Heather Morrison 
Date: 13/07/2018 17:54 (GMT+01:00)
To: "Global Open Access List (Successor of AmSci)" 
Subject: Re: [GOAL] Why translating all scholarly knowledge for non-specialists 
using AI is complicated


Further questions / comments for Jason Priem (JP) and anyone who cares to 
participate...


JP:  So the first part will be the annotation of difficult words in the text, 
which is just a mash-up of basic named-entity recognition and 
Wikipedia/Wikidata definitions. Pretty easy, pretty safe.

HM: Q1: to clarify, we are talking about peer-reviewed journal articles, right? 
You are planning to annotate journal articles that are written and vetted by 
experts using definitions that are developed by anyone who chooses to 
participate in Wikipedia / Wikidata, i.e. annotating works that are carefully 
vetted by experts using the contributions of non-experts?

》》》 It is too simple to use a dichotomy of experts versus non-experts; the 
difference is graded. It is also incorrect to suppose that Wikipedia is not 
contributed to by experts: there are numerous examples of active and emeritus 
scholars editing many science articles on Wikipedia. You might even find that 
definitions used in the journal articles themselves are sourced from 
Wikipedia.

Q2: who made the decision that this is safe, and how was this decision made?

Comments:

I submit that this is not safe. There are reasons for careful vetting of 
expertise, through a long process of education and examination, review in the 
process of hiring, making decisions about tenure, promotion, and grant 
applications, and then peer review and editing of the work of those qualified 
to have their work considered. Mine is not an elitist perspective. There are 
areas where the expertise does not lie in the academy at all; examples include 
traditional knowledge and native languages.

If the author has not given permission, this is a violation of the author's 
moral rights under copyright. This includes all CC licensed works except CC-0.


》》》 Interesting but questionable angle. If this is generated on the fly, just 
as with literal translation tools, and not published, I do not see how it would 
be a violation. The plain-language explanations could also be posted as 
'comments': "we think this abstract means ".

JP: Another set of features will be automatically categorizing trials as to 
whether they are double-blind RCTs or not, and automatically finding systematic 
reviews. These are all pretty easy technically, and pretty unlikely to point 
people in the wrong directions. But they start adding value right away, making 
it easier for laypeople to engage with the literature.

HM: this does not seem problematic and seems likely to be primarily useful to 
scholars. I am not opposed to your project, just the assumption that a two-year 
project is sufficient to create a real-world system to translate all scholarly 
knowledge for the lay reader.

JP:  From there we'll move on to the harder stuff like the automatic 
summarization. Cautiously, and iteratively. We certainly won't be rolling 
anything out to everyone right away. It's a two-year grant, and we're looking 
at that as two years of continued development, with constant feedback from 
users as well as experts in the library and public outreach worlds. If 
something doesn't work, we throw it away. Part of the process.

HM: this is highly problematic. A cautious and iterative approach is wise; 
however this is not feasible in the context of a two-year grant. May I suggest 
a small pilot project? Try this with a few articles in an area where at least 
one member of your team has a doctorate. Take the time to evaluate the 
summaries. If they look okay to your team, plan a larger evaluation project 
involving other experts and the lay readers you are aiming to engage (because 
what an expert thinks a summary says may not be the same as how a non-expert 
would interpret the same summary).

Thank you for posting openly about the approach and for the opportunity to 
comment.


best,



Heather Morrison

Associate Professor, School of Information Studies, University of Ottawa

Professeur Agrégé, École des Sciences de l'Information, Université d'Ottawa

heather.morri...@uottawa.ca

https://uniweb.uottawa.ca/?lang=en#/members/706



Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-13 Thread Heather Morrison
Further questions / comments for Jason Priem (JP) and anyone who cares to 
participate...


JP:  So the first part will be the annotation of difficult words in the text, 
which is just a mash-up of basic named-entity recognition and 
Wikipedia/Wikidata definitions. Pretty easy, pretty safe.

HM: Q1: to clarify, we are talking about peer-reviewed journal articles, right? 
You are planning to annotate journal articles that are written and vetted by 
experts using definitions that are developed by anyone who chooses to 
participate in Wikipedia / Wikidata, i.e. annotating works that are carefully 
vetted by experts using the contributions of non-experts?

Q2: who made the decision that this is safe, and how was this decision made?

Comments:

I submit that this is not safe. There are reasons for careful vetting of 
expertise, through a long process of education and examination, review in the 
process of hiring, making decisions about tenure, promotion, and grant 
applications, and then peer review and editing of the work of those qualified 
to have their work considered. Mine is not an elitist perspective. There are 
areas where the expertise does not lie in the academy at all; examples include 
traditional knowledge and native languages.

If the author has not given permission, this is a violation of the author's 
moral rights under copyright. This includes all CC licensed works except CC-0.

JP: Another set of features will be automatically categorizing trials as to 
whether they are double-blind RCTs or not, and automatically finding systematic 
reviews. These are all pretty easy technically, and pretty unlikely to point 
people in the wrong directions. But they start adding value right away, making 
it easier for laypeople to engage with the literature.

HM: this does not seem problematic and seems likely to be primarily useful to 
scholars. I am not opposed to your project, just the assumption that a two-year 
project is sufficient to create a real-world system to translate all scholarly 
knowledge for the lay reader.

JP:  From there we'll move on to the harder stuff like the automatic 
summarization. Cautiously, and iteratively. We certainly won't be rolling 
anything out to everyone right away. It's a two-year grant, and we're looking 
at that as two years of continued development, with constant feedback from 
users as well as experts in the library and public outreach worlds. If 
something doesn't work, we throw it away. Part of the process.

HM: this is highly problematic. A cautious and iterative approach is wise; 
however this is not feasible in the context of a two-year grant. May I suggest 
a small pilot project? Try this with a few articles in an area where at least 
one member of your team has a doctorate. Take the time to evaluate the 
summaries. If they look okay to your team, plan a larger evaluation project 
involving other experts and the lay readers you are aiming to engage (because 
what an expert thinks a summary says may not be the same as how a non-expert 
would interpret the same summary).

Thank you for posting openly about the approach and for the opportunity to 
comment.


best,



Heather Morrison

Associate Professor, School of Information Studies, University of Ottawa

Professeur Agrégé, École des Sciences de l'Information, Université d'Ottawa

heather.morri...@uottawa.ca

https://uniweb.uottawa.ca/?lang=en#/members/706



Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-12 Thread Jason Priem
Hi Heather,
Thanks for taking the time to articulate your concerns, and in a very clear
and constructive way.

We agree that there can be a dark side to AI, and therefore, in response to
your feedback we will be ending development of some planned features, such
as giving the Explanation Engine control over the US nuclear arsenal [1]
and the spacecraft pod bay doors [2]. 😜

In seriousness, though: totally agreed that we need to proceed with
caution. The corpus of scholarly articles is a knowledge resource of
fantastic power, and with that power comes great responsibility (I just
cannot stop referencing movies in this post, it seems).

When someone types "do vaccines cause autism?" into our search box, what we
send back could make the world a much better place, or a much worse one. We
best make DARN SURE it is the former. And the more we rely on algorithms to
decide what comes back, the more we are taking risks with untried
technology, in a situation where there are real risks to real people.

Mitigating those risks is a key focus for this project. Our main strategy
to do this is to start small and iterate. Or to quote Ian MacKaye,
we'll "make do with what you have, take what you can get" [3] and keep
moving forward from there.

That will take a few forms. First, the Explanation Engine itself is meant
to be a modular suite of technologies and tools, allowing us to cut our
losses if some technologies don't pan out.

In the website's words, we'll be "adding notes to the text that define and
explain difficult words and phrases…And that’s just the start…we're also
working on concept maps, automated plain-language translations (think
automatic Simple Wikipedia), structured abstracts, topic guides, and more."

So the first part will be the annotation of difficult words in the text,
which is just a mash-up of basic named-entity recognition and
Wikipedia/Wikidata definitions. Pretty easy, pretty safe. Another set of
features will be automatically categorizing trials as to whether they are
double-blind RCTs or not, and automatically finding systematic reviews.
These are all pretty easy technically, and pretty unlikely to point people
in the wrong directions. But they start adding value right away, making it
easier for laypeople to engage with the literature.
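
To make that concrete, here's a minimal sketch of that mash-up (assuming
spaCy's small English model for the entity/term spotting and Wikipedia's
REST summary endpoint for the definitions; an illustration of the idea,
not the Explanation Engine's actual code):

    import requests  # pip install requests
    import spacy     # pip install spacy && python -m spacy download en_core_web_sm

    nlp = spacy.load("en_core_web_sm")

    def define(term):
        """Return a short Wikipedia summary for a term, or '' if none exists."""
        url = ("https://en.wikipedia.org/api/rest_v1/page/summary/"
               + term.replace(" ", "_"))
        resp = requests.get(url, headers={"Accept": "application/json"}, timeout=10)
        return resp.json().get("extract", "") if resp.ok else ""

    def annotate(sentence):
        """Map candidate difficult terms in a sentence to plain definitions."""
        doc = nlp(sentence)
        terms = {ent.text for ent in doc.ents}
        for chunk in doc.noun_chunks:
            # Drop leading determiners ("a Malaise trap" -> "Malaise trap").
            words = [tok.text for tok in chunk if tok.pos_ != "DET"]
            if words:
                terms.add(" ".join(words))
        notes = {}
        for term in terms:
            definition = define(term)
            if definition:
                notes[term] = definition
        return notes

    print(annotate("Total insect biomass was measured with a Malaise trap."))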
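
Likewise, the trial categorization and systematic-review features can lean
on metadata PubMed already indexes, rather than on fresh NLP. A minimal
sketch against NCBI's E-utilities search endpoint (the specific query
filters shown are just one plausible choice, not our settled design):

    import requests  # pip install requests

    EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

    def pubmed_ids(query, n=5):
        """Return up to n PubMed IDs matching a query via E-utilities."""
        params = {"db": "pubmed", "term": query, "retmax": n, "retmode": "json"}
        resp = requests.get(EUTILS, params=params, timeout=10)
        resp.raise_for_status()
        return resp.json()["esearchresult"]["idlist"]

    topic = "vaccines autism"
    # Randomized controlled trials, via PubMed's publication-type index...
    print(pubmed_ids(topic + " AND Randomized Controlled Trial[pt]"))
    # ...and systematic reviews, via PubMed's 'systematic' subset filter.
    print(pubmed_ids(topic + " AND systematic[sb]"))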

From there we'll move on to the harder stuff like the automatic
summarization. Cautiously, and iteratively. We certainly won't be rolling
anything out to everyone right away. It's a two-year grant, and we're
looking at that as two years of continued development, with constant
feedback from users as well as experts in the library and public outreach
worlds. If something doesn't work, we throw it away. Part of the process.
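
For calibration: the crudest extractive baseline is only a few lines (a
minimal frequency-scoring sketch; real plain-language summarization needs
far more than this, which is exactly why we'll move cautiously):

    import re
    from collections import Counter

    def extractive_summary(text, n_sentences=3):
        """Keep the n sentences whose words are most frequent in the text."""
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        freq = Counter(re.findall(r"[a-z']+", text.lower()))

        def score(sentence):
            tokens = re.findall(r"[a-z']+", sentence.lower())
            return sum(freq[t] for t in tokens) / (len(tokens) or 1)

        top = sorted(sentences, key=score, reverse=True)[:n_sentences]
        # Re-emit the chosen sentences in their original document order.
        return " ".join(s for s in sentences if s in top)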

Relatedly, we'll be launching an early beta quite soon (this fall,
probably), to a few thousand early-access users (if you want to be among
these, you can sign up at https://gettheresearch.org). We will no doubt
find plenty of places where we *thought* the AI was giving clear and useful
assistance, that turn out to be full of errors. Then we fix em or ditch em.
Our early-access group will be really important to us, since they allow us
to filter out broken features before they hit the Whole World.

By keeping things modular, working iteratively, and getting lots of
feedback as we go, we're optimistic that we'll mitigate the all-too-common
mistakes of hubristic, techno-utopian thinking--and at the same time we can
harness recent tech advances to help build a more inclusive, just, and
empowering way to access humankind's collected knowledge.
j

[1] https://en.wikipedia.org/wiki/Skynet_(Terminator)
[2] https://en.wikipedia.org/wiki/HAL_9000
[3] https://www.youtube.com/watch?v=Sdocmu6CyFs


Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-12 Thread Heather Morrison
Agreed - one has to start somewhere, and research on using AI to advance 
knowledge makes a lot of sense. Self-driving cars are a good analogy: start 
with research on how to do it and on the issues that arise (like getting 
machines to make decisions about whom to kill), then do a lot of testing 
before you release cars onto streets where human beings are walking, cycling, 
and driving.


The same principle applies to scholarly knowledge. If you produce an automated 
translation of a medical research article into lay language for the 
non-specialist, first test to ensure that this will do no harm. This will take 
a lot of time, and will require the involvement of many specialists in medicine.


best,


Heather Morrison




Re: [GOAL] Why translating all scholarly knowledge for non-specialists using AI is complicated

2018-07-12 Thread Donald Samulack - Editage
Yes, but you have to start somewhere!

 

There is a quote out there (whether accurate or not) that if Henry Ford had
asked his customers what they wanted, they would have asked for a faster
horse. Who would ever have thought of a self-driving car, or even a flying
car… well, many, actually – and they made it happen!

 

My point is that you have no idea what an exercise of this nature will spin off 
as a result of the effort – that is why it is called “research”. The goal is a 
lofty one, but there will be huge wins in scientific-language AI along the way. 
Who knows, it may even be needed for multi-year lay-person journeys to Mars, if 
something goes wrong with the spaceship along the way (communication delays 
will make effective support from Earth impossible; AI will be required for 
local support).

 

 

Cheers,


Don

 

-

 

Donald Samulack, PhD

President, U.S. Operations

Cactus Communications, Inc.

Editage, a division of Cactus Communications

 

 

From: goal-boun...@eprints.org [mailto:goal-boun...@eprints.org] On Behalf
Of Heather Morrison
Sent: Thursday, July 12, 2018 1:49 PM
To: Global Open Access List (Successor of AmSci) 
Subject: [GOAL] Why translating all scholarly knowledge for non-specialists
using AI is complicated

 

On July 10 Jason Priem wrote about the AI-powered systems "that help explain
and contextualize articles, providing concept maps, automated plain-language
translations"... that are part of his project's plan to develop a scholarly
search engine aimed at a nonspecialist audience. The full post is available
here:

http://mailman.ecs.soton.ac.uk/pipermail/goal/2018-July/004890.html 

 

We share the goal of making all of the world's knowledge available to
everyone without restriction, and I agree that reducing the conceptual
barrier for the reader is a laudable goal. However, I think it is important
to avoid underestimating the size of this challenge and potential for
serious problems to arise. Two factors to consider: the current state of AI,
and the conceptual challenges of assessing the validity of automated
plain-language translations of scholarly works.

 

Current state of AI - a few recent examples of the current status of AI:

 

Vincent, J. (2016). Twitter taught Microsoft's AI chatbot to be a racist
asshole in less than a day. The Verge.
https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist

 

Wong, J. (2018). Amazon working to fix Alexa after users report bursts of
'creepy' laughter. The Guardian.
https://www.theguardian.com/technology/2018/mar/07/amazon-alexa-random-creepy-laughter-company-fixing

Meyer, M. (2018). Google should have thought about Duplex's ethical issues
before showing it off. Fortune.
http://fortune.com/2018/05/11/google-duplex-virtual-assistant-ethical-issues-ai-machine-learning/

 

Quote from Meyer:  

As prominent sociologist Zeynep Tufekci put it
(https://twitter.com/zeynep/status/994233568359575552): “Google Assistant
making calls pretending to be human not only without disclosing that it’s a
bot, but adding ‘ummm’ and ‘aaah’ to deceive the human on the other end with
the room cheering it… horrifying. Silicon Valley is ethically lost,
rudderless and has not learned a thing.”

 

These early instances of AI applications involve the automation of
relatively simple, repetitive tasks. According to Amazon, "Echo and other
Alexa devices let you instantly connect to Alexa to play music, control your
smart home, get information, news, weather, and more using just your voice".
This is voice-to-text software that lets users speak to their computers
instead of using keystrokes. Google's Duplex demonstration is a robot dialing
a restaurant to make a dinner reservation.

 

Translating scholarly knowledge into simple plain text so that everyone can
understand it is a lot more complicated, with the degree of complexity
depending on the area of research. Some research in education or public
policy might be relatively easy to translate. In other areas, articles are
written for an expert audience that is assumed to have spent decades
acquiring a basic knowledge in a discipline. It is not clear to me that it
is even possible to explain advanced concepts to a non-specialist audience
without first developing a conceptual progression. 

 

Assessing the accuracy and appropriateness of a plain-language translation of a
scholarly work intended for a non-specialist audience requires expert
understanding of the work and thoughtful understanding of the potential
misunderstandings that could arise. For example, I have never studied physics;
if I looked at an automated plain-language translation of a physics text, I
would have no means of assessing whether the translation was accurate or not.
I do understand enough medical terminology and scientific and medical research
methods to read medical articles, and I would have some idea whether a
plain-language translation was accurate. However, I have never worked as a
h