Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-11-08 Thread Stefano Stabellini
On Thu, 7 Nov 2019, Lars Kurth wrote:
> Hi all,
> 
> I have received informal advice
> 
> On 21/10/2019, 06:54, "Artem Mygaiev"  wrote:
> 
> >  Before we ask Xen FuSA contributors to invest in documentation to
> > be presented as legally-valid evidence for certification, we should
> > ask a certified lawyer for their formal opinion on the validity of:
> > 
> >   (a) applying a source code license (BSD) to documentation
> > 
> > There are also BSD documentation license variants which may be worth
> > looking at
> 
> There is no LEGAL issue with using a source code license for documentation.
> Typically, community issues arise when the license has a patent clause,
> which would act as a possible barrier to contributing to the docs (which
> should be low).
> 
> >   (b) moving text bidirectionally between source code (BSD) and
> > documentation (any license)
> >   (c) moving text bidirectionally between source code (BSD) and
> > documentation (CC0)
> > 
> > I will raise this at the next SIG meeting
> 
> Fundamentally, you can’t move copyrightable content from any CC-BY-4/CC0 to 
> BSD and vice versa without going through the process of changing a license
> 
> On the community call we discussed Andy's sphinx-docs. Andy made a strong 
> case to keep the docset as CC-BY-4
> It rests on the assumption that user docs will always be different from 
> what's in code and thus there is no need to move anything which is 
> copyrightable between code and the docs
> Should that turn out to be wrong, there is still always the possibility of a 
> mixed CC-BY-4 / BSD-2-Clause docset in future
> So we are not painting ourselves into a corner
> 
> Regarding safety related docs, we discussed
> * CC-BY-4 => this is likely to be problematic as many docs are coupled 
> closely with source
> * Dual CC-BY-4 / BSD-2-Clause licensing does not solve this problem
> * BSD-2-Clause docs would enable docs that 
> 
> Thus, the most sensible approach for safety related docs would be to use a 
> BSD-2-Clause license uniformly in that case

I agree with you.

But at that point for simplicity, wouldn't it be better to use BSD-2 for
all docs?

It is difficult to be able to distinguish between "normal docs" and
"safety docs" in all cases. For instance, a description of the Xen
command line options would be required for safety, but might already
exist as docs under CC-BY-4.

What's the advantage with having some docs CC-BY-4, when we need to have
some other docs BSD-2?

(As you know, I don't care about the specific license, I am only trying
to make our life easier.)

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-11-07 Thread Lars Kurth
Hi all,

I have received informal advice

On 21/10/2019, 06:54, "Artem Mygaiev"  wrote:

>  Before we ask Xen FuSA contributors to invest in documentation to
> be presented as legally-valid evidence for certification, we should
> ask a certified lawyer for their formal opinion on the validity of:
> 
>   (a) applying a source code license (BSD) to documentation
> 
> There are also BSD documentation license variants which may be worth
> looking at

There is no LEGAL issue with using a source code license for documentation.
Typically, community issues arise when the license has a patent clause,
which would act as a possible barrier to contributing to the docs (which should
be low).

>   (b) moving text bidirectionally between source code (BSD) and
> documentation (any license)
>   (c) moving text bidirectionally between source code (BSD) and
> documentation (CC0)
> 
> I will raise this at the next SIG meeting

Fundamentally, you can’t move copyrightable content from any CC-BY-4/CC0 to BSD 
and vice versa without going through the process of changing a license

On the community call we discussed Andy's sphinx-docs. Andy made a strong case 
to keep the docset as CC-BY-4
It rests on the assumption that user docs will always be different from what's 
in code and thus there is no need to move anything which is copyrightable 
between code and the docs
Should that turn out to be wrong, there is still always the possibility of a 
mixed CC-BY-4 / BSD-2-Clause docset in future
So we are not painting ourselves into a corner

Regarding safety related docs, we discussed
* CC-BY-4 => this is likely to be problematic as many docs are coupled closely 
with source
* Dual CC-BY-4 / BSD-2-Clause licensing does not solve this problem
* BSD-2-Clause docs would enable docs that 

Thus, the most sensible approach for safety related docs would be to use a 
BSD-2-Clause license uniformly in that case

Regards
Lars


Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-21 Thread Artem Mygaiev
Hi Lars

On Thu, 2019-10-17 at 17:30 +, Lars Kurth wrote:
> 
> On 17/10/2019, 18:05, "Rich Persaud" <pers...@gmail.com> wrote:
> 
> On Oct 17, 2019, at 12:55, Stefano Stabellini <sstabell...@kernel.org> wrote:
> > 
> > On Thu, 17 Oct 2019, Rich Persaud wrote:
> >>> On Oct 17, 2019, at 12:32, Stefano Stabellini <sstabell...@kernel.org> wrote:
> >>> 
> >>> On Thu, 17 Oct 2019, Lars Kurth wrote:
>  On 16/10/2019, 17:35, "Rich Persaud" <pers...@gmail.com> wrote:
>  
> >> On Oct 15, 2019, at 08:27, Lars Kurth <lars.kurth@gmail.com> wrote:
> > ...
> > 
> > My point really was that, due to storing the files in git, we
> > essentially do NOT do this today.
> > So we would need to take extra action, e.g. manually or through tooling
> > 
> >>> 4.2: We could require individual authors to be credited: in that
> >>> case we probably ought to lead by example and list the authors
> >>> in a credit/license section and extract the information from
> >>> git logs when we generate it (at some point in the future)
> >>> 5: You give an indication whether you made changes ... in practice
> >>> this means you have to state significant changes made to the works
> >> 
> >> This is also helpful for provenance of changes, which is relevant in
> >> safety-oriented documentation.  It can be used to clearly delineate
> >> CC-licensed content (which may be reused by many companies) from "All
> >> Rights Reserved" commercial content that may be added for a specific
> >> commercial audience or purpose.
> > 
> > I agree
> > 
> > I think the outcome of this analysis is really that the only significant
> > difference between BSD and CC-BY in this context is the "All Rights
> > Reserved" portion
>  
>    Also - BSD is a "software" license while CC-BY is a "content" license,
>    so they are not strictly comparable, even if they use similar terminology.
>  
>  True, but as we have noticed the boundary between content and in-code docs
>  content is fuzzy.
>  
> >> There is a difference between "software" which "runs on machines" and
> >> "documentation" which "runs on humans".  Combined software (e.g. BSD
> >> code from two origins) is executed identically, despite origin.  Humans
> >> make value judgements based on the author/origin of content, hence the
> >> focus on attribution.  Yes, there is a provenance graph in git
> >> (software/data), but that's not typically visible to human readers,
> >> except as a generated report, i.e. documentation.
> > 
> > Yes true. But also true for CC-BY-4 sources stored in git unless extra
> > action is taken
> > 
> > But my point is:
> > * If we take extra action as e.g. proposed in 4.2 we can apply this
> > uniformly to BSD as well as CC-BY pages
> > * We can add a section on re-use as proposed in 4.2 which recommends best
> > practices around 5.
> > * We can highlight sections that are BSD vs CC-BY in such a section, so
> > that someone who has an issue can remove these easily
> > 
> > In addition to these points: maybe it is too impractical to create ABI
> > documentation based on CC-BY-4 (given that a lot of what we need is
> > already in BSD sources).
> > We could just copy some of the content in the BSD sources to new CC-BY-4
> > sources, but in practice it would just be hiding the potential legal
> > issues behind it.
> > Someone could contest the creation and argue that portions of the now
> > CC-BY-4 sources are in fact BSD: in practice this is extremely unlikely,
> > but it is possible.
> > 
> >>> As such, BSD-2/3-Clause in our context works similarly to CC-BY-4
> >>> from a downstream's perspective. In fact CC-BY-4 is somewhat stricter
> >> 
> >> If we don't want the incentives and provenance properties of CC-BY,
> >> there is the option of CC0, which is the equivalent of public domain.
> >> This would delegate the task of separating commercial vs CC content to
> >> each reader, without any license-required attribution or separation.
> >> 
> >> Some background on licenses designed for documentation, which has
> >> different legal requirements than software:
> >> 
> >> https://www.dreamsongs.com/IHE/IHE-50.html
> >> https://creativecommons.org/faq/#what-are-creative-commons-licenses (not
> >> for s/w)
> > 
> > I will have a look. But the core issue - which is why I have proposed
> > what I have - is the question on how practically

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-17 Thread Lars Kurth


On 17/10/2019, 18:05, "Rich Persaud"  wrote:

On Oct 17, 2019, at 12:55, Stefano Stabellini wrote:
> 
> On Thu, 17 Oct 2019, Rich Persaud wrote:
>>> On Oct 17, 2019, at 12:32, Stefano Stabellini wrote:
>>> 
>>> On Thu, 17 Oct 2019, Lars Kurth wrote:
 On 16/10/2019, 17:35, "Rich Persaud" wrote:
 
>> On Oct 15, 2019, at 08:27, Lars Kurth wrote:
> ...
> 
> My point really was that, due to storing the files in git, we
> essentially do NOT do this today.
> So we would need to take extra action, e.g. manually or through tooling
> 
>>> 4.2: We could require individual authors to be credited: in that
>>> case we probably ought to lead by example and list the 
authors
>>> in a credit/license section and extract the information from
>>> git logs when we generate it (at some point in the future)
>>> 5: You give an indication whether you made changes ... in practice
>>> this means you have to state significant changes made to the works
>> 
>> This is also helpful for provenance of changes, which is relevant in 
safety-oriented documentation.  It can be used to clearly delineate CC-licensed 
content (which may be reused by many companies) from "All Rights Reserved" 
commercial content that may be added for a specific commercial audience or 
purpose.
> 
> I agree
> 
> I think the outcome of this analysis is really that the only 
significant difference between BSD and CC-BY in this context is the  "All 
Rights Reserved" portion
 
   Also - BSD is a "software" license while CC-BY is a "content" 
license, so they are not strictly comparable, even if they use similar 
terminology.
 
 True, but as we have noticed the boundary between content and in-code 
docs content is fuzzy.
 
>> There is a difference between "software" which "runs on machines" 
and "documentation" which "runs on humans".  Combined software (e.g. BSD code 
from two origins) is executed identically, despite origin.  Humans make value 
judgements based on the author/origin of content, hence the focus on 
attribution.  Yes, there is a provenance graph in git (software/data), but 
that's not typically visible to human readers, except as a generated report, 
i.e. documentation.
> 
> Yes true. But also true for CC-BY-4 sources stored in git unless 
extra action is taken 
> 
> But my point is: 
> * If we take extra action as e.g. proposed in 4.2 we can apply this 
uniformly to BSD as well as CC-BY pages
> * We can add a section on re-use as proposed in 4.2 which recommends 
best practices around 5.  
> * We can highlight sections that are BSD vs CC-BY in such a section, so
> that someone who has an issue can remove these easily
> 
> In addition to these points: maybe it is too impractical to create 
ABI documentation based on CC-BY-4 (given that a lot of what we need is already 
in BSD sources). 
> We could just copy some of the content in the BSD sources to new 
CC-BY-4 sources, but in practice it would just be hiding the potential legal 
issues behind it. 
> Someone could contest the creation and argue that portions of the now 
CC-BY-4 sources are in fact BSD: in practice this is extremely unlikely, but it 
is possible.
> 
>>> As such, BSD-2/3-Clause in our context works similarly to CC-BY-4
>>> from a downstream's perspective. In fact CC-BY-4 is somewhat 
stricter
>> 
>> If we don't want the incentives and provenance properties of CC-BY, 
there is the option of CC0, which is the equivalent of public domain.  This 
would delegate the task of separating commercial vs CC content to each reader, 
without any license-required attribution or separation.
>> 
>> Some background on licenses designed for documentation, which has 
different legal requirements than software:
>> 
>> https://www.dreamsongs.com/IHE/IHE-50.html
>> https://creativecommons.org/faq/#what-are-creative-commons-licenses 
(not for s/w)
> 
> I will have a look. But the core issue - which is why I have proposed
> what I have - is the question of how, in practice, ABI documentation can
> be published under CC-BY-4, when much of this information has already been
> published in the past as code under BSD.
 
   Is there a reference sample of:
 
   - previously published, BSD-licensed, ABI 
specification-as-source-code
 
 All of http://xenbits.xen.org/docs/unstable/hypercall
 And some can be content rich as seen in 
http://xenbits.xen.org/docs/unstable/hypercall/arm/include,public,xen.h.html#Func_HYPERVISOR_mmu_update
 
   - the corresponding FuSA ABI documentation for that source file
 
 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-17 Thread Rich Persaud
On Oct 17, 2019, at 12:55, Stefano Stabellini  wrote:
> 
> On Thu, 17 Oct 2019, Rich Persaud wrote:
>>> On Oct 17, 2019, at 12:32, Stefano Stabellini  
>>> wrote:
>>> 
>>> On Thu, 17 Oct 2019, Lars Kurth wrote:
 On 16/10/2019, 17:35, "Rich Persaud"  wrote:
 
>> On Oct 15, 2019, at 08:27, Lars Kurth  wrote:
> ...
> 
> My point really was that, due to storing the files in git, we
> essentially do NOT do this today.
> So we would need to take extra action, e.g. manually or through tooling
> 
>>> 4.2: We could require individual authors to be credited: in that
>>> case we probably ought to lead by example and list the authors
>>> in a credit/license section and extract the information from
>>> git logs when we generate it (at some point in the future)
>>> 5: You give an indication whether you made changes ... in practice
>>> this means you have to state significant changes made to the works
>> 
>> This is also helpful for provenance of changes, which is relevant in 
>> safety-oriented documentation.  It can be used to clearly delineate 
>> CC-licensed content (which may be reused by many companies) from "All 
>> Rights Reserved" commercial content that may be added for a specific 
>> commercial audience or purpose.
> 
> I agree
> 
> I think the outcome of this analysis is really that the only significant 
> difference between BSD and CC-BY in this context is the  "All Rights 
> Reserved" portion
 
   Also - BSD is a "software" license while CC-BY is a "content" license, 
 so they are not strictly comparable, even if they use similar terminology.
 
 True, but as we have noticed the boundary between content and in-code docs 
 content is fuzzy.
 
>> There is a difference between "software" which "runs on machines" and 
>> "documentation" which "runs on humans".  Combined software (e.g. BSD 
>> code from two origins) is executed identically, despite origin.  Humans 
>> make value judgements based on the author/origin of content, hence the 
>> focus on attribution.  Yes, there is a provenance graph in git 
>> (software/data), but that's not typically visible to human readers, 
>> except as a generated report, i.e. documentation.
> 
> Yes true. But also true for CC-BY-4 sources stored in git unless extra 
> action is taken 
> 
> But my point is: 
> * If we take extra action as e.g. proposed in 4.2 we can apply this 
> uniformly to BSD as well as CC-BY pages
> * We can add a section on re-use as proposed in 4.2 which recommends best 
> practices around 5.  
> * We can highlight sections that are BSD vs CC-BY in such a section, so
> that someone who has an issue can remove these easily
> 
> In addition to these points: maybe it is too impractical to create ABI 
> documentation based on CC-BY-4 (given that a lot of what we need is 
> already in BSD sources). 
> We could just copy some of the content in the BSD sources to new CC-BY-4 
> sources, but in practice it would just be hiding the potential legal 
> issues behind it. 
> Someone could contest the creation and argue that portions of the now 
> CC-BY-4 sources are in fact BSD: in practice this is extremely unlikely, 
> but it is possible.
> 
>>> As such, BSD-2/3-Clause in our context works similarly to CC-BY-4
>>> from a downstream's perspective. In fact CC-BY-4 is somewhat stricter
>> 
>> If we don't want the incentives and provenance properties of CC-BY, 
>> there is the option of CC0, which is the equivalent of public domain.  
>> This would delegate the task of separating commercial vs CC content to 
>> each reader, without any license-required attribution or separation.
>> 
>> Some background on licenses designed for documentation, which has 
>> different legal requirements than software:
>> 
>> https://www.dreamsongs.com/IHE/IHE-50.html
>> https://creativecommons.org/faq/#what-are-creative-commons-licenses (not 
>> for s/w)
> 
> I will have a look. But the core issue - which is why I have proposed
> what I have - is the question of how, in practice, ABI documentation can
> be published under CC-BY-4, when much of this information has already been
> published in the past as code under BSD.
 
   Is there a reference sample of:
 
   - previously published, BSD-licensed, ABI specification-as-source-code
 
 All of http://xenbits.xen.org/docs/unstable/hypercall
 And some can be content rich as seen in 
 http://xenbits.xen.org/docs/unstable/hypercall/arm/include,public,xen.h.html#Func_HYPERVISOR_mmu_update
 
   - the corresponding FuSA ABI documentation for that source file
 
 We do NOT have ANY FuSA documentation at this stage. And there are NO 
 examples of such docs 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-17 Thread Stefano Stabellini
On Thu, 17 Oct 2019, Rich Persaud wrote:
> On Oct 17, 2019, at 12:32, Stefano Stabellini  wrote:
> > 
> > On Thu, 17 Oct 2019, Lars Kurth wrote:
> >> On 16/10/2019, 17:35, "Rich Persaud"  wrote:
> >> 
>  On Oct 15, 2019, at 08:27, Lars Kurth  wrote:
> >>> ...
> >>> 
> >>> My point really was that, due to storing the files in git, we
> >>> essentially do NOT do this today.
> >>> So we would need to take extra action, e.g. manually or through tooling
> >>> 
> >  4.2: We could require individual authors to be credited: in that
> >  case we probably ought to lead by example and list the authors
> >  in a credit/license section and extract the information from
> >  git logs when we generate it (at some point in the future)
> > 5: You give an indication whether you made changes ... in practice
> > this means you have to state significant changes made to the works
>  
>  This is also helpful for provenance of changes, which is relevant in 
>  safety-oriented documentation.  It can be used to clearly delineate 
>  CC-licensed content (which may be reused by many companies) from "All 
>  Rights Reserved" commercial content that may be added for a specific 
>  commercial audience or purpose.
> >>> 
> >>> I agree
> >>> 
> >>> I think the outcome of this analysis is really that the only significant 
> >>> difference between BSD and CC-BY in this context is the  "All Rights 
> >>> Reserved" portion
> >> 
> >>Also - BSD is a "software" license while CC-BY is a "content" license, 
> >> so they are not strictly comparable, even if they use similar terminology.
> >> 
> >> True, but as we have noticed the boundary between content and in-code docs 
> >> content is fuzzy.
> >> 
>  There is a difference between "software" which "runs on machines" and 
>  "documentation" which "runs on humans".  Combined software (e.g. BSD 
>  code from two origins) is executed identically, despite origin.  Humans 
>  make value judgements based on the author/origin of content, hence the 
>  focus on attribution.  Yes, there is a provenance graph in git 
>  (software/data), but that's not typically visible to human readers, 
>  except as a generated report, i.e. documentation.
> >>> 
> >>> Yes true. But also true for CC-BY-4 sources stored in git unless extra 
> >>> action is taken 
> >>> 
> >>> But my point is: 
> >>> * If we take extra action as e.g. proposed in 4.2 we can apply this 
> >>> uniformly to BSD as well as CC-BY pages
> >>> * We can add a section on re-use as proposed in 4.2 which recommends best 
> >>> practices around 5.  
> >>> * We can highlight sections that are BSD vs CC-BY in such a section, so
> >>> that someone who has an issue can remove these easily
> >>> 
> >>> In addition to these points: maybe it is too impractical to create ABI 
> >>> documentation based on CC-BY-4 (given that a lot of what we need is 
> >>> already in BSD sources). 
> >>> We could just copy some of the content in the BSD sources to new CC-BY-4 
> >>> sources, but in practice it would just be hiding the potential legal 
> >>> issues behind it. 
> >>> Someone could contest the creation and argue that portions of the now 
> >>> CC-BY-4 sources are in fact BSD: in practice this is extremely unlikely, 
> >>> but it is possible.
> >>> 
> > As such, BSD-2/3-Clause in our context works similarly to CC-BY-4
> > from a downstream's perspective. In fact CC-BY-4 is somewhat stricter
>  
>  If we don't want the incentives and provenance properties of CC-BY, 
>  there is the option of CC0, which is the equivalent of public domain.  
>  This would delegate the task of separating commercial vs CC content to 
>  each reader, without any license-required attribution or separation.
>  
>  Some background on licenses designed for documentation, which has 
>  different legal requirements than software:
>  
>  https://www.dreamsongs.com/IHE/IHE-50.html
>  https://creativecommons.org/faq/#what-are-creative-commons-licenses (not 
>  for s/w)
> >>> 
> >>> I will have a look. But the core issue - which is why I have proposed
> >>> what I have - is the question of how, in practice, ABI documentation can
> >>> be published under CC-BY-4, when much of this information has already been
> >>> published in the past as code under BSD.
> >> 
> >>Is there a reference sample of:
> >> 
> >>- previously published, BSD-licensed, ABI specification-as-source-code
> >> 
> >> All of http://xenbits.xen.org/docs/unstable/hypercall
> >> And some can be content rich as seen in 
> >> http://xenbits.xen.org/docs/unstable/hypercall/arm/include,public,xen.h.html#Func_HYPERVISOR_mmu_update
> >> 
> >>- the corresponding FuSA ABI documentation for that source file
> >> 
> >> We do NOT have ANY FuSA documentation at this stage. And there are NO 
> >> examples of such docs in the public domain
> >> I am waiting for a sanitised 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-17 Thread Rich Persaud
On Oct 17, 2019, at 12:32, Stefano Stabellini  wrote:
> 
> On Thu, 17 Oct 2019, Lars Kurth wrote:
>> On 16/10/2019, 17:35, "Rich Persaud"  wrote:
>> 
 On Oct 15, 2019, at 08:27, Lars Kurth  wrote:
>>> ...
>>> 
>>> My point really was that, due to storing the files in git, we essentially
>>> do NOT do this today.
>>> So we would need to take extra action, e.g. manually or through tooling
>>> 
>  4.2: We could require individual authors to be credited: in that
>  case we probably ought to lead by example and list the authors
>  in a credit/license section and extract the information from
>  git logs when we generate it (at some point in the future)
> 5: You give an indication whether you made changes ... in practice
> this means you have to state significant changes made to the works
 
 This is also helpful for provenance of changes, which is relevant in 
 safety-oriented documentation.  It can be used to clearly delineate 
 CC-licensed content (which may be reused by many companies) from "All 
 Rights Reserved" commercial content that may be added for a specific 
 commercial audience or purpose.
>>> 
>>> I agree
>>> 
>>> I think the outcome of this analysis is really that the only significant 
>>> difference between BSD and CC-BY in this context is the  "All Rights 
>>> Reserved" portion
>> 
>>Also - BSD is a "software" license while CC-BY is a "content" license, so 
>> they are not strictly comparable, even if they use similar terminology.
>> 
>> True, but as we have noticed the boundary between content and in-code docs 
>> content is fuzzy.
>> 
 There is a difference between "software" which "runs on machines" and 
 "documentation" which "runs on humans".  Combined software (e.g. BSD code 
 from two origins) is executed identically, despite origin.  Humans make 
 value judgements based on the author/origin of content, hence the focus on 
 attribution.  Yes, there is a provenance graph in git (software/data), but 
 that's not typically visible to human readers, except as a generated 
 report, i.e. documentation.
>>> 
>>> Yes true. But also true for CC-BY-4 sources stored in git unless extra 
>>> action is taken 
>>> 
>>> But my point is: 
>>> * If we take extra action as e.g. proposed in 4.2 we can apply this 
>>> uniformly to BSD as well as CC-BY pages
>>> * We can add a section on re-use as proposed in 4.2 which recommends best 
>>> practices around 5.  
>>> * We can highlight sections that are BSD vs CC-BY in such a section, so
>>> that someone who has an issue can remove these easily
>>> 
>>> In addition to these points: maybe it is too impractical to create ABI 
>>> documentation based on CC-BY-4 (given that a lot of what we need is already 
>>> in BSD sources). 
>>> We could just copy some of the content in the BSD sources to new CC-BY-4 
>>> sources, but in practice it would just be hiding the potential legal issues 
>>> behind it. 
>>> Someone could contest the creation and argue that portions of the now 
>>> CC-BY-4 sources are in fact BSD: in practice this is extremely unlikely, 
>>> but it is possible.
>>> 
> As such, BSD-2/3-Clause in our context works similarly to CC-BY-4
> from a downstream's perspective. In fact CC-BY-4 is somewhat stricter
 
 If we don't want the incentives and provenance properties of CC-BY, there 
 is the option of CC0, which is the equivalent of public domain.  This 
 would delegate the task of separating commercial vs CC content to each 
 reader, without any license-required attribution or separation.
 
 Some background on licenses designed for documentation, which has 
 different legal requirements than software:
 
 https://www.dreamsongs.com/IHE/IHE-50.html
 https://creativecommons.org/faq/#what-are-creative-commons-licenses (not 
 for s/w)
>>> 
>>> I will have a look. But the core issue - which is why I have proposed what
>>> I have - is the question of how, in practice, ABI documentation can be
>>> published under CC-BY-4, when much of this information has already been
>>> published in the past as code under BSD.
>> 
>>Is there a reference sample of:
>> 
>>- previously published, BSD-licensed, ABI specification-as-source-code
>> 
>> All of http://xenbits.xen.org/docs/unstable/hypercall
>> And some can be content rich as seen in 
>> http://xenbits.xen.org/docs/unstable/hypercall/arm/include,public,xen.h.html#Func_HYPERVISOR_mmu_update
>> 
>>- the corresponding FuSA ABI documentation for that source file
>> 
>> We do NOT have ANY FuSA documentation at this stage. And there are NO 
>> examples of such docs in the public domain
>> I am waiting for a sanitised smallish system software example to be made 
>> available, which should help us identify the practical implications 
>> However, ABI documentation would be part of it
>> 
>>If there is almost a 1:1 correspondence between ABI "docs" and 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-17 Thread Stefano Stabellini
On Thu, 17 Oct 2019, Lars Kurth wrote:
> On 16/10/2019, 17:35, "Rich Persaud"  wrote:
> 
> > On Oct 15, 2019, at 08:27, Lars Kurth  wrote:
> ...
> > 
> > My point really was that, due to storing the files in git, we
> > essentially do NOT do this today.
> > So we would need to take extra action, e.g. manually or through tooling
> > 
> >>>   4.2: We could require individual authors to be credited: in that
> >>>   case we probably ought to lead by example and list the 
> authors
> >>>   in a credit/license section and extract the information from
> >>>   git logs when we generate it (at some point in the future)
> >>> 5: You give an indication whether you made changes ... in practice
> >>> this means you have to state significant changes made to the works
> >> 
> >> This is also helpful for provenance of changes, which is relevant in 
> safety-oriented documentation.  It can be used to clearly delineate 
> CC-licensed content (which may be reused by many companies) from "All Rights 
> Reserved" commercial content that may be added for a specific commercial 
> audience or purpose.
> > 
> > I agree
> > 
> > I think the outcome of this analysis is really that the only 
> significant difference between BSD and CC-BY in this context is the  "All 
> Rights Reserved" portion
> 
> Also - BSD is a "software" license while CC-BY is a "content" license, so 
> they are not strictly comparable, even if they use similar terminology.
> 
> True, but as we have noticed the boundary between content and in-code docs 
> content is fuzzy.
> 
> >> There is a difference between "software" which "runs on machines" and 
> "documentation" which "runs on humans".  Combined software (e.g. BSD code 
> from two origins) is executed identically, despite origin.  Humans make value 
> judgements based on the author/origin of content, hence the focus on 
> attribution.  Yes, there is a provenance graph in git (software/data), but 
> that's not typically visible to human readers, except as a generated report, 
> i.e. documentation.
> > 
> > Yes true. But also true for CC-BY-4 sources stored in git unless extra 
> action is taken 
> > 
> > But my point is: 
> > * If we take extra action as e.g. proposed in 4.2 we can apply this 
> uniformly to BSD as well as CC-BY pages
> > * We can add a section on re-use as proposed in 4.2 which recommends 
> best practices around 5.  
> > * We can highlight sections that are BSD vs CC-BY in such a section,
> > so that someone who has an issue can remove these easily
> > 
> > In addition to these points: maybe it is too impractical to create ABI 
> documentation based on CC-BY-4 (given that a lot of what we need is already 
> in BSD sources). 
> > We could just copy some of the content in the BSD sources to new 
> CC-BY-4 sources, but in practice it would just be hiding the potential legal 
> issues behind it. 
> > Someone could contest the creation and argue that portions of the now 
> CC-BY-4 sources are in fact BSD: in practice this is extremely unlikely, but 
> it is possible.
> > 
> >>> As such, BSD-2/3-Clause in our context works similarly to CC-BY-4
> >>> from a downstream's perspective. In fact CC-BY-4 is somewhat stricter
> >> 
> >> If we don't want the incentives and provenance properties of CC-BY, 
> there is the option of CC0, which is the equivalent of public domain.  This 
> would delegate the task of separating commercial vs CC content to each 
> reader, without any license-required attribution or separation.
> >> 
> >> Some background on licenses designed for documentation, which has 
> different legal requirements than software:
> >> 
> >> https://www.dreamsongs.com/IHE/IHE-50.html
> >> https://creativecommons.org/faq/#what-are-creative-commons-licenses 
> (not for s/w)
> > 
> > I will have a look. But the core issue - which is why I have proposed 
> what I have - is the question of how, practically, ABI documentation can be published 
> under CC-BY-4, when much of this information has already been published in 
> the past as code under BSD.
> 
> Is there a reference sample of:
> 
> - previously published, BSD-licensed, ABI specification-as-source-code
> 
> All of http://xenbits.xen.org/docs/unstable/hypercall
> And some can be content rich as seen in 
> http://xenbits.xen.org/docs/unstable/hypercall/arm/include,public,xen.h.html#Func_HYPERVISOR_mmu_update
>  
> - the corresponding FuSA ABI documentation for that source file
> 
> We do NOT have ANY FuSA documentation at this stage. And there are NO 
> examples of such docs in the public domain
> I am waiting for a sanitised smallish system software example to be made 
> available, which should help us identify the practical implications 
> However, ABI documentation would be part of it
> 
> If there is almost a 1:1 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-17 Thread Lars Kurth


On 16/10/2019, 17:35, "Rich Persaud"  wrote:

> On Oct 15, 2019, at 08:27, Lars Kurth  wrote:
...
> 
> My point really was that due to storing the files in git, we 
essentially do NOT today do this.
> So we would need to take extra action: e.g. manually or through tooling
> 
>>>   4.2: We could require individual authors to be credited: in that
>>>   case we probably ought to lead by example and list the authors
>>>   in a credit/license section and extract the information from
>>>   git logs when we generate it (at some point in the future)
>>> 5: You give an indication whether you made changes ... in practice
>>> this means you have to state significant changes made to the works
>> 
>> This is also helpful for provenance of changes, which is relevant in 
safety-oriented documentation.  It can be used to clearly delineate CC-licensed 
content (which may be reused by many companies) from "All Rights Reserved" 
commercial content that may be added for a specific commercial audience or 
purpose.
> 
> I agree
> 
> I think the outcome of this analysis is really that the only significant 
difference between BSD and CC-BY in this context is the  "All Rights Reserved" 
portion

Also - BSD is a "software" license while CC-BY is a "content" license, so 
they are not strictly comparable, even if they use similar terminology.

True, but as we have noticed the boundary between content and in-code docs 
content is fuzzy.

>> There is a difference between "software" which "runs on machines" and 
"documentation" which "runs on humans".  Combined software (e.g. BSD code from 
two origins) is executed identically, despite origin.  Humans make value 
judgements based on the author/origin of content, hence the focus on 
attribution.  Yes, there is a provenance graph in git (software/data), but 
that's not typically visible to human readers, except as a generated report, 
i.e. documentation.
> 
> Yes true. But also true for CC-BY-4 sources stored in git unless extra 
action is taken 
> 
> But my point is: 
> * If we take extra action as e.g. proposed in 4.2 we can apply this 
uniformly to BSD as well as CC-BY pages
> * We can add a section on re-use as proposed in 4.2 which recommends best 
practices around 5.  
> * We can highlight sections that are BSD vs CC-BY in such a section, such 
that someone who has an issue can remove these easily
> 
> In addition to these points: maybe it is too impractical to create ABI 
documentation based on CC-BY-4 (given that a lot of what we need is already in 
BSD sources). 
> We could just copy some of the content in the BSD sources to new CC-BY-4 
sources, but in practice it would just be hiding the potential legal issues 
behind it. 
> Someone could contest the creation and argue that portions of the now 
CC-BY-4 sources are in fact BSD: in practice this is extremely unlikely, but it 
is possible.
> 
>>> As such, BSD-2/3-Clause in our context works similarly to CC-BY-4
>>> from a downstream's perspective. In fact CC-BY-4 is somewhat stricter
>> 
>> If we don't want the incentives and provenance properties of CC-BY, 
there is the option of CC0, which is the equivalent of public domain.  This 
would delegate the task of separating commercial vs CC content to each reader, 
without any license-required attribution or separation.
>> 
>> Some background on licenses designed for documentation, which has 
different legal requirements than software:
>> 
>> https://www.dreamsongs.com/IHE/IHE-50.html
>> https://creativecommons.org/faq/#what-are-creative-commons-licenses (not 
for s/w)
> 
> I will have a look. But the core issue - which is why I have proposed 
what I have - is the question of how, practically, ABI documentation can be published 
under CC-BY-4, when much of this information has already been published in the 
past as code under BSD.

Is there a reference sample of:

- previously published, BSD-licensed, ABI specification-as-source-code

All of http://xenbits.xen.org/docs/unstable/hypercall
And some can be content rich as seen in 
http://xenbits.xen.org/docs/unstable/hypercall/arm/include,public,xen.h.html#Func_HYPERVISOR_mmu_update
 
- the corresponding FuSA ABI documentation for that source file

We do NOT have ANY FuSA documentation at this stage. And there are NO examples 
of such docs in the public domain
I am waiting for a sanitised smallish system software example to be made 
available, which should help us identify the practical implications 
However, ABI documentation would be part of it

If there is almost a 1:1 correspondence between ABI "docs" and "code", 
could the necessary FuSA annotations become part of the source code file, e.g. 
comments or tags?  Or is there a requirement for the ABI documentation to have 
a specific layout in a printable report?
   

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-16 Thread Rich Persaud
> On Oct 15, 2019, at 08:27, Lars Kurth  wrote:
> Hi Rich,
> 
>> On 15 Oct 2019, at 02:58, Rich Persaud  wrote:
>>> On Oct 11, 2019, at 07:11, Lars Kurth  wrote:
>>> On 11/10/2019, 02:24, "Stefano Stabellini"  wrote:
>>>> On Thu, 10 Oct 2019, Lars Kurth wrote:
>>>> @Stefano: as you and I believe Brian will be spending time on improving the
>>>> ABI docs, I think we need to build some agreement here on what/how
>>>> to do it. I was assuming that generally the consensus was to have
>>>> docs close to the code in source, but this does not seem to be the case.
>>>> But if we do have stuff separately, ideally we would have a tool that helps
>>>> point people editing headers to also look at the relevant docs. Otherwise
>>>> it will be hard to keep them in sync.
>>>   In general, it is a good idea to keep the docs close to the code to make
>>>   it easier to keep them up to date. But there is no one-size-fits-all
>>>   here. For public ABI descriptions, I agree with Andrew that ideally they
>>>   should not be defined as C header files.
>>>   But it is not an issue: any work that we do here won't be wasted. For
>>>   instance, we could start by adding more comments to the current header
>>>   files. Then, as a second step, take all the comments and turn them into
>>>   a proper ABI description document without any C function declarations.
>>>   It is easy to move English text around, as long as the license allows it
>>>   -- that is the only potential blocker I can see.
>>> This is likely to be problematic. First of all, we are talking about 
>>> BSD-3-Clause
>>> or BSD-2-Clause code (the latter is more dominant in headers I believe) in
>>> all known cases.
>>> The main properties of the BSD are
>>> 1: Can be pretty much used anywhere for any purpose
>>> 2: Can be modified for any purpose
>>> 3: But the original license header must be retained in derivatives
>> 
>> This is equivalent to attribution of the copyright owner of the originally 
>> created file.
>> 
>>> Does *not* have requirements around attribution as CC-BY-4: however,
>>> as we store everything in git attribution is handled by us by default
>> 
>> See above, the license header attributes copyright, since BSD was created 
>> for "software" and people who work on "software" would typically be looking 
>> at source code, hence the primary attribution takes place there, with 
>> secondary attribution in EULAs, "About" panels, etc.
>> 
>>> CC-BY-4 also has properties 1-3
>>> In addition: it does require that
>>> 4: Derived works are giving appropriate credit to authors
>>>   We could clarify in a COPYING how we prefer to do this
>>>   4.1: We could say that "referring to the Xen Project community"
>>>   is sufficient to comply with the attribution clause
>> 
>> One motivation for CC-BY (with attribution) is to create an incentive 
>> (credit) for the creation of documentation, which is not commonly a favorite 
>> pastime of developers.   Credit typically goes at least to the original 
>> author of a section of documentation, with varying ways of crediting 
>> subsequent contributors.  The documentation can be structured to make 
>> crediting easier.  The mechanism for crediting can be designed to encourage 
>> specific outcomes, along our projected doc lifecycle for safety 
>> certification, contributors, evaluators and commercial investors.
> 
> My point really was that due to storing the files in git, we essentially 
> do NOT today do this.
> So we would need to take extra action: e.g. manually or through tooling
> 
>>>   4.2: We could require individual authors to be credited: in that
>>>   case we probably ought to lead by example and list the authors
>>>   in a credit/license section and extract the information from
>>>   git logs when we generate it (at some point in the future)
>>> 5: You give an indication whether you made changes ... in practice
>>> this means you have to state significant changes made to the works
>> 
>> This is also helpful for provenance of changes, which is relevant in 
>> safety-oriented documentation.  It can be used to clearly delineate 
>> CC-licensed content (which may be reused by many companies) from "All Rights 
>> Reserved" commercial content that may be added for a specific commercial 
>> audience or purpose.
> 
> I agree
> 
> I think the outcome of this analysis is really that the only significant 
> difference between BSD and CC-BY in this context is the  "All Rights 
> Reserved" portion

Also - BSD is a "software" license while CC-BY is a "content" license, so they 
are not strictly comparable, even if they use similar terminology.

>> There is a difference between "software" which "runs on machines" and 
>> "documentation" which "runs on humans".  Combined software (e.g. BSD code 
>> from two origins) is executed identically, despite origin.  Humans make 
>> value judgements based on the author/origin of content, hence the focus on 
>> attribution.  Yes, there is 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-15 Thread Lars Kurth
Hi Rich,

> On 15 Oct 2019, at 02:58, Rich Persaud  wrote:
> 
>> On Oct 11, 2019, at 07:11, Lars Kurth  wrote:
>> 
>> On 11/10/2019, 02:24, "Stefano Stabellini"  wrote:
>> 
>>>On Thu, 10 Oct 2019, Lars Kurth wrote:
>>> 
>>> @Stefano: as you and I believe Brian will be spending time on improving the
>>> ABI docs, I think we need to build some agreement here on what/how
>>> to do it. I was assuming that generally the consensus was to have
>>> docs close to the code in source, but this does not seem to be the case.
>>> 
>>> But if we do have stuff separately, ideally we would have a tool that helps
>>> point people editing headers to also look at the relevant docs. Otherwise 
>>> it will
>>> be hard to keep them in sync.
>> 
>>In general, it is a good idea to keep the docs close to the code to make
>>it easier to keep them up to date. But there is no one-size-fits-all
>>here. For public ABI descriptions, I agree with Andrew that ideally they
>>should not be defined as C header files.
>> 
>>But it is not an issue: any work that we do here won't be wasted. For
>>instance, we could start by adding more comments to the current header
>>files. Then, as a second step, take all the comments and turn them into
>>a proper ABI description document without any C function declarations.
>>It is easy to move English text around, as long as the license allows it
>>-- that is the only potential blocker I can see.
>> 
>> This is likely to be problematic. First of all, we are talking about 
>> BSD-3-Clause
>> or BSD-2-Clause code (the latter is more dominant in headers I believe) in
>> all known cases.
>> 
>> The main properties of the BSD are
>> 1: Can be pretty much used anywhere for any purpose
>> 2: Can be modified for any purpose 
>> 3: But the original license header must be retained in derivatives
> 
> This is equivalent to attribution of the copyright owner of the originally 
> created file.
> 
>> Does *not* have requirements around attribution as CC-BY-4: however,
>> as we store everything in git attribution is handled by us by default
> 
> See above, the license header attributes copyright, since BSD was created for 
> "software" and people who work on "software" would typically be looking at 
> source code, hence the primary attribution takes place there, with secondary 
> attribution in EULAs, "About" panels, etc.
> 
>> CC-BY-4 also has properties 1-3
>> In addition: it does require that 
>> 4: Derived works are giving appropriate credit to authors 
>>We could clarify in a COPYING how we prefer to do this
>>4.1: We could say that "referring to the Xen Project community" 
>>is sufficient to comply with the attribution clause
> 
> One motivation for CC-BY (with attribution) is to create an incentive 
> (credit) for the creation of documentation, which is not commonly a favorite 
> pastime of developers.   Credit typically goes at least to the original 
> author of a section of documentation, with varying ways of crediting 
> subsequent contributors.  The documentation can be structured to make 
> crediting easier.  The mechanism for crediting can be designed to encourage 
> specific outcomes, along our projected doc lifecycle for safety 
> certification, contributors, evaluators and commercial investors.

My point really was that due to storing the files in git, we essentially do 
NOT today do this.
So we would need to take extra action: e.g. manually or through tooling

>>4.2: We could require individual authors to be credited: in that
>>case we probably ought to lead by example and list the authors
>>in a credit/license section and extract the information from
>>git logs when we generate it (at some point in the future)
>> 5: You give an indication whether you made changes ... in practice
>> this means you have to state significant changes made to the works
> 
> This is also helpful for provenance of changes, which is relevant in 
> safety-oriented documentation.  It can be used to clearly delineate 
> CC-licensed content (which may be reused by many companies) from "All Rights 
> Reserved" commercial content that may be added for a specific commercial 
> audience or purpose.

I agree

I think the outcome of this analysis is really that the only significant 
difference between BSD and CC-BY in this context is the  "All Rights Reserved" 
portion

> There is a difference between "software" which "runs on machines" and 
> "documentation" which "runs on humans".  Combined software (e.g. BSD code 
> from two origins) is executed identically, despite origin.  Humans make value 
> judgements based on the author/origin of content, hence the focus on 
> attribution.  Yes, there is a provenance graph in git (software/data), but 
> that's not typically visible to human readers, except as a generated report, 
> i.e. documentation.

Yes true. But also true for CC-BY-4 sources stored in git unless extra action 
is taken 


Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-14 Thread Rich Persaud
> On Oct 11, 2019, at 07:11, Lars Kurth  wrote:
> 
> On 11/10/2019, 02:24, "Stefano Stabellini"  wrote:
> 
>>On Thu, 10 Oct 2019, Lars Kurth wrote:
>> * Would we ever include API docs generated from GPLv2 code? E.g. for safety 
>> use-cases?
>> @Stefano, @Artem: I guess this one is for you. 
>> I suppose if we would have a similar issue for a safety manual
>> I am also assuming we would want to use sphinx docs and rst to generate a 
>> future safety manual
> 
>Hi Lars,
> 
>Thanks for putting this email together.
> 
>In terms of formats, I don't have a preference between rst and pandoc,
>but if we are going to use rst going forward, I'd say to try to use rst
>for everything, including converting all the old stuff. The fewer
>different formats, the better.
> 
> I think the proposal that needs to follow on from this (which would at some
> point need to be voted on) would then be to go for rst. 
> 
>As I mentioned during the FuSa call, I agree with you, Andrew, and
>others that it would be best to have the docs under a CC license. I do
>expect that we'll end up copy/pasting snippets of in-code comments into
>the docs, so I think it is important that we are allowed to do that from
>a license perspective. It is great that GPLv2 allows it (we need to be
>sure about this).
> 
> The GPL does *not* allow this, but (c) law and fair use clauses do. So 
> typically
> stuff such as
> * Referring to function names, signatures, etc. tend to be all fine
> * Copying large portions of in-line comments would not be fine, but
> If they are large, they would in most cases be re-written in a more suitable
> language. 
> 
> So, I think overall, we should be fine. It's a bit of a grey area though.
> 
> And as you point out below, most of the code in question is typically BSD 
> 
>Yes, I expect that some docs might be automatically generated, but from
>header files, not from source code. Especially public/ header files,
>which are typically BSD, not GPLv2. I cannot come up with examples of
>docs we need to generate from GPLv2-only code at the moment, hopefully
>there won't be any.
> 
> That makes things a lot easier.
> 
>>I wasn't planning on reusing any of the markup, and wasn't expecting to
>>use much of the text either.  I'm still considering the option of
>>defining that xen/public/* isn't the canonical description of the ABI,
>>because C is the wrong tool for the job.
>> 
>>It's fine to provide a C set of headers implementing an ABI, but there is
>>a very deliberate reason why the canonical migration v2 spec is in a
>>text document.
>> 
>> @Stefano: as you and I believe Brian will be spending time on improving the
>> ABI docs, I think we need to build some agreement here on what/how
>> to do it. I was assuming that generally the consensus was to have
>> docs close to the code in source, but this does not seem to be the case.
>> 
>> But if we do have stuff separately, ideally we would have a tool that helps
>> point people editing headers to also look at the relevant docs. Otherwise it 
>> will
>> be hard to keep them in sync.
> 
>In general, it is a good idea to keep the docs close to the code to make
>it easier to keep them up to date. But there is no one-size-fits-all
>here. For public ABI descriptions, I agree with Andrew that ideally they
>should not be defined as C header files.
> 
>But it is not an issue: any work that we do here won't be wasted. For
>instance, we could start by adding more comments to the current header
>files. Then, as a second step, take all the comments and turn them into
>a proper ABI description document without any C function declarations.
>It is easy to move English text around, as long as the license allows it
>-- that is the only potential blocker I can see.
> 
> This is likely to be problematic. First of all, we are talking about 
> BSD-3-Clause
> or BSD-2-Clause code (the latter is more dominant in headers I believe) in
> all known cases.
> 
> The main properties of the BSD are
> 1: Can be pretty much used anywhere for any purpose
> 2: Can be modified for any purpose 
> 3: But the original license header must be retained in derivatives

This is equivalent to attribution of the copyright owner of the originally 
created file.

> Does *not* have requirements around attribution as CC-BY-4: however,
> as we store everything in git attribution is handled by us by default

See above, the license header attributes copyright, since BSD was created for 
"software" and people who work on "software" would typically be looking at 
source code, hence the primary attribution takes place there, with secondary 
attribution in EULAs, "About" panels, etc.

> CC-BY-4 also has properties 1-3
> In addition: it does require that 
> 4: Derived works are giving appropriate credit to authors 
>We could clarify in a COPYING how we prefer to do this
>4.1: We could say 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-14 Thread P S
On Oct 11, 2019, at 07:11, Lars Kurth  wrote:
> 
> On 11/10/2019, 02:24, "Stefano Stabellini"  wrote:
> 
>>On Thu, 10 Oct 2019, Lars Kurth wrote:
>> * Would we ever include API docs generated from GPLv2 code? E.g. for safety 
>> use-cases?
>> @Stefano, @Artem: I guess this one is for you. 
>> I suppose if we would have a similar issue for a safety manual
>> I am also assuming we would want to use sphinx docs and rst to generate a 
>> future safety manual
> 
>Hi Lars,
> 
>Thanks for putting this email together.
> 
>In terms of formats, I don't have a preference between rst and pandoc,
>but if we are going to use rst going forward, I'd say to try to use rst
>for everything, including converting all the old stuff. The fewer
>different formats, the better.
> 
> I think the proposal that needs to follow on from this (which would at some
> point need to be voted on) would then be to go for rst. 
> 
>As I mentioned during the FuSa call, I agree with you, Andrew, and
>others that it would be best to have the docs under a CC license. I do
>expect that we'll end up copy/pasting snippets of in-code comments into
>the docs, so I think it is important that we are allowed to do that from
>a license perspective. It is great that GPLv2 allows it (we need to be
>sure about this).
> 
> The GPL does *not* allow this, but (c) law and fair use clauses do. So 
> typically
> stuff such as
> * Referring to function names, signatures, etc. tend to be all fine
> * Copying large portions of in-line comments would not be fine, but
> If they are large, they would in most cases be re-written in a more suitable
> language. 
> 
> So, I think overall, we should be fine. It's a bit of a grey area though.
> 
> And as you point out below, most of the code in question is typically BSD 
> 
>Yes, I expect that some docs might be automatically generated, but from
>header files, not from source code. Especially public/ header files,
>which are typically BSD, not GPLv2. I cannot come up with examples of
>docs we need to generate from GPLv2-only code at the moment, hopefully
>there won't be any.
> 
> That makes things a lot easier.
> 
>>I wasn't planning on reusing any of the markup, and wasn't expecting to
>>use much of the text either.  I'm still considering the option of
>>defining that xen/public/* isn't the canonical description of the ABI,
>>because C is the wrong tool for the job.
>> 
>>It's fine to provide a C set of headers implementing an ABI, but there is
>>a very deliberate reason why the canonical migration v2 spec is in a
>>text document.
>> 
>> @Stefano: as you and I believe Brian will be spending time on improving the
>> ABI docs, I think we need to build some agreement here on what/how
>> to do it. I was assuming that generally the consensus was to have
>> docs close to the code in source, but this does not seem to be the case.
>> 
>> But if we do have stuff separately, ideally we would have a tool that helps
>> point people editing headers to also look at the relevant docs. Otherwise it 
>> will
>> be hard to keep them in sync.
> 
>In general, it is a good idea to keep the docs close to the code to make
>it easier to keep them up to date. But there is no one-size-fits-all
>here. For public ABI descriptions, I agree with Andrew that ideally they
>should not be defined as C header files.
> 
>But it is not an issue: any work that we do here won't be wasted. For
>instance, we could start by adding more comments to the current header
>files. Then, as a second step, take all the comments and turn them into
>a proper ABI description document without any C function declarations.
>It is easy to move English text around, as long as the license allows it
>-- that is the only potential blocker I can see.
> 
> This is likely to be problematic. First of all, we are talking about 
> BSD-3-Clause
> or BSD-2-Clause code (the latter is more dominant in headers I believe) in
> all known cases.
> 
> The main properties of the BSD are
> 1: Can be pretty much used anywhere for any purpose
> 2: Can be modified for any purpose 
> 3: But the original license header must be retained in derivatives

This is equivalent to attribution of the copyright owner of the originally 
created file.

> Does *not* have requirements around attribution as CC-BY-4: however,
> as we store everything in git attribution is handled by us by default

See above, the license header attributes copyright, since BSD was created for 
"software" and people who work on "software" would typically be looking at 
source code, hence the primary attribution takes place there, with secondary 
attribution in EULAs, "About" panels, etc.

> CC-BY-4 also has properties 1-3
> In addition: it does require that 
> 4: Derived works are giving appropriate credit to authors 
>We could clarify in a COPYING how we prefer to do this
>4.1: We could say 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-11 Thread Stefano Stabellini
On Fri, 11 Oct 2019, Lars Kurth wrote:
> On 11/10/2019, 02:24, "Stefano Stabellini"  wrote:
> 
> On Thu, 10 Oct 2019, Lars Kurth wrote:
> > * Would we ever include API docs generated from GPLv2 code? E.g. for 
> safety use-cases?
> > @Stefano, @Artem: I guess this one is for you. 
> > I suppose if we would have a similar issue for a safety manual
> > I am also assuming we would want to use sphinx docs and rst to generate 
> a future safety manual
> 
> Hi Lars,
> 
> Thanks for putting this email together.
> 
> In terms of formats, I don't have a preference between rst and pandoc,
> but if we are going to use rst going forward, I'd say to try to use rst
> for everything, including converting all the old stuff. The fewer
> different formats, the better.
> 
> I think the proposal that needs to follow on from this (which would at some
> point need to be voted on) would then be to go for rst. 
> 
> As I mentioned during the FuSa call, I agree with you, Andrew, and
> others that it would be best to have the docs under a CC license. I do
> expect that we'll end up copy/pasting snippets of in-code comments into
> the docs, so I think it is important that we are allowed to do that from
> a license perspective. It is great that GPLv2 allows it (we need to be
> sure about this).
> 
> The GPL does *not* allow this, but (c) law and fair use clauses do. So 
> typically
> stuff such as
> * Referring to function names, signatures, etc. tend to be all fine
> * Copying large portions of in-line comments would not be fine, but
> If they are large, they would in most cases be re-written in a more suitable
> language. 
> 
> So, I think overall, we should be fine. It's a bit of a grey area though.
> 
> And as you point out below, most of the code in question is typically BSD 
> 
> Yes, I expect that some docs might be automatically generated, but from
> header files, not from source code. Especially public/ header files,
> which are typically BSD, not GPLv2. I cannot come up with examples of
> docs we need to generate from GPLv2-only code at the moment, hopefully
> there won't be any.
> 
> That makes things a lot easier.
>  
> > I wasn't planning on reusing any of the markup, and wasn't 
> expecting to
> > use much of the text either.  I'm still considering the option of
> > defining that xen/public/* isn't the canonical description of the 
> ABI,
> > because C is the wrong tool for the job.
> > 
> > It's fine to provide a C set of headers implementing an ABI, but 
> there is
> > a very deliberate reason why the canonical migration v2 spec is in a
> > text document.
> > 
> > @Stefano: as you and I believe Brian will be spending time on improving 
> the
> > ABI docs, I think we need to build some agreement here on what/how
> > to do it. I was assuming that generally the consensus was to have
> > docs close to the code in source, but this does not seem to be the case.
> > 
> > But if we do have stuff separately, ideally we would have a tool that 
> helps
> > point people editing headers to also look at the relevant docs. 
> Otherwise it will
> > be hard to keep them in sync.
> 
> In general, it is a good idea to keep the docs close to the code to make
> it easier to keep them up to date. But there is no one-size-fits-all
> here. For public ABI descriptions, I agree with Andrew that ideally they
> should not be defined as C header files.
> 
> But it is not an issue: any work that we do here won't be wasted. For
> instance, we could start by adding more comments to the current header
> files. Then, as a second step, take all the comments and turn them into
> a proper ABI description document without any C function declarations.
> It is easy to move English text around, as long as the license allows it
> -- that is the only potential blocker I can see.
> 
> This is likely to be problematic. First of all, we are talking about 
> BSD-3-Clause
> or BSD-2-Clause code (the latter is more dominant in headers I believe) in
> all known cases.
> 
> The main properties of the BSD are
> 1: Can be pretty much used anywhere for any purpose
> 2: Can be modified for any purpose 
> 3: But the original license header must be retained in derivatives
> 
> Does *not* have requirements around attribution as CC-BY-4: however,
> as we store everything in git attribution is handled by us by default
> 
> CC-BY-4 also has properties 1-3
> In addition: it does require that 
> 4: Derived works are giving appropriate credit to authors 
> We could clarify in a COPYING how we prefer to do this
> 4.1: We could say that "referring to the Xen Project community" 
> is sufficient to comply with the attribution clause
> 4.2: We could require individual authors to be credited: in 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-11 Thread Stefano Stabellini
On Fri, 11 Oct 2019, Lars Kurth wrote:
> On 11/10/2019, 09:32, "Jan Beulich"  wrote:
> 
> On 10.10.2019 20:30, Lars Kurth wrote:
> > On 10/10/2019, 18:05, "Andrew Cooper"  wrote:
> > On 10/10/2019 13:34, Lars Kurth wrote:
> > > Existing formats and licenses
> > > * Hypercall ABI Documentation generated from Xen public headers
> > > Format: kerndoc
> > > License: typically BSD-3-Clause (documentation is generated from 
> public headers)
> > 
> > It's homegrown markup, superimposed on what used to be doxygen 
> in
> > the past.
> > 
> > Oh, I forgot
> > 
> > I wasn't planning on reusing any of the markup, and wasn't expecting to
> > use much of the text either.  I'm still considering the option of
> > defining that xen/public/* isn't the canonical description of the ABI,
> > because C is the wrong tool for the job.
> > 
> > It's fine to provide a C set of headers implementing an ABI, but there is
> > a very deliberate reason why the canonical migration v2 spec is in a
> > text document.
> > 
> > @Stefano: as you and, I believe, Brian will be spending time on improving the
> > ABI docs, I think we need to build some agreement here on what/how
> > to do it. I was assuming that generally the consensus was to have
> > docs close to the code in source, but this does not seem to be the case.
> 
> Well, for migration v2 having the spec in a text file seems sensible
> to me. For the public ABI, however, I think it's more helpful to have
> the doc next to the actual definitions. Considering the possible use
> of languages other than C I can certainly see why separating both
> would be even more clean, but I think here we want to weigh practical
> purposes against cleanliness.
> 
> I think that is an area where we need to build some consensus. The problem
> falls under what is considered "traceability" in safety speak: in other words,
> for the ABI documentation use-case it must be easy to
> keep documentation and code in sync. And ideally, we would be able to
> check this automatically somehow, or have a bot provide hints such as 
> "You changed XYZ and should have a look and check whether ABC needs
> changing also".
> 
> I have thought about the problem of "traceability" for some time, which
> goes far beyond what we need for this use-case. Typical things that need
> to be maintained for a "traceable (safety) documentation set" are
> 
> ## Keeping key docs and code in sync 
> The use-cases here are things such as
> - keep man pages and xl sources in sync
> - keep ABI docs and headers in sync
> - keep documents such as the migration v2 spec in sync with
>   actual source
>  
> This is a problem we already have today, and one we often handle
> fairly poorly by hand (as can be seen from how out-of-date
> man pages often are)
> 
> Possible solutions for this are
> - store docs alongside headers (maybe using the same base
> file name) => that would work for ABI docs
> 
> - have some tagging or meta-information scheme which links
> specific source files to docs files => that would work for most
> other docs (albeit not always perfectly - e.g. when functionality
> is spread over many files and just portions of them)
> 
> For example: tools/xl/xl_cmdtable.c  
> is linked to files in docs/man/xl*
> 
> This means that creating a bot/tool which warns you, when you change
> foo.c, to also look at foo.rst and/or ../../docs/.../bar.rst should be
> relatively straightforward. It would require some initial effort
> to create initial mappings, but these would never really change,
> unless we refactor code significantly.
>  
> ## Keeping dependent documents (or portions of documents) in sync
> This is something we have not really faced, because we do not
> have a lot of docs.  
> 
> In a large documentation set, having the right chapter/tree
> structure enables this. In waterfall software engineering
> models, where you start off with high-level documents
> (requirements, specs, etc.) which become
> increasingly detailed, this is done through a chapter/tree
> structure, with the capability to make documents
> (or sections thereof) depend on other documents (or sections
> thereof). When you change something, a tool such as DOORS
> forces you to review and sign off child documents.
> 
> This is conceptually similar to what we would need for
> "linking" sources to docs as outlined above, only that
> the "linking" is between docs. It would also be easy enough
> to check and highlight what else may have to be looked at.
> 
> ## Proving that your tests verify design claims and that *all claims* are tested
> This is typically the hardest problem to solve. It requires
> for test cases (be it a document or actual code) to
> link to claims (in a design, architecture, spec, ...)
> and to prove that all of them are 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-11 Thread Lars Kurth


On 11/10/2019, 09:32, "Jan Beulich"  wrote:

On 10.10.2019 20:30, Lars Kurth wrote:
> On 10/10/2019, 18:05, "Andrew Cooper"  wrote:
> On 10/10/2019 13:34, Lars Kurth wrote:
> > Existing formats and licenses
> > * Hypercall ABI Documentation generated from Xen public headers
> > Format: kerndoc
> > License: typically BSD-3-Clause (documentation is generated from public headers)
> 
> It's homegrown markup, superimposed on what used to be doxygen in
> the past.
> 
> Oh, I forgot
> 
> I wasn't planning on reusing any of the markup, and wasn't expecting to
> use much of the text either.  I'm still considering the option of
> defining that xen/public/* isn't the canonical description of the ABI,
> because C is the wrong tool for the job.
> 
> It's fine to provide a C set of headers implementing an ABI, but there is
> a very deliberate reason why the canonical migration v2 spec is in a
> text document.
> 
> @Stefano: as you and, I believe, Brian will be spending time on improving the
> ABI docs, I think we need to build some agreement here on what/how
> to do it. I was assuming that generally the consensus was to have
> docs close to the code in source, but this does not seem to be the case.

Well, for migration v2 having the spec in a text file seems sensible
to me. For the public ABI, however, I think it's more helpful to have
the doc next to the actual definitions. Considering the possible use
of languages other than C I can certainly see why separating both
would be even more clean, but I think here we want to weigh practical
purposes against cleanliness.

I think that is an area where we need to build some consensus. The problem
falls under what is considered "traceability" in safety speak: in other words, 
for the ABI documentation use-case it must be easy to
keep documentation and code in sync. And ideally, we would be able to
check this automatically somehow, or have a bot provide hints such as 
"You changed XYZ and should have a look and check whether ABC needs
changing also".

I have thought about the problem of "traceability" for some time, which
goes far beyond what we need for this use-case. Typical things that need
to be maintained for a "traceable (safety) documentation set" are

## Keeping key docs and code in sync 
The use-cases here are things such as
- keep man pages and xl sources in sync
- keep ABI docs and headers in sync
- keep documents such as the migration v2 spec in sync with
  actual source
 
This is a problem we already have today, and one we often handle
fairly poorly by hand (as can be seen from how out-of-date
man pages often are)

Possible solutions for this are
- store docs alongside headers (maybe using the same base
file name) => that would work for ABI docs

- have some tagging or meta-information scheme which links
specific source files to docs files => that would work for most
other docs (albeit not always perfectly - e.g. when functionality
is spread over many files and just portions of them)

For example: tools/xl/xl_cmdtable.c  
is linked to files in docs/man/xl*

This means that creating a bot/tool which warns you, when you change
foo.c, to also look at foo.rst and/or ../../docs/.../bar.rst should be
relatively straightforward. It would require some initial effort
to create initial mappings, but these would never really change,
unless we refactor code significantly.
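
As a rough illustration of the linking scheme described above, the sketch below maps changed source files to documentation patterns and produces the kind of hint mentioned earlier. Only the tools/xl/xl_cmdtable.c to docs/man/xl* pairing is taken from this mail; the table format and the names DOC_LINKS, docs_to_review and format_hints are assumptions made for the example, not an existing tool.

```python
from fnmatch import fnmatch

# Hypothetical link table: source-file pattern -> documentation patterns.
# Only this one entry comes from the thread; the format is an assumption.
DOC_LINKS = {
    "tools/xl/xl_cmdtable.c": ["docs/man/xl*"],
}


def docs_to_review(changed_files, links=DOC_LINKS):
    """Map each changed source file to the doc patterns linked to it."""
    hints = {}
    for path in changed_files:
        for src_pattern, doc_patterns in links.items():
            if fnmatch(path, src_pattern):
                # Build a fresh list per file so the link table is not mutated.
                hints.setdefault(path, []).extend(doc_patterns)
    return hints


def format_hints(hints):
    """Render one reviewer-facing message per changed file."""
    return [
        f"You changed {src} and should check whether "
        f"{', '.join(docs)} needs changing also"
        for src, docs in hints.items()
    ]
```

A bot could feed the list of files touched by a patch into docs_to_review and post the formatted hints as a reply; files without an entry in the table simply produce no hint.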
 
## Keeping dependent documents (or portions of documents) in sync
This is something we have not really faced, because we do not
have a lot of docs.  

In a large documentation set, having the right chapter/tree
structure enables this. In waterfall software engineering
models, where you start off with high-level documents
(requirements, specs, etc.) which become
increasingly detailed, this is done through a chapter/tree
structure, with the capability to make documents
(or sections thereof) depend on other documents (or sections
thereof). When you change something, a tool such as DOORS
forces you to review and sign off child documents.

This is conceptually similar to what we would need for
"linking" sources to docs as outlined above, only that
the "linking" is between docs. It would also be easy enough
to check and highlight what else may have to be looked at.
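
A minimal sketch of that doc-to-doc check, assuming the links are stored as a simple parent-to-children table. The file names and the names CHILDREN and needs_review are hypothetical; this is not how DOORS works internally, only an illustration of walking the dependency tree.

```python
from collections import deque

# Hypothetical parent -> children links between documents.
CHILDREN = {
    "requirements.rst": ["architecture.rst"],
    "architecture.rst": ["design-a.rst", "design-b.rst"],
    "design-a.rst": ["tests-a.rst"],
}


def needs_review(changed, children=CHILDREN):
    """Return every document downstream of `changed`, breadth-first."""
    seen = []
    queue = deque(children.get(changed, []))
    while queue:
        doc = queue.popleft()
        if doc in seen:
            continue  # already reported via another path
        seen.append(doc)
        queue.extend(children.get(doc, []))
    return seen
```

Changing architecture.rst in this table would flag both design documents and the test document that hangs off design-a.rst, while a change to a leaf document flags nothing.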

## Proving that your tests verify design claims and that *all claims* are tested
This is typically the hardest problem to solve. It requires
test cases (be they documents or actual code) to
link to claims (in a design, architecture, spec, ...)
and proof that all of them are tested.
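
To make the linkage idea concrete, here is a small sketch that checks a claims list against the claims each test says it verifies. The identifiers (REQ-1 and so on) and the function names are invented for illustration; a real scheme would need to agree on how claims are tagged in the docs and in the tests.

```python
def untested_claims(claims, tests):
    """Return claim IDs that no test links to.

    claims: iterable of claim IDs
    tests:  dict mapping test name -> list of claim IDs it verifies
    """
    covered = {claim for linked in tests.values() for claim in linked}
    return sorted(set(claims) - covered)


def dangling_links(claims, tests):
    """Return tests that point at claim IDs which do not exist."""
    known = set(claims)
    return {
        name: sorted(set(linked) - known)
        for name, linked in tests.items()
        if set(linked) - known
    }
```

The first function answers "are all claims tested?", the second catches links that went stale after a claim was renamed or removed. Neither says anything about whether the test actually exercises the claim, which is where code coverage would have to come in.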

If there is linkage capability, then it is straightforward
to verify automatically that all your branches have
test-case leaves in your documentation tree. But at least
in a safety context you would also have to augment this
with code coverage 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-11 Thread Lars Kurth


On 11/10/2019, 02:24, "Stefano Stabellini"  wrote:

On Thu, 10 Oct 2019, Lars Kurth wrote:
> * Would we ever include API docs generated from GPLv2 code? E.g. for safety use-cases?
> @Stefano, @Artem: I guess this one is for you.
> I suppose we would have a similar issue for a safety manual
> I am also assuming we would want to use sphinx docs and rst to generate a future safety manual

Hi Lars,

Thanks for putting this email together.

In terms of formats, I don't have a preference between rst and pandoc,
but if we are going to use rst going forward, I'd say to try to use rst
for everything, including converting all the old stuff. The fewer
different formats, the better.

I think the proposal that needs to follow on from this (which would at some
point need to be voted on) would then be to go for rst. 

As I mentioned during the FuSa call, I agree with you, Andrew, and
others that it would be best to have the docs under a CC license. I do
expect that we'll end up copy/pasting snippets of in-code comments into
the docs, so I think it is important that we are allowed to do that from
a license perspective. It is great that GPLv2 allows it (we need to be
sure about this).

The GPL does *not* allow this, but copyright law and fair use clauses do. So typically
* Referring to function names, signatures, etc. tends to be fine
* Copying large portions of in-line comments would not be fine, but
if they are large, they would in most cases be re-written in more suitable
language anyway.

So, I think overall, we should be fine. It's a bit of a grey area though.

And as you point out below, most of the code in question is typically BSD 

Yes, I expect that some docs might be automatically generated, but from
header files, not from source code. Especially public/ header files,
which are typically BSD, not GPLv2. I cannot come up with examples of
docs we need to generate from GPLv2-only code at the moment; hopefully
there won't be any.

That makes things a lot easier.
 
> I wasn't planning on reusing any of the markup, and wasn't expecting to
> use much of the text either.  I'm still considering the option of
> defining that xen/public/* isn't the canonical description of the ABI,
> because C is the wrong tool for the job.
> 
> It's fine to provide a C set of headers implementing an ABI, but there is
> a very deliberate reason why the canonical migration v2 spec is in a
> text document.
> 
> @Stefano: as you and, I believe, Brian will be spending time on improving the
> ABI docs, I think we need to build some agreement here on what/how
> to do it. I was assuming that generally the consensus was to have
> docs close to the code in source, but this does not seem to be the case.
> 
> But if we do have stuff separately, ideally we would have a tool that helps
> point people editing headers to also look at the relevant docs. Otherwise it will
> be hard to keep them in sync.

In general, it is a good idea to keep the docs close to the code to make
it easier to keep them up to date. But there is no one-size-fits-all
here. For public ABI descriptions, I agree with Andrew that ideally they
should not be defined as C header files.

But it is not an issue: any work that we do here won't be wasted. For
instance, we could start by adding more comments to the current header
files. Then, as a second step, take all the comments and turn them into
a proper ABI description document without any C function declarations.
It is easy to move English text around, as long as the license allows it
-- that is the only potential blocker I can see.

This is likely to be problematic. First of all, we are talking about
BSD-3-Clause or BSD-2-Clause code (the latter is more dominant in headers,
I believe) in all known cases.

The main properties of the BSD are
1: Can be pretty much used anywhere for any purpose
2: Can be modified for any purpose 
3: But the original license header must be retained in derivatives

BSD does *not* have attribution requirements like CC-BY-4 does: however,
as we store everything in git, attribution is handled by us by default

CC-BY-4 also has properties 1-3
In addition, it requires that
4: Derived works give appropriate credit to authors
We could clarify in a COPYING how we prefer to do this
4.1: We could say that "referring to the Xen Project community" 
is sufficient to comply with the attribution clause
4.2: We could require individual authors to be credited: in that
case we probably ought to lead by example and list the authors
in a credit/license section and extract the information from
git logs when we generate it (at some point in the future)
5: You give an indication 
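
For option 4.2, extracting the author list from git logs could look roughly like the sketch below. It assumes a checkout with git available and uses the standard `git log --format` placeholders for author name and email; the function names and the idea of emitting one credit line per author are assumptions for illustration only.

```python
import subprocess


def parse_authors(log_output):
    """Deduplicate author lines, keeping first-seen order."""
    authors = []
    for line in log_output.splitlines():
        line = line.strip()
        if line and line not in authors:
            authors.append(line)
    return authors


def authors_for(path, repo="."):
    """List 'Name <email>' for everyone who committed changes to `path`."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--format=%aN <%aE>", "--", path],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_authors(out)
```

Calling authors_for("docs/") at doc-generation time would give the raw material for a credits section; whether commit authorship is the right notion of "author" for attribution purposes is a separate question.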

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-11 Thread Artem Mygaiev
Hi Lars

On Thu, 2019-10-10 at 12:34 +, Lars Kurth wrote:
> * Possibly stuff such as 
> https://xenbits.xen.org/docs/unstable/support-matrix.html
>   (which is currently GPL-2,
>but we could relicense to say GPL-2 and CC-BY-4 if we had to)
> The implication is that the sphinx docs would not be fully CC-BY-4,
> but the bulk of the pages would be
> 
> * Would we ever include API docs generated from GPLv2 code? E.g. for
> safety use-cases?
> @Stefano, @Artem: I guess this one is for you. 
> I suppose if we would have a similar issue for a safety manual
> I am also assuming we would want to use sphinx docs and rst to
> generate a future safety manual
> 

Yes, I think we will have to use some API docs in safety-related
documentation. But I do not see any issue with that, because using the
"description" part of headers in documentation can be treated as "fair
use" and thus will not create a license conflict, as confirmed in
https://www.gnu.org/licenses/gpl-faq.en.html#SourceCodeInDocumentation

 -- Artem
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-11 Thread Jan Beulich
On 10.10.2019 20:30, Lars Kurth wrote:
> On 10/10/2019, 18:05, "Andrew Cooper"  wrote:
> On 10/10/2019 13:34, Lars Kurth wrote:
> > Existing formats and licenses
> > * Hypercall ABI Documentation generated from Xen public headers
> > Format: kerndoc
> > License: typically BSD-3-Clause (documentation is generated from public headers)
> 
> It's homegrown markup, superimposed on what used to be doxygen in
> the past.
> 
> Oh, I forgot
> 
> I wasn't planning on reusing any of the markup, and wasn't expecting to
> use much of the text either.  I'm still considering the option of
> defining that xen/public/* isn't the canonical description of the ABI,
> because C is the wrong tool for the job.
> 
> It's fine to provide a C set of headers implementing an ABI, but there is
> a very deliberate reason why the canonical migration v2 spec is in a
> text document.
> 
> @Stefano: as you and, I believe, Brian will be spending time on improving the
> ABI docs, I think we need to build some agreement here on what/how
> to do it. I was assuming that generally the consensus was to have
> docs close to the code in source, but this does not seem to be the case.

Well, for migration v2 having the spec in a text file seems sensible
to me. For the public ABI, however, I think it's more helpful to have
the doc next to the actual definitions. Considering the possible use
of languages other than C I can certainly see why separating both
would be even more clean, but I think here we want to weigh practical
purposes against cleanliness.

Jan


Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-10 Thread Stefano Stabellini
On Thu, 10 Oct 2019, Lars Kurth wrote:
> * Would we ever include API docs generated from GPLv2 code? E.g. for safety
> use-cases?
> @Stefano, @Artem: I guess this one is for you.
> I suppose we would have a similar issue for a safety manual
> I am also assuming we would want to use sphinx docs and rst to generate a
> future safety manual

Hi Lars,

Thanks for putting this email together.

In terms of formats, I don't have a preference between rst and pandoc,
but if we are going to use rst going forward, I'd say to try to use rst
for everything, including converting all the old stuff. The fewer
different formats, the better.

As I mentioned during the FuSa call, I agree with you, Andrew, and
others that it would be best to have the docs under a CC license. I do
expect that we'll end up copy/pasting snippets of in-code comments into
the docs, so I think it is important that we are allowed to do that from
a license perspective. It is great that GPLv2 allows it (we need to be
sure about this).

Yes, I expect that some docs might be automatically generated, but from
header files, not from source code. Especially public/ header files,
which are typically BSD, not GPLv2. I cannot come up with examples of
docs we need to generate from GPLv2-only code at the moment; hopefully
there won't be any.


 
> I wasn't planning on reusing any of the markup, and wasn't expecting to
> use much of the text either.  I'm still considering the option of
> defining that xen/public/* isn't the canonical description of the ABI,
> because C is the wrong tool for the job.
> 
> It's fine to provide a C set of headers implementing an ABI, but there is
> a very deliberate reason why the canonical migration v2 spec is in a
> text document.
> 
> @Stefano: as you and, I believe, Brian will be spending time on improving the
> ABI docs, I think we need to build some agreement here on what/how
> to do it. I was assuming that generally the consensus was to have
> docs close to the code in source, but this does not seem to be the case.
> 
> But if we do have stuff separately, ideally we would have a tool that helps
> point people editing headers to also look at the relevant docs. Otherwise it will
> be hard to keep them in sync.

In general, it is a good idea to keep the docs close to the code to make
it easier to keep them up to date. But there is no one-size-fits-all
here. For public ABI descriptions, I agree with Andrew that ideally they
should not be defined as C header files.

But it is not an issue: any work that we do here won't be wasted. For
instance, we could start by adding more comments to the current header
files. Then, as a second step, take all the comments and turn them into
a proper ABI description document without any C function declarations.
It is easy to move English text around, as long as the license allows it
-- that is the only potential blocker I can see.


Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-10 Thread Lars Kurth


On 10/10/2019, 18:05, "Andrew Cooper"  wrote:

On 10/10/2019 13:34, Lars Kurth wrote:
> Hi all,
>
> following on from a discussion on IRC and on various other places, I think we need to try and rationalize how we handle documentation.
>
> What we have now and what we may get in future
> * http://xenbits.xen.org/docs/unstable/ (GPL-2)
> * http://xenbits.xen.org/docs/sphinx-unstable-staging/ (CC-BY-4)
> * Additional API documentation (with a view to enabling safety) 
> * Any future documentation related to safety (requirements, designs, test cases, traceability)
>
> Desired licenses
> * There is a desire to keep http://xenbits.xen.org/docs/sphinx-unstable-staging/ CC-BY-4 only
> * There is a desire to publish future documentation related to safety as CC-BY-4

It's probably worth noting that the
http://xenbits.xen.org/docs/sphinx-unstable-staging/ URL is only
transitional.

When Sphinx is more ready for primetime, I was thinking of using
http://xenbits.xen.org/docs/xen/, and using the Sphinx support for
multiple versions, which would end up becoming docs/xen/{4.13,...,latest}/

> Existing formats and licenses
> * Hypercall ABI Documentation generated from Xen public headers
> Format: kerndoc
> License: typically BSD-3-Clause (documentation is generated from public headers)

It's homegrown markup, superimposed on what used to be doxygen in
the past.

Oh, I forgot

I wasn't planning on reusing any of the markup, and wasn't expecting to
use much of the text either.  I'm still considering the option of
defining that xen/public/* isn't the canonical description of the ABI,
because C is the wrong tool for the job.

It's fine to provide a C set of headers implementing an ABI, but there is
a very deliberate reason why the canonical migration v2 spec is in a
text document.

@Stefano: as you and, I believe, Brian will be spending time on improving the
ABI docs, I think we need to build some agreement here on what/how
to do it. I was assuming that generally the consensus was to have
docs close to the code in source, but this does not seem to be the case.

But if we do have stuff separately, ideally we would have a tool that helps
point people editing headers to also look at the relevant docs. Otherwise it will
be hard to keep them in sync.

> * docs/designs, docs/features, docs/specs
> Formats: primarily pandoc, with some files md
> License: GPL-2
> * docs/process - covers internal processes
> Formats: txt, with some pandoc
> License: GPL-2
> * docs/figs
> Formats: misc
> License: GPL-2
> * docs/misc
> Formats: txt, with a large number of pandoc files and some other docs
> License: GPL-2
> * docs/man
> Formats: pod
> License: GPL-2
> * Sphinx docs: docs, docs/guest-guide, docs/hypervisor-guide
> Formats: rst
> License: CC-BY-4

This is the intention, but hasn't taken effect while my series is still
pending.  For now, strictly speaking it is still GPL-2.

I was basing this on the assumption the series will go in

> * Wiki: 
> Formats: MediaWiki markup
> License: CC-BY-SA-3 which has an automatic update to CC-BY-SA-4
> Copyright of wiki contributions is kept by the authors
>
> This means that the most common file formats in use are
> * pod
> * pandoc (with some md) - these are essentially identical
> * txt for legacy and old stuff
> * rst
>
> License compatibility
> * GPL-2 and CC-BY-4 are compatible, but mixing means that the complete docset is GPL-2
> * GPL-2 and BSD-3-Clause are compatible, but mixing means that the complete docset is GPL-2
> * BSD-3-Clause and CC-BY-4: I am not 100% sure, but this should not be an issue
> * CC-BY-SA-4 is only one-way compatible with GPLv3 (affecting content on the wiki)
>
> The first question is whether we should convert pod to rst
> * https://metacpan.org/pod/pod2rst provides a conversion tool
> * man pages can be generated by rst2man
> Thus, technically this should be easy and should make contributions to docs/man easier
> If we do this, we should add a CONTRIBUTING file, clarifying the license in this directory

One thing I have done is put SPDX tags on every *.rst file.  What I
haven't found is a nice way to insert one into the *.drawio.svg files,
but I should probably finish off some of my experimentation TODOs.

An easy way out is to just say "look at the SPDX tag", but then we end
up with a docset which is a mess of licenses that still can't be easily
built upon.

I think a per-directory approach is generally better, plus SPDX tags
where they can easily be added.
And it's easy enough to do

> There is a set of related questions on what we would eventually merge into the sphinx
> docset. I believe there is 

Re: [Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-10 Thread Andrew Cooper
On 10/10/2019 13:34, Lars Kurth wrote:
> Hi all,
>
> following on from a discussion on IRC and on various other places, I think we 
> need to try and rationalize how we handle documentation.
>
> What we have now and what we may get in future
> * http://xenbits.xen.org/docs/unstable/ (GPL-2)
> * http://xenbits.xen.org/docs/sphinx-unstable-staging/ (CC-BY-4)
> * Additional API documentation (with a view to enabling safety) 
> * Any future documentation related to safety (requirements, designs, test 
> cases, traceability)
>
> Desired licenses
> * There is a desire to keep 
> http://xenbits.xen.org/docs/sphinx-unstable-staging/ CC-BY-4 only
> * There is a desire to publish future documentation related to safety as 
> CC-BY-4

It's probably worth noting that the
http://xenbits.xen.org/docs/sphinx-unstable-staging/ URL is only
transitional.

When Sphinx is more ready for primetime, I was thinking of using
http://xenbits.xen.org/docs/xen/, and using the Sphinx support for
multiple versions, which would end up becoming docs/xen/{4.13,...,latest}/

> Existing formats and licenses
> * Hypercall ABI Documentation generated from Xen public headers
> Format: kerndoc
> License: typically BSD-3-Clause (documentation is generated from public 
> headers)

It's homegrown markup, superimposed on what used to be doxygen in
the past.

I wasn't planning on reusing any of the markup, and wasn't expecting to
use much of the text either.  I'm still considering the option of
defining that xen/public/* isn't the canonical description of the ABI,
because C is the wrong tool for the job.

It's fine to provide a C set of headers implementing an ABI, but there is
a very deliberate reason why the canonical migration v2 spec is in a
text document.

> * docs/designs, docs/features, docs/specs
> Formats: primarily pandoc, with some files md
> License: GPL-2
> * docs/process - covers internal processes
> Formats: txt, with some pandoc
> License: GPL-2
> * docs/figs
> Formats: misc
> License: GPL-2
> * docs/misc
> Formats: txt, with a large number of pandoc files and some other docs
> License: GPL-2
> * docs/man
> Formats: pod
> License: GPL-2
> * Sphinx docs: docs, docs/guest-guide, docs/hypervisor-guide
> Formats: rst
> License: CC-BY-4

This is the intention, but hasn't taken effect while my series is still
pending.  For now, strictly speaking it is still GPL-2.

>
> * Wiki: 
> Formats: MediaWiki markup
> License: CC-BY-SA-3 which has an automatic update to CC-BY-SA-4
> Copyright of wiki contributions is kept by the authors
>
> This means that the most common file formats in use are
> * pod
> * pandoc (with some md) - these are essentially identical
> * txt for legacy and old stuff
> * rst
>
> License compatibility
> * GPL-2 and CC-BY-4 are compatible, but mixing means that the complete docset 
> is GPL-2
> * GPL-2 and BSD-3-Clause are compatible, but mixing means that the complete 
> docset is GPL-2
> * BSD-3-Clause and CC-BY-4: I am not 100% sure, but this should not be an issue
> * CC-BY-SA-4 is only one-way compatible with GPLv3 (affecting content on the 
> wiki)
>
> The first question is whether we should convert pod to rst
> * https://metacpan.org/pod/pod2rst provides a conversion tool
> * man pages can be generated by rst2man
> Thus, technically this should be easy and should make contributions to 
> docs/man easier
> If we do this, we should add a CONTRIBUTING file, clarifying the license in 
> this directory

One thing I have done is put SPDX tags on every *.rst file.  What I
haven't found is a nice way to insert one into the *.drawio.svg files,
but I should probably finish off some of my experimentation TODOs.

An easy way out is to just say "look at the SPDX tag", but then we end
up with a docset which is a mess of licenses that still can't be easily
built upon.
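
A check along these lines could keep a directory honest about its license. The sketch below assumes the tag sits within the first few lines of each *.rst file in the usual "SPDX-License-Identifier: <id>" form and that the expected identifier for a CC-BY-4 directory is CC-BY-4.0; both the function names and the five-line window are choices made for the example.

```python
import re
from pathlib import Path

SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*(\S+)")


def spdx_tag(text, max_lines=5):
    """Return the SPDX identifier found near the top of `text`, or None."""
    for line in text.splitlines()[:max_lines]:
        match = SPDX_RE.search(line)
        if match:
            return match.group(1)
    return None


def audit(root, expected="CC-BY-4.0"):
    """Return {path: tag} for *.rst files whose tag is missing or unexpected."""
    problems = {}
    for path in sorted(Path(root).rglob("*.rst")):
        tag = spdx_tag(path.read_text(encoding="utf-8"))
        if tag != expected:
            problems[str(path)] = tag  # None means no tag was found
    return problems
```

Running audit over a docs directory returns an empty dict when every file carries the expected tag, which makes it easy to wire into a build or CI step. It does not cover the *.drawio.svg files.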

> There is a set of related questions on what we would eventually merge into
> the sphinx docset. I believe there is agreement that most of what is in docs
> today is not really suitable; however, there are a few possible exceptions
> * man pages - with a variety of different contributors from different orgs. 
> Changing license would be hard

But certainly not impossible.

> * API docs generated from PUBLIC headers - changing license would be 
> impossible, but would be BSD-3-Clause

The code, yes, but I'm expecting that to be orthogonal in the long run.

> * Some wiki content (e.g. 
> https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches and friends) 
>    More than 95% of changes were from Citrix staff, so we could convert to CC-BY-4
>    Most non-Citrix changes are one-line changes and could be covered by fair use
> * Possibly stuff such as 
> https://xenbits.xen.org/docs/unstable/support-matrix.html (which is currently GPL-2,
>    but we could relicense to say GPL-2 and CC-BY-4 if we had to)
> The implication is that the sphinx docs would not be fully CC-BY-4, but the 
> bulk of the pages would be

Would be what?

~Andrew

>
> * Would we ever include API 

[Xen-devel] [RFC] Documentation formats, licenses and file system structure

2019-10-10 Thread Lars Kurth
Hi all,

following on from a discussion on IRC and on various other places, I think we 
need to try and rationalize how we handle documentation.

What we have now and what we may get in future
* http://xenbits.xen.org/docs/unstable/ (GPL-2)
* http://xenbits.xen.org/docs/sphinx-unstable-staging/ (CC-BY-4)
* Additional API documentation (with a view to enabling safety) 
* Any future documentation related to safety (requirements, designs, test 
cases, traceability)

Desired licenses
* There is a desire to keep 
http://xenbits.xen.org/docs/sphinx-unstable-staging/ CC-BY-4 only
* There is a desire to publish future documentation related to safety as CC-BY-4

Existing formats and licenses
* Hypercall ABI Documentation generated from Xen public headers
Format: kerndoc
License: typically BSD-3-Clause (documentation is generated from public headers)
* docs/designs, docs/features, docs/specs
Formats: primarily pandoc, with some files md
License: GPL-2
* docs/process - covers internal processes
Formats: txt, with some pandoc
License: GPL-2
* docs/figs
Formats: misc
License: GPL-2
* docs/misc
Formats: txt, with a large number of pandoc files and some other docs
License: GPL-2
* docs/man
Formats: pod
License: GPL-2
* Sphinx docs: docs, docs/guest-guide, docs/hypervisor-guide
Formats: rst
License: CC-BY-4

* Wiki: 
Formats: MediaWiki markup
License: CC-BY-SA-3 which has an automatic update to CC-BY-SA-4
Copyright of wiki contributions is kept by the authors

This means that the most common file formats in use are
* pod
* pandoc (with some md) - these are essentially identical
* txt for legacy and old stuff
* rst

License compatibility
* GPL-2 and CC-BY-4 are compatible, but mixing means that the complete docset is GPL-2
* GPL-2 and BSD-3-Clause are compatible, but mixing means that the complete docset is GPL-2
* BSD-3-Clause and CC-BY-4: I am not 100% sure, but this should not be an issue
* CC-BY-SA-4 is only one-way compatible with GPLv3 (affecting content on the wiki)

The first question is whether we should convert pod to rst
* https://metacpan.org/pod/pod2rst provides a conversion tool
* man pages can be generated by rst2man
Thus, technically this should be easy and should make contributions to docs/man easier
If we do this, we should add a CONTRIBUTING file, clarifying the license in this directory
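
A rough sketch of what the pod -> rst -> man round trip could look like if scripted. I have not tried this on docs/man. It assumes that both pod2rst and rst2man read stdin and write stdout when called without file arguments, and that rst2man is installed under that name (some docutils installs call it rst2man.py). The helper names are hypothetical, and the commands are parameters so they can be swapped.

```python
import subprocess


def run_filter(cmd, text):
    """Run cmd, feed text on stdin, and return what it writes to stdout."""
    done = subprocess.run(
        cmd, input=text, capture_output=True, text=True, check=True
    )
    return done.stdout


def pod_to_rst_and_man(pod_text, pod2rst_cmd=("pod2rst",), rst2man_cmd=("rst2man",)):
    """Convert POD to rst, then render that rst as a man page.

    Assumption: both tools act as stdin-to-stdout filters when given
    no file arguments.
    """
    rst = run_filter(list(pod2rst_cmd), pod_text)
    man = run_filter(list(rst2man_cmd), rst)
    return rst, man
```

Comparing the man output against what pod2man produces today for the same page would be a quick way to see how much manual cleanup the conversion needs.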

There is a set of related questions on what we would eventually merge into the sphinx
docset. I believe there is agreement that most of what is in docs today is not really
suitable; however, there are a few possible exceptions
* man pages - with a variety of different contributors from different orgs. Changing the license would be hard
* API docs generated from PUBLIC headers - changing the license would be impossible, so these would stay BSD-3-Clause
* Some wiki content (e.g. https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches and friends)
   More than 95% of changes were from Citrix staff, so we could convert to CC-BY-4
   Most non-Citrix changes are one-line changes and could be covered by fair use
* Possibly stuff such as https://xenbits.xen.org/docs/unstable/support-matrix.html (which is currently GPL-2,
   but we could relicense to, say, GPL-2 and CC-BY-4 if we had to)
The implication is that the sphinx docs would not be fully CC-BY-4, but the bulk of the pages would be

* Would we ever include API docs generated from GPLv2 code? E.g. for safety use-cases?
@Stefano, @Artem: I guess this one is for you.
I suppose we would have a similar issue for a safety manual
I am also assuming we would want to use sphinx docs and rst to generate a future safety manual

Other pages in docs that may be useful for the sphinx docs should essentially be re-written,
so we would be fine from a licensing perspective. That means that over time, we could get rid of
pandoc and text files in docs/misc, docs/designs, docs/features, docs/specs which
have not really gained a lot of traction.

Related to this is the general question of whether we would ever copy content from source to docs
and vice versa, and to what degree. This is an unknown to me: I think in practice we have not seen
much or even any of this in the past.

On licensing, we should try and make the docs directory clean, e.g.
* We should set the default to CC-BY-4 (e.g. through a contributing file in docs)
* And specifically use GPL-2 for directories such as docs/misc, docs/man, ...

In any case, this all seems a bit of a mess at the moment and I think we need to
agree on a foundation to get us to a better state. This mail is a start: it is intended to gather
input and will eventually lead to a more concrete proposal.

If I have missed anything, feel free to add

Best Regards
Lars


