Re: Follow-up on RDF byte range

Yev Bronshteyn Fri, 10 Jul 2015 09:00:56 -0700

Wouldn’t the definition of “a line” also vary across operating systems? Surely, 
we don’t want to depend on the existence of a VCS performing EOL replacement in 
any environment where the SPDX file is examined.
As someone pointed out in Tuesday’s call, the byte indices would refer to the 
version of the file described by the checksum that is already mandatory for 
each file.
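
As a rough sketch of that idea in Python: a consumer could refuse to apply
byte indices unless the file at hand matches the recorded SHA-1. The helper
name read_snippet and the 1-based, inclusive offsets are assumptions made for
illustration; they are not taken from the spec or from any SPDX tool.

```python
import hashlib


def read_snippet(data: bytes, expected_sha1: str, start: int, end: int) -> bytes:
    """Return bytes start..end (1-based, inclusive) of data.

    Hypothetical helper: the range is only meaningful for the exact
    file contents identified by expected_sha1.
    """
    actual = hashlib.sha1(data).hexdigest()
    if actual != expected_sha1.lower():
        raise ValueError("checksum mismatch: byte range does not apply to this file")
    if not 1 <= start <= end <= len(data):
        raise ValueError("byte range outside file")
    # Convert the 1-based inclusive range to a 0-based half-open slice.
    return data[start - 1:end]
```

With this check, a checkout whose line endings were rewritten fails loudly
instead of yielding the wrong bytes.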



On Jul 10, 2015, at 9:16 AM, Philippe Ombredanne 
<[email protected]> wrote:

On Fri, Jul 10, 2015 at 5:41 AM, Gary O'Neall 
<[email protected]> wrote:
Hi Yev,
Thanks for the pointer to the pointer vocabulary.
Below are some of my thoughts - feel free to propose alternatives or provide 
more specific examples on how we may use the pointer class for Snippets.
- I do think  using the pointer classes would work for our purposes and would 
have the advantage of using an already defined vocabulary.  It is a bit more 
complex, but manageable.
- I noticed that the pointers RDF vocabulary defines byte offsets based on 1 
for the first byte in the document (not zero).  If we want to re-use these 
terms, we would need to define the byte ranges relative to 1 for both RDF and 
Tag/Value for compatibility.
- Pointers include a required property to reference the document the byte range 
applies to.  We could use the URI for the SPDX file as the value for this 
property.  This would be somewhat redundant with the SPDX File property.  Not sure 
if we should retain both of these properties or not.  I'm currently leaning 
toward retaining both properties.
- There are a few choices on how to represent the byte range.  After looking 
through the doc, the ByteOffsetCompoundPointer uses an offset relative to the 
startPointer (the pointer to the beginning of the range).  Based on the tag/value 
definition where the start byte and end bytes are relative to the beginning of 
the file, a StartEndPointer may be a better fit.  The startPointer and 
endPointer would be a ByteOffsetPointer class to represent a byte offset.
- If we want to include optional line number offset, we could use the 
LineCharPointer class.
Below is an example based on my understanding of the pointer vocabulary:
[....]
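
Separately from the elided example, the 1-based convention and the
StartEndPointer shape described in the bullets above can be sketched in
Python. The class names mirror the vocabulary terms mentioned in the message,
but the classes, their field names, and the choice to treat the end offset as
inclusive are all assumptions made for illustration. This is not an RDF
serialization and not an existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteOffsetPointer:
    reference: str  # URI of the file the offset applies to (assumed field name)
    offset: int     # 1-based: the first byte of the file is offset 1


@dataclass(frozen=True)
class StartEndPointer:
    start_pointer: ByteOffsetPointer
    end_pointer: ByteOffsetPointer

    def to_slice(self) -> slice:
        """Convert the 1-based inclusive range to a 0-based Python slice."""
        if self.start_pointer.reference != self.end_pointer.reference:
            raise ValueError("start and end must refer to the same file")
        if not 1 <= self.start_pointer.offset <= self.end_pointer.offset:
            raise ValueError("offsets must satisfy 1 <= start <= end")
        return slice(self.start_pointer.offset - 1, self.end_pointer.offset)
```

Both offsets are measured from the beginning of the file, which matches the
tag/value definition described above rather than an offset relative to the
start pointer.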

I think using bytes is impractical.
For instance, files from a simple git or svn checkout may be different
byte for byte on different machines and different settings
(end-of-line replacement, keyword substitution, etc.).
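
A small Python sketch of the end-of-line case, using made-up file contents and
a hypothetical helper line_start_offset: the same three lines of source have
different byte offsets and a different SHA-1 once LF is replaced by CRLF,
while the line numbers are unchanged.

```python
import hashlib

# Made-up contents: the same three lines with LF and with CRLF endings.
lf = b"int a;\nint b;\nint c;\n"
crlf = lf.replace(b"\n", b"\r\n")


def line_start_offset(data: bytes, line_no: int) -> int:
    """Return the 1-based byte offset at which 1-based line line_no starts."""
    lines = data.splitlines(keepends=True)
    return sum(len(line) for line in lines[:line_no - 1]) + 1


print(line_start_offset(lf, 3))    # 15
print(line_start_offset(crlf, 3))  # 17
# The checksums differ even though the line count is the same.
print(hashlib.sha1(lf).hexdigest() == hashlib.sha1(crlf).hexdigest())  # False
print(len(lf.splitlines()) == len(crlf.splitlines()))                  # True
```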

We are talking about line-oriented text source code.
Why not use the simpler, natural and human-understandable start and end line?
The compounded complexity of RDF and bytes is unlikely to be warranted here.

--
Cordially
Philippe Ombredanne
_______________________________________________
Spdx-tech mailing list
[email protected]
https://lists.spdx.org/mailman/listinfo/spdx-tech

