juergbi commented on code in PR #1997:
URL: https://github.com/apache/buildstream/pull/1997#discussion_r1995346164
##########
src/buildstream/source.py:
##########
@@ -262,10 +262,123 @@ def __init__(
@dataclass
class AliasSubstitution:
+ """AliasSubstitution()
+ An opaque data structure which may be passed through
+ :func:`SourceFetcher.fetch() <buildstream.source.SourceFetcher.fetch>` and in such cases
+ must be provided to :func:`Source.translate_url() <buildstream.source.Source.translate_url>`.
+ """
+
_effective_alias: str
_mirror: Union[SourceMirror, str]
+class SourceInfoMedium(FastEnum):
+ """
+ Indicates the meduim in which the source is obtained
+
+ *Since: 2.5*
+ """
+
+ LOCAL = "local"
+ """
+ Files stored locally in the project
+ """
+
+ ARCHIVE = "archive"
Review Comment:
`LOCAL` and `ARCHIVE` seem like orthogonal attributes to me. A project can
contain a local archive, and a single file (e.g. a config file) can be fetched
from a remote server, at least theoretically. Maybe `LOCAL` should be a flag
rather than part of this enum? (Or it could even be implicit, as the
corresponding URL will have no host component.)
I'm also not sure whether we should have one enum entry per archive file
format (e.g. `TAR` and `ZIP`) or whether a generic `ARCHIVE` makes more
sense.
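To illustrate the implicit variant, here is a minimal sketch, assuming locality is derived from the URL rather than stored in the enum. `Medium`, its `FILE` entry, and `is_local()` are hypothetical names for illustration only, not part of this PR or of BuildStream.

```python
from enum import Enum
from urllib.parse import urlparse


class Medium(Enum):
    # Hypothetical stand-in for SourceInfoMedium with no LOCAL entry;
    # FILE is an assumed entry for "a single, non-archive file".
    FILE = "file"
    ARCHIVE = "archive"
    GIT = "git"


def is_local(url: str) -> bool:
    # Assumption: a URL without a host component refers to project-local files.
    return not urlparse(url).netloc


# Medium and locality vary independently of each other.
local_archive = ("files/data.tar.gz", Medium.ARCHIVE)
remote_config = ("https://example.com/app.conf", Medium.FILE)

for url, medium in (local_archive, remote_config):
    print(url, medium.value, "local" if is_local(url) else "remote")
```

With this shape, a local archive and a remotely fetched single file are both representable without adding enum entries for each combination.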
##########
src/buildstream/source.py:
##########
@@ -262,10 +262,123 @@ def __init__(
@dataclass
class AliasSubstitution:
+ """AliasSubstitution()
+ An opaque data structure which may be passed through
+ :func:`SourceFetcher.fetch() <buildstream.source.SourceFetcher.fetch>` and in such cases
+ must be provided to :func:`Source.translate_url() <buildstream.source.Source.translate_url>`.
+ """
+
_effective_alias: str
_mirror: Union[SourceMirror, str]
+class SourceInfoMedium(FastEnum):
+ """
+ Indicates the meduim in which the source is obtained
Review Comment:
```suggestion
Indicates the medium in which the source is obtained
```
##########
src/buildstream/downloadablefilesource.py:
##########
@@ -270,6 +270,13 @@ def fetch(self): # pylint: disable=arguments-differ
"File downloaded from {} has sha256sum '{}', not '{}'!".format(self.url, sha256, self.ref)
)
+ def collect_source_info(self):
+ #
+ # XXX remote sources are not necessarily archives, perhaps we should
+ # allow downloadablefilesource imlementations to choose the SourceInfoMedium
+ #
+ return [SourceInfo(self.url, SourceInfoMedium.ARCHIVE, SourceVersionType.SHA256, self.ref)]
Review Comment:
`self.url` is a fully qualified URL after alias translation. The unique key
of a source (and thus the cache key of an element) covers only the aliased
URL, though. That is, `collect_source_info()` may return different fully
qualified URLs (including URLs of internal mirrors) for builds with the same
cache key.
I'm not sure how to solve this, but it seems like a potential issue. Or am I
misreading the code?
This issue is not specific to `DownloadableFileSource`, of course.
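A minimal sketch of the concern, assuming a simplified `alias:path` scheme. `resolve_alias()` is a hypothetical helper, not BuildStream's `Source.translate_url()`, and the alias name, hosts, and ref value are made up for illustration.

```python
def resolve_alias(aliased_url: str, aliases: dict) -> str:
    # Hypothetical, simplified alias expansion: "alias:rest" -> base + rest.
    alias, sep, rest = aliased_url.partition(":")
    if sep and alias in aliases:
        return aliases[alias] + rest
    return aliased_url


aliased_url = "upstream:releases/foo-1.0.tar.gz"
ref = "abc123"  # placeholder for the sha256 ref

# Assumption: the unique key covers only the aliased URL and the ref.
unique_key = (aliased_url, ref)

# Two configurations resolve the same alias to different hosts.
via_upstream = resolve_alias(aliased_url, {"upstream": "https://example.org/"})
via_mirror = resolve_alias(aliased_url, {"upstream": "https://mirror.internal.example/"})

print(unique_key)    # identical for both builds
print(via_upstream)  # reported URL in one build
print(via_mirror)    # reported URL in the other build
```

Both builds share `unique_key`, yet the URL that would be reported as source info differs between them.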
##########
src/buildstream/source.py:
##########
@@ -262,10 +262,123 @@ def __init__(
@dataclass
class AliasSubstitution:
+ """AliasSubstitution()
+ An opaque data structure which may be passed through
+ :func:`SourceFetcher.fetch() <buildstream.source.SourceFetcher.fetch>` and in such cases
+ must be provided to :func:`Source.translate_url() <buildstream.source.Source.translate_url>`.
+ """
+
_effective_alias: str
_mirror: Union[SourceMirror, str]
+class SourceInfoMedium(FastEnum):
+ """
+ Indicates the meduim in which the source is obtained
+
+ *Since: 2.5*
+ """
+
+ LOCAL = "local"
+ """
+ Files stored locally in the project
+ """
+
+ ARCHIVE = "archive"
+ """
+ An archive file
+ """
+
+ GIT = "git"
+ """
+ A git repository
+ """
+
+
+class SourceVersionType(FastEnum):
+ """
+ Indicates the type of the version string
+
+ *Since: 2.5*
+ """
+
+ VERSION = "version"
+ """
+ The upstream version string, which may be semantic version
+ """
+
+ COMMIT = "commit"
+ """
+ A commit string which accurately represents a version in a source
+ code repository or VCS
+ """
+
+ SHA256 = "sha256"
+ """
+ An sha256 checksum
+ """
+
+ DIGEST = "digest"
+ """
+ A CAS digest representing the unique version of this source input
+ """
+
+
+class SourceInfo:
+ """SourceInfo()
+
+ An object representing the provenance of input reported by
+ :func:`Source.collect_source_info() <buildstream.source.Source.collect_source_info>`
+
+ *Since: 2.5*
+ """
+
+ def __init__(self, url: str, medium: str, version_type: str, version: str):
+ # XXX assert medium and version_type are valid values for the enums
+
+ self.url: str = url
+ """
+ The url of the source input
+ """
+
+ self.medium: str = medium
+ """
+ The :class:`.SourceInfoMedium` of the source input
+ """
+
+ self.version_type: str = version_type
Review Comment:
A single source URL may have both an upstream version and, e.g., a commit or
hash. Do we want or need to support this? Or is the idea to handle this with
multiple `SourceInfo` objects pointing to the same URL?
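For the second option, a minimal sketch of what that could look like, assuming one object per version type. `Info` and `VersionType` are hypothetical simplified stand-ins for `SourceInfo` and `SourceVersionType`, and the URL and version values are invented.

```python
from dataclasses import dataclass
from enum import Enum


class VersionType(Enum):
    # Hypothetical stand-in for SourceVersionType
    VERSION = "version"
    COMMIT = "commit"
    SHA256 = "sha256"


@dataclass(frozen=True)
class Info:
    # Hypothetical, reduced stand-in for SourceInfo
    url: str
    version_type: VersionType
    version: str


url = "https://example.org/foo.git"

# The same URL is reported twice, once per kind of version information.
infos = [
    Info(url, VersionType.VERSION, "1.2.3"),
    Info(url, VersionType.COMMIT, "0123abc"),
]

# A consumer would need to regroup the entries by URL to see them together.
by_type = {info.version_type: info.version for info in infos}
print(by_type)
```

The trade-off is that consumers must correlate entries by URL, whereas a single object carrying several versions would keep them together.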