Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-14 Thread Hiroaki Yutani
I just found the policy is updated and I now understand why GitHub matters
in your opinion. Thanks for the clarification, I forgot this fact.

>  CRAN does not regard github.com (which hosts the index of crates.io) as
sufficiently reliable.

The good news is that, as of Rust 1.68, Cargo supports the "sparse" index
protocol [1][2]. In this case, the index is hosted at
https://index.crates.io/, crates.io's own infrastructure. So, if I
understand correctly, when all the CRAN servers have Cargo >=1.68
installed, CRAN is ready to believe crates.io is reliable?
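
To make the version condition concrete, here is a minimal Python sketch (an illustration only, not part of any CRAN tooling) that parses the text printed by `cargo --version` and reports whether that Cargo is new enough for the sparse protocol. The helper names are hypothetical. Note that in Cargo 1.68 and 1.69 the sparse protocol is opt-in, for example via the environment variable CARGO_REGISTRIES_CRATES_IO_PROTOCOL=sparse; it became the default for crates.io in 1.70.

```python
import re

# Cargo is versioned together with Rust; 1.68 is the first release in which
# the sparse registry protocol is stable.
SPARSE_MIN = (1, 68)


def parse_cargo_version(output):
    """Extract (major, minor, patch) from the text printed by `cargo --version`."""
    match = re.search(r"cargo (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unrecognized version string: {output!r}")
    return tuple(int(part) for part in match.groups())


def supports_sparse(output):
    """True if the reported Cargo version is at least 1.68 (hypothetical helper)."""
    return parse_cargo_version(output)[:2] >= SPARSE_MIN
```

On a build machine, the input string would come from running `cargo --version`; the sketch deliberately takes the text as an argument so it can be checked without Cargo installed.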

Note that, at the time of writing, the version on Debian testing is
still 1.66 [3], and it's not updated very frequently (about once a year?),
so it will probably be a while before that day comes.

Best,
Yutani


[1]:
https://blog.rust-lang.org/2023/03/09/Rust-1.68.0.html#cargos-sparse-protocol
[2]:
https://blog.rust-lang.org/inside-rust/2023/01/30/cargo-sparse-protocol.html
[3]: https://packages.debian.org/testing/cargo (it seems 0.66 means 1.66)

On Fri, Jul 14, 2023 at 9:58 Hiroaki Yutani wrote:

> Simon,
>
> Sorry that my question was not clear. Let me clarify.
>
> I think we all agree that "cargo vendor" is the primary option. Since
> downloading without explicit permission is not allowed on CRAN in general,
> it's reasonable. I'm happy that the instructions will describe it clearly.
>
> But, some R packages have too large dependencies to bundle. In this case,
> downloading can be allowed with "the explicit permission of the CRAN team,"
> if I understand correctly. For this, I think Cargo's downloading mechanism
> satisfies this requirement if (1) all the dependencies are from crates.io
> and (2) Cargo.lock exists:
>
> > download a specific version from a secure site and check that the
> download is the expected code by some sort of checksum
>
> Because Cargo downloads specific versions recorded in Cargo.lock, verifies
> the checksums, and crates.io is the "secure site" that we can rely on as
> Hadley wrote.
>
> My question is, does CRAN allow Cargo to download the dependency sources
> on CRAN? The policy says:
>
> > So downloading of Rust ‘crates’ will in future require the explicit
> permission of the CRAN team
>
> To my eyes, this implies
>
> - CRAN currently allows Cargo's downloading of dependency Rust crates even
> without the permission
> - CRAN will keep allowing Cargo's downloading if the package author asks
> the permission
>
> And, if CRAN doesn't allow it, I (and probably many Rust users) would like
> to know why. As I described above, it should satisfy the requirement.
>
> >  please don't cross-post
>
> Sorry.
>
> > I thought cargo build --offline is not needed if the dependencies are
> already vendored?
>
> Yes, you are right. --offline is not needed if vendoring is properly
> configured. But, this probably means you have to review the build
> configurations in .cargo/config.toml or so, so I just thought it would be
> easier for you to check if --offline is specified to the command. This
> seems a bit off-topic, so please ignore.
>
> Best,
> Yutani
>
>
> On Fri, Jul 14, 2023 at 9:06 Simon Urbanek wrote:
>
>>
>>
>> > On Jul 14, 2023, at 11:19 AM, Hadley Wickham 
>> wrote:
>> >
>> >>> If CRAN cannot trust even the official one of Rust, why does CRAN
>> have Rust at all?
>> >>>
>> >>
>> >> I don't see the connection - if you downloaded something in the past
>> it doesn't mean you will be able to do so in the future. And CRAN has Rust
>> because it sounded like a good idea to allow packages to use it, but I can
>> see that it opened a can of worms that we are trying to tame here.
>> >
>> > Can you give a bit more detail about your concerns here? Obviously
>> > crates.io isn't some random site on the internet, it's the official
>> > repository of the Rust language, supported by the corresponding
>> > foundation for the language. To me that makes it feel very much like
>> > CRAN, where we can assume if you downloaded something in the past, you
>> > can download something in the future.
>> >
>>
>> I was just responding to Yutani's question why we downloaded the Rust
>> compilers on CRAN at all. This has really nothing to do with the previous
>> discussion which is why I did say "I don't see the connection". Also I
>> wasn't talking about crates.io anywhere in my responses in this thread.
>> The only thing I wanted to discuss here was that I think the existing Rust
>> model  ("vendor" into the package sources) seems like a good one to apply
>> to Go, but that got somehow hijacked...
>>
>> Cheers,
>> Simon
>>
>>


__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-13 Thread Hiroaki Yutani
Simon,

Sorry that my question was not clear. Let me clarify.

I think we all agree that "cargo vendor" is the primary option. Since
downloading without explicit permission is not allowed on CRAN in general,
it's reasonable. I'm happy that the instructions will describe it clearly.

But, some R packages have too large dependencies to bundle. In this case,
downloading can be allowed with "the explicit permission of the CRAN team,"
if I understand correctly. For this, I think Cargo's downloading mechanism
satisfies this requirement if (1) all the dependencies are from crates.io and
(2) Cargo.lock exists:

> download a specific version from a secure site and check that the
download is the expected code by some sort of checksum

Because Cargo downloads specific versions recorded in Cargo.lock, verifies
the checksums, and crates.io is the "secure site" that we can rely on as
Hadley wrote.
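
As a rough illustration of conditions (1) and (2), the following Python sketch scans the text of a Cargo.lock and reports every package that either does not come from crates.io or lacks a checksum. It is a sketch under assumptions, not an official check: the function names are hypothetical, the parser handles only the simple `key = "value"` lines that Cargo writes, and the example lock contents (including the all-zero checksum and the example.com Git URL) are placeholders.

```python
import re

# Source string Cargo records in Cargo.lock for crates from crates.io.
CRATES_IO = "registry+https://github.com/rust-lang/crates.io-index"

# Placeholder lock file: one local package, one registry crate, one Git crate.
EXAMPLE_LOCK = '''
version = 3

[[package]]
name = "mypkg"
version = "0.1.0"
dependencies = [
 "libc",
 "foo",
]

[[package]]
name = "libc"
version = "0.2.147"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0000000000000000000000000000000000000000000000000000000000000000"

[[package]]
name = "foo"
version = "0.1.2"
source = "git+https://example.com/foo.git#0123456789abcdef0123456789abcdef01234567"
'''


def parse_lock_packages(lock_text):
    """Return one dict of string fields per [[package]] table."""
    packages = []
    current = None
    for raw in lock_text.splitlines():
        line = raw.strip()
        if line == "[[package]]":
            current = {}
            packages.append(current)
        elif line.startswith("["):
            current = None  # some other table, e.g. [metadata]
        else:
            match = re.match(r'^(\w+) = "(.*)"$', line)
            if match and current is not None:
                current[match.group(1)] = match.group(2)
    return packages


def lock_problems(lock_text):
    """List packages that are not checksummed crates.io crates."""
    issues = []
    for pkg in parse_lock_packages(lock_text):
        label = f"{pkg.get('name')} {pkg.get('version')}"
        source = pkg.get("source")
        if source is None:
            continue  # local package (no download involved)
        if source != CRATES_IO:
            issues.append(f"{label}: not from crates.io ({source})")
        elif "checksum" not in pkg:
            issues.append(f"{label}: missing checksum")
    return issues
```

An empty result from `lock_problems` would mean that every downloadable dependency is pinned to a crates.io version with a recorded checksum; any Git or alternative-registry source shows up in the list.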

My question is, does CRAN allow Cargo to download the dependency sources on
CRAN? The policy says:

> So downloading of Rust ‘crates’ will in future require the explicit
permission of the CRAN team

To my eyes, this implies

- CRAN currently allows Cargo's downloading of dependency Rust crates even
without the permission
- CRAN will keep allowing Cargo's downloading if the package author asks
the permission

And, if CRAN doesn't allow it, I (and probably many Rust users) would like
to know why. As I described above, it should satisfy the requirement.

>  please don't cross-post

Sorry.

> I thought cargo build --offline is not needed if the dependencies are
already vendored?

Yes, you are right. --offline is not needed if vendoring is properly
configured. But, this probably means you have to review the build
configurations in .cargo/config.toml or so, so I just thought it would be
easier for you to check if --offline is specified to the command. This
seems a bit off-topic, so please ignore.
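
For reference, "properly configured" here means that .cargo/config.toml redirects crates.io to a local directory, which is the snippet `cargo vendor` prints when it finishes. The Python sketch below checks for that redirection in a config file's text. It is only an illustration: the function names are hypothetical, and the parser understands just the small TOML subset needed here (table headers and string values).

```python
import re

# The source replacement that `cargo vendor` suggests adding to .cargo/config.toml.
VENDOR_CONFIG = '''
[source.crates-io]
replace-with = "vendored-sources"

[source.vendored-sources]
directory = "vendor"
'''


def parse_tables(config_text):
    """Parse [table] headers and key = "string" pairs (tiny TOML subset)."""
    tables = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumes no '#' inside strings
        header = re.match(r"^\[([^\[\]]+)\]$", line)
        if header:
            current = tables.setdefault(header.group(1).strip(), {})
            continue
        pair = re.match(r'^([\w-]+)\s*=\s*"(.*)"$', line)
        if pair and current is not None:
            current[pair.group(1)] = pair.group(2)
    return tables


def vendor_directory(config_text):
    """Return the directory crates.io is replaced with, or None if not vendored."""
    tables = parse_tables(config_text)
    replacement = tables.get("source.crates-io", {}).get("replace-with")
    if replacement is None:
        return None
    return tables.get(f"source.{replacement}", {}).get("directory")
```

Checking for `--offline` on the `cargo build` command line is the simpler test, as suggested above; this sketch only shows what the alternative review of the config file would involve.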

Best,
Yutani


On Fri, Jul 14, 2023 at 9:06 Simon Urbanek wrote:

>
>
> > On Jul 14, 2023, at 11:19 AM, Hadley Wickham 
> wrote:
> >
> >>> If CRAN cannot trust even the official one of Rust, why does CRAN have
> Rust at all?
> >>>
> >>
> >> I don't see the connection - if you downloaded something in the past it
> doesn't mean you will be able to do so in the future. And CRAN has Rust
> because it sounded like a good idea to allow packages to use it, but I can
> see that it opened a can of worms that we are trying to tame here.
> >
> > Can you give a bit more detail about your concerns here? Obviously
> > crates.io isn't some random site on the internet, it's the official
> > repository of the Rust language, supported by the corresponding
> > foundation for the language. To me that makes it feel very much like
> > CRAN, where we can assume if you downloaded something in the past, you
> > can download something in the future.
> >
>
> I was just responding to Yutani's question why we downloaded the Rust
> compilers on CRAN at all. This has really nothing to do with the previous
> discussion which is why I did say "I don't see the connection". Also I
> wasn't talking about crates.io anywhere in my responses in this thread.
> The only thing I wanted to discuss here was that I think the existing Rust
> model  ("vendor" into the package sources) seems like a good one to apply
> to Go, but that got somehow hijacked...
>
> Cheers,
> Simon
>
>



Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-13 Thread Dirk Eddelbuettel


The concerns over github going away (!!) (or altering references, tags,
releases, ...) may be somewhat alleviated by Software Heritage [1] covering
and 'preserving' it.  FWIW I briefly spoke about that initiative and a possible
CRAN connection at useR! in Toulouse four years ago [2].

I think I understand where CRAN is coming from. Builds for Debian have the
same requirements of 'everything all at once better be local'.  Sadly what I
see in day to day life (hello cmake, hello vcpkg) moves firmly the other way.
We shall see how it all shakes out.

I would be very much in favor of a workable Rust (and then Go, and so on)
solution.

Dirk

[1] https://www.softwareheritage.org/
[2] https://dirk.eddelbuettel.com/papers/useR2019_swh_cran_talk.pdf

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-13 Thread Simon Urbanek



> On Jul 14, 2023, at 11:19 AM, Hadley Wickham  wrote:
> 
>>> If CRAN cannot trust even the official one of Rust, why does CRAN have Rust 
>>> at all?
>>> 
>> 
>> I don't see the connection - if you downloaded something in the past it 
>> doesn't mean you will be able to do so in the future. And CRAN has Rust 
>> because it sounded like a good idea to allow packages to use it, but I can 
>> see that it opened a can of worms that we are trying to tame here.
> 
> Can you give a bit more detail about your concerns here? Obviously
> crates.io isn't some random site on the internet, it's the official
> repository of the Rust language, supported by the corresponding
> foundation for the language. To me that makes it feel very much like
> CRAN, where we can assume if you downloaded something in the past, you
> can download something in the future.
> 

I was just responding to Yutani's question why we downloaded the Rust compilers 
on CRAN at all. This has really nothing to do with the previous discussion 
which is why I did say "I don't see the connection". Also I wasn't talking 
about crates.io anywhere in my responses in this thread. The only thing I 
wanted to discuss here was that I think the existing Rust model  ("vendor" into 
the package sources) seems like a good one to apply to Go, but that got somehow 
hijacked...

Cheers,
Simon



Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-13 Thread Duncan Murdoch

On 13/07/2023 7:19 p.m., Hadley Wickham wrote:
>>> If CRAN cannot trust even the official one of Rust, why does CRAN have
>>> Rust at all?
>>
>> I don't see the connection - if you downloaded something in the past it
>> doesn't mean you will be able to do so in the future. And CRAN has Rust
>> because it sounded like a good idea to allow packages to use it, but I can
>> see that it opened a can of worms that we are trying to tame here.
>
> Can you give a bit more detail about your concerns here? Obviously
> crates.io isn't some random site on the internet, it's the official
> repository of the Rust language, supported by the corresponding
> foundation for the language. To me that makes it feel very much like
> CRAN, where we can assume if you downloaded something in the past, you
> can download something in the future.


That last statement is true, but also sort of false: you should be able 
to download the same version of the package that you downloaded last 
time, but you might not be able to download a version of the package 
that works with the current version of R.


Duncan Murdoch



Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-13 Thread Hadley Wickham
> > If CRAN cannot trust even the official one of Rust, why does CRAN have Rust 
> > at all?
> >
>
> I don't see the connection - if you downloaded something in the past it 
> doesn't mean you will be able to do so in the future. And CRAN has Rust 
> because it sounded like a good idea to allow packages to use it, but I can 
> see that it opened a can of worms that we are trying to tame here.

Can you give a bit more detail about your concerns here? Obviously
crates.io isn't some random site on the internet, it's the official
repository of the Rust language, supported by the corresponding
foundation for the language. To me that makes it feel very much like
CRAN, where we can assume if you downloaded something in the past, you
can download something in the future.

Hadley

-- 
http://hadley.nz



Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-13 Thread Simon Urbanek
Yutani,

[moving back to the original thread, please don't cross-post]


> On Jul 13, 2023, at 3:34 PM, Hiroaki Yutani  wrote:
> 
> Hi Simon,
> 
> Thanks for the response. I thought
> 
>> download a specific version from a secure site and check that the
> download is the expected code by some sort of checksum
> 
> refers to the usual process that's done by Cargo automatically. If it's
> not, I think the policy should have a clear explanation. It seems it's not
> only me who wondered why this policy doesn't mention Cargo.lock at all.
> 


as explained. The instructions will be updated to make it clear that "cargo 
vendor" is the right tool here.


>> it is not expected to use cargo to resolve them from random (possibly
> inaccessible) places
> 
> Yes, I agree with you. So, I suggested the possibility of forbidding the Git 
> dependency. Or, do you call crates.io, Rust's official repository, "random 
> places"?


No, as I understand it, the lock file can have arbitrary URLs, that's what I 
was referring to.


> If CRAN cannot trust even the official one of Rust, why does CRAN have Rust 
> at all?
> 


I don't see the connection - if you downloaded something in the past it doesn't 
mean you will be able to do so in the future. And CRAN has Rust because it 
sounded like a good idea to allow packages to use it, but I can see that it 
opened a can of worms that we are trying to tame here.


> That said, I agree with your concern about downloading via the Internet in
> general. Downloading is one of the common sources of failure. If you want
> to prevent cargo from downloading any source files, you can enforce adding
> --offline option to "cargo build". While the package author might feel
> unhappy, I think this would make your intent a bit clearer.
> 


I'm not a cargo expert, but I thought cargo build --offline is not needed if 
the dependencies are already vendored? If you think cargo users need more help 
with the steps, then feel free to propose what the instructions should say (we 
really assume that the authors know what they are doing).

Cheers,
Simon



Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-13 Thread Hiroaki Yutani
Thank you for the correction. I see.

Best,
Yutani

On Thu, Jul 13, 2023 at 16:08 Tomas Kalibera wrote:

>
> On 7/13/23 05:08, Hiroaki Yutani wrote:
> > I actually use cargo vendor.
> >
> >
> https://github.com/yutannihilation/string2path/blob/main/src/rust/vendor.sh
> >
> > One thing to note is that, prior to R 4.3.0, the vendored directories hit
> > the Windows' path limit so I had to put them into a TAR file. I haven't
> > tested on R 4.3.0, but probably this problem is solved by this
> improvement.
> > So, if you target only R >= 4.3, you can just cargo vendor.
> >
> >
> https://blog.r-project.org/2023/03/07/path-length-limit-on-windows/index.html
>
> I wouldn't rely on long paths on Windows being supported even in R >=
> 4.3, because it requires at least Windows 10 1607, and it needs to be
> enabled system-wide in Windows - so, users/admins have to do that, and
> it impacts also other applications. The blog post has more details and
> recommendations.
>
> Best
> Tomas
>
> >
> > Best,
> > Yutani
> >
> > On Thu, Jul 13, 2023 at 11:50 Kevin Ushey wrote:
> >
> >> Package authors could use 'cargo vendor' to include Rust crate sources
> >> directly in their source R packages. Would that be acceptable?
> >>
> >> Presumably, the vendored sources would be built using the versions
> >> specified in an accompanying Cargo.lock as well.
> >>
> >> https://doc.rust-lang.org/cargo/commands/cargo-vendor.html
> >>
> >>
> >> On Wed, Jul 12, 2023, 7:35 PM Simon Urbanek <
> simon.urba...@r-project.org>
> >> wrote:
> >>
> >>> Yutani,
> >>>
> >>> I'm not quite sure your reading fully matches the intent of the policy.
> >>> Cargo.lock is not sufficient, it is expected that the package will
> provide
> >>> *all* the sources, it is not expected to use cargo to resolve them from
> >>> random (possibly inaccessible) places. So the package author is
> expected to
> >>> either include the sources in the package *or* (if prohibitive due to
> >>> extreme size) have a release tar ball available at a fixed, secure,
> >>> reliable location (I was recommending Zenodo.org for that reason -
> GitHub
> >>> is neither fixed nor reliable by definition).
> >>>
> >>> Based on that, I'm not sure I fully understand the scope of your
> proposal
> >>> for improvement. Cargo.lock is certainly the first step that the
> package
> >>> author should take in creating the distribution tar ball so you can
> fix the
> >>> versions, but it is not sufficient as the next step involves
> collecting the
> >>> related sources. We don't want R users to be involved in that can of
> worms
> >>> (especially since the lock file itself provides no guarantees of
> >>> accessibility of the components and we don't want to have to manually
> >>> inspect it), the package should be ready to be used which is why it
> has to
> >>> do that step first. Does that explain the intent better? (In general,
> the
> >>> downloading at install time is actually a problem, because it's not
> >>> uncommon to use R in environments that have no Internet access, but the
> >>> download is a concession for extreme cases where the tar balls may be
> too
> >>> big to make it part of the package, but it's yet another can of
> worms...).
> >>>
> >>> Cheers,
> >>> Simon
> >>>
> >>>
> >>>
> >>>> On 13/07/2023, at 12:37 PM, Hiroaki Yutani wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I'm glad to see CRAN now has its official policy about Rust [1]!
> >>>> It seems it probably needs some feedback from those who are familiar
> >>>> with the Rust workflow. I'm not an expert, but let me leave some quick
> >>>> feedback. This email is sent to the R-package-devel mailing list as
> >>>> well as to cran@~ so that we can publicly discuss.
> >>>>
> >>>> It seems most of the concern is about how to make the build
> >>>> deterministic. In this regard, the policy should encourage including
> >>>> "Cargo.lock" file [2]. Cargo.lock is created on the first compile, and
> >>>> the resolved versions of dependencies are recorded. As long as this
> >>>> file exists, the dependency versions are locked to the ones in this
> >>>> file, except when the package author explicitly updates the versions.
> >>>>
> >>>> Cargo.lock also records the SHA256 checksums of the crates if they are
> >>>> from crates.io, Rust's official crate registry. If the checksums don't
> >>>> match, the build will fail with the following message:
> >>>>
> >>>>    error: checksum for `foo v0.1.2` changed between lock files
> >>>>
> >>>>    this could be indicative of a few possible errors:
> >>>>
> >>>>    * the lock file is corrupt
> >>>>    * a replacement source in use (e.g., a mirror) returned a
> >>>>      different checksum
> >>>>    * the source itself may be corrupt in one way or another
> >>>>
> >>>>    unable to verify that `foo v0.1.2` is the same as when the
> >>>>    lockfile was generated
> >>>>
> >>>> For dependencies from Git repositories, Cargo.lock records the commit
> >>>> hashes. So, the 

Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-13 Thread Tomas Kalibera



On 7/13/23 05:08, Hiroaki Yutani wrote:
> I actually use cargo vendor.
>
> https://github.com/yutannihilation/string2path/blob/main/src/rust/vendor.sh
>
> One thing to note is that, prior to R 4.3.0, the vendored directories hit
> the Windows' path limit so I had to put them into a TAR file. I haven't
> tested on R 4.3.0, but probably this problem is solved by this improvement.
> So, if you target only R >= 4.3, you can just cargo vendor.
>
> https://blog.r-project.org/2023/03/07/path-length-limit-on-windows/index.html


I wouldn't rely on long paths on Windows being supported even in R >= 
4.3, because it requires at least Windows 10 1607, and it needs to be 
enabled system-wide in Windows - so, users/admins have to do that, and 
it impacts also other applications. The blog post has more details and 
recommendations.
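
One way to see whether a vendored tree is at risk is to measure the paths before building the source package. The Python sketch below is a heuristic illustration only: the function name and the `prefix_len` parameter (an estimate of how long the installation directory in front of the package will be) are hypothetical, and 260 is the classic Windows MAX_PATH value.

```python
import os

# Classic Windows MAX_PATH; paths at or beyond this length fail unless
# long-path support is enabled system-wide.
WINDOWS_MAX_PATH = 260


def long_paths(root, prefix_len=0, limit=WINDOWS_MAX_PATH):
    """Return relative paths under root whose estimated full length reaches limit.

    prefix_len approximates the length of the directory the tree will be
    unpacked into on the target machine (unknown in advance, so a guess).
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            if prefix_len + len(rel) >= limit:
                hits.append(rel)
    return sorted(hits)
```

Calling `long_paths("src/rust/vendor", prefix_len=80)` (both values are made up for the example) would list the vendored files that may not unpack on a default Windows setup.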


Best
Tomas



> Best,
> Yutani
>
> On Thu, Jul 13, 2023 at 11:50 Kevin Ushey wrote:
>
>> Package authors could use 'cargo vendor' to include Rust crate sources
>> directly in their source R packages. Would that be acceptable?
>>
>> Presumably, the vendored sources would be built using the versions
>> specified in an accompanying Cargo.lock as well.
>>
>> https://doc.rust-lang.org/cargo/commands/cargo-vendor.html
>>
>>
>> On Wed, Jul 12, 2023, 7:35 PM Simon Urbanek wrote:
>>
>>> Yutani,
>>>
>>> I'm not quite sure your reading fully matches the intent of the policy.
>>> Cargo.lock is not sufficient, it is expected that the package will provide
>>> *all* the sources, it is not expected to use cargo to resolve them from
>>> random (possibly inaccessible) places. So the package author is expected to
>>> either include the sources in the package *or* (if prohibitive due to
>>> extreme size) have a release tar ball available at a fixed, secure,
>>> reliable location (I was recommending Zenodo.org for that reason - GitHub
>>> is neither fixed nor reliable by definition).
>>>
>>> Based on that, I'm not sure I fully understand the scope of your proposal
>>> for improvement. Cargo.lock is certainly the first step that the package
>>> author should take in creating the distribution tar ball so you can fix the
>>> versions, but it is not sufficient as the next step involves collecting the
>>> related sources. We don't want R users to be involved in that can of worms
>>> (especially since the lock file itself provides no guarantees of
>>> accessibility of the components and we don't want to have to manually
>>> inspect it), the package should be ready to be used which is why it has to
>>> do that step first. Does that explain the intent better? (In general, the
>>> downloading at install time is actually a problem, because it's not
>>> uncommon to use R in environments that have no Internet access, but the
>>> download is a concession for extreme cases where the tar balls may be too
>>> big to make it part of the package, but it's yet another can of worms...).
>>>
>>> Cheers,
>>> Simon




>>>> On 13/07/2023, at 12:37 PM, Hiroaki Yutani wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm glad to see CRAN now has its official policy about Rust [1]!
>>>> It seems it probably needs some feedback from those who are familiar
>>>> with the Rust workflow. I'm not an expert, but let me leave some quick
>>>> feedback. This email is sent to the R-package-devel mailing list as
>>>> well as to cran@~ so that we can publicly discuss.
>>>>
>>>> It seems most of the concern is about how to make the build
>>>> deterministic. In this regard, the policy should encourage including
>>>> "Cargo.lock" file [2]. Cargo.lock is created on the first compile, and
>>>> the resolved versions of dependencies are recorded. As long as this
>>>> file exists, the dependency versions are locked to the ones in this
>>>> file, except when the package author explicitly updates the versions.
>>>>
>>>> Cargo.lock also records the SHA256 checksums of the crates if they are
>>>> from crates.io, Rust's official crate registry. If the checksums don't
>>>> match, the build will fail with the following message:
>>>>
>>>>    error: checksum for `foo v0.1.2` changed between lock files
>>>>
>>>>    this could be indicative of a few possible errors:
>>>>
>>>>    * the lock file is corrupt
>>>>    * a replacement source in use (e.g., a mirror) returned a
>>>>      different checksum
>>>>    * the source itself may be corrupt in one way or another
>>>>
>>>>    unable to verify that `foo v0.1.2` is the same as when the
>>>>    lockfile was generated
>>>>
>>>> For dependencies from Git repositories, Cargo.lock records the commit
>>>> hashes. So, the version of the source code (not the version of the
>>>> crate) is uniquely determined. That said, unlike crates.io, it's
>>>> possible that the commit or the Git repository itself has disappeared
>>>> at the time of building, which makes the build fail. So, it might be
>>>> reasonable the CRAN policy prohibits the use of Git dependency unless
>>>> the source code is bundled. I have no strong opinion here.
>>>>
>>>> Accordingly, I believe this sentence
>>>>
>>>>> In practice maintainers have found it nigh-impossible to meet these
>>>>> conditions whilst downloading as they have too little control.
>>>>
>>>> is not quite true. More specifically, these things
>>>>
>>>>> The standard way to download a Rust ‘crate’ is by its version number,
>>>>> and these have been changed without 

Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-12 Thread Hiroaki Yutani
Hi Simon,

Thanks for the response. I thought

> download a specific version from a secure site and check that the
download is the expected code by some sort of checksum

refers to the usual process that's done by Cargo automatically. If it's
not, I think the policy should have a clear explanation. It seems it's not
only me who wondered why this policy doesn't mention Cargo.lock at all.
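
To spell out what "check that the download is the expected code by some sort of checksum" amounts to for a crates.io dependency, here is a minimal Python sketch of the comparison: hash the downloaded .crate archive with SHA256 and compare the result with the value recorded in Cargo.lock. This is an illustration of the idea, not Cargo's actual code, and the function names are hypothetical.

```python
import hashlib


def crate_checksum(crate_bytes):
    """Hex SHA256 digest of a downloaded .crate archive (given as bytes)."""
    return hashlib.sha256(crate_bytes).hexdigest()


def verify_crate(crate_bytes, expected_hex):
    """Raise if the archive does not match the checksum from Cargo.lock."""
    actual = crate_checksum(crate_bytes)
    if actual != expected_hex.lower():
        raise ValueError(
            f"checksum mismatch: expected {expected_hex}, got {actual}"
        )
    return True
```

A mismatch here corresponds to the situation in which Cargo stops the build with a checksum error instead of compiling unexpected code.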

> it is not expected to use cargo to resolve them from random (possibly
inaccessible) places

Yes, I agree with you. So, I suggested the possibility of forbidding the
Git dependency. Or, do you call crates.io, Rust's official repository,
"random places"? If CRAN cannot trust even the official one of Rust, why
does CRAN have Rust at all?

That said, I agree with your concern about downloading via the Internet in
general. Downloading is one of the common sources of failure. If you want
to prevent cargo from downloading any source files, you can enforce adding
--offline option to "cargo build". While the package author might feel
unhappy, I think this would make your intent a bit clearer.

Best,
Yutani


On Thu, Jul 13, 2023 at 11:34 Simon Urbanek wrote:

> Yutani,
>
> I'm not quite sure your reading fully matches the intent of the policy.
> Cargo.lock is not sufficient, it is expected that the package will provide
> *all* the sources, it is not expected to use cargo to resolve them from
> random (possibly inaccessible) places. So the package author is expected to
> either include the sources in the package *or* (if prohibitive due to
> extreme size) have a release tar ball available at a fixed, secure,
> reliable location (I was recommending Zenodo.org for that reason - GitHub
> is neither fixed nor reliable by definition).
>
> Based on that, I'm not sure I fully understand the scope of your proposal
> for improvement. Cargo.lock is certainly the first step that the package
> author should take in creating the distribution tar ball so you can fix the
> versions, but it is not sufficient as the next step involves collecting the
> related sources. We don't want R users to be involved in that can of worms
> (especially since the lock file itself provides no guarantees of
> accessibility of the components and we don't want to have to manually
> inspect it), the package should be ready to be used which is why it has to
> do that step first. Does that explain the intent better? (In general, the
> downloading at install time is actually a problem, because it's not
> uncommon to use R in environments that have no Internet access, but the
> download is a concession for extreme cases where the tar balls may be too
> big to make it part of the package, but it's yet another can of worms...).
>
> Cheers,
> Simon
>
>
>
> > On 13/07/2023, at 12:37 PM, Hiroaki Yutani  wrote:
> >
> > Hi,
> >
> > I'm glad to see CRAN now has its official policy about Rust [1]!
> > It seems it probably needs some feedback from those who are familiar with
> > the Rust workflow. I'm not an expert, but let me leave some quick
> feedback.
> > This email is sent to the R-package-devel mailing list as well as to
> cran@~
> > so that we can publicly discuss.
> >
> > It seems most of the concern is about how to make the build
> deterministic.
> > In this regard, the policy should encourage including "Cargo.lock" file
> > [2]. Cargo.lock is created on the first compile, and the resolved
> versions
> > of dependencies are recorded. As long as this file exists, the dependency
> > versions are locked to the ones in this file, except when the package
> > author explicitly updates the versions.
> >
> > Cargo.lock also records the SHA256 checksums of the crates if they are
> from
> > crates.io, Rust's official crate registry. If the checksums don't match,
> > the build will fail with the following message:
> >
> >    error: checksum for `foo v0.1.2` changed between lock files
> >
> >    this could be indicative of a few possible errors:
> >
> >    * the lock file is corrupt
> >    * a replacement source in use (e.g., a mirror) returned a
> >      different checksum
> >    * the source itself may be corrupt in one way or another
> >
> >    unable to verify that `foo v0.1.2` is the same as when the
> >    lockfile was generated
> >
> > For dependencies from Git repositories, Cargo.lock records the commit
> > hashes. So, the version of the source code (not the version of the crate)
> > is uniquely determined. That said, unlike crates.io, it's possible that
> the
> > commit or the Git repository itself has disappeared at the time of
> > building, which makes the build fail. So, it might be reasonable the CRAN
> > policy prohibits the use of Git dependency unless the source code is
> > bundled. I have no strong opinion here.
> >
> > Accordingly, I believe this sentence
> >
> >> In practice maintainers have found it nigh-impossible to meet these
> > conditions whilst downloading as they have too little control.
> >
> > is not quite true. More specifically, these things
> >
> >> The 

Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-12 Thread Hiroaki Yutani
I actually use cargo vendor.

https://github.com/yutannihilation/string2path/blob/main/src/rust/vendor.sh

One thing to note is that, prior to R 4.3.0, the vendored directories hit
the Windows' path limit so I had to put them into a TAR file. I haven't
tested on R 4.3.0, but probably this problem is solved by this improvement.
So, if you target only R >= 4.3, you can just cargo vendor.
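
The TAR workaround can be sketched as follows. This Python version is only an illustration of the idea and is not the content of the vendor.sh script linked above; the function names and the vendor.tar.gz file name are hypothetical. Packing the vendored tree into a single archive keeps the long nested paths out of the R source package itself.

```python
import os
import tarfile


def pack_vendor(vendor_dir, archive_path):
    """Write vendor_dir into a gzip-compressed tar archive rooted at 'vendor'."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(vendor_dir, arcname="vendor")
    return archive_path


def archive_members(archive_path):
    """List the member paths stored in the archive, for a quick sanity check."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

At install time the archive would be unpacked (for example with `tar -xf`) before `cargo build` runs, so the path-length question then depends on where it is extracted.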

https://blog.r-project.org/2023/03/07/path-length-limit-on-windows/index.html

Best,
Yutani

On Thu, Jul 13, 2023 at 11:50 Kevin Ushey wrote:

> Package authors could use 'cargo vendor' to include Rust crate sources
> directly in their source R packages. Would that be acceptable?
>
> Presumably, the vendored sources would be built using the versions
> specified in an accompanying Cargo.lock as well.
>
> https://doc.rust-lang.org/cargo/commands/cargo-vendor.html
>
>
> On Wed, Jul 12, 2023, 7:35 PM Simon Urbanek 
> wrote:
>
>> Yutani,
>>
>> I'm not quite sure your reading fully matches the intent of the policy.
>> Cargo.lock is not sufficient, it is expected that the package will provide
>> *all* the sources, it is not expected to use cargo to resolve them from
>> random (possibly inaccessible) places. So the package author is expected to
>> either include the sources in the package *or* (if prohibitive due to
>> extreme size) have a release tar ball available at a fixed, secure,
>> reliable location (I was recommending Zenodo.org for that reason - GitHub
>> is neither fixed nor reliable by definition).
>>
>> Based on that, I'm not sure I fully understand the scope of your proposal
>> for improvement. Cargo.lock is certainly the first step that the package
>> author should take in creating the distribution tar ball so you can fix the
>> versions, but it is not sufficient as the next step involves collecting the
>> related sources. We don't want R users to be involved in that can of worms
>> (especially since the lock file itself provides no guarantees of
>> accessibility of the components and we don't want to have to manually
>> inspect it), the package should be ready to be used which is why it has to
>> do that step first. Does that explain the intent better? (In general, the
>> downloading at install time is actually a problem, because it's not
>> uncommon to use R in environments that have no Internet access, but the
>> download is a concession for extreme cases where the tar balls may be too
>> big to make it part of the package, but it's yet another can of worms...).
>>
>> Cheers,
>> Simon
>>
>>
>>
>> > On 13/07/2023, at 12:37 PM, Hiroaki Yutani wrote:
>> >
>> > Hi,
>> >
>> > I'm glad to see CRAN now has its official policy about Rust [1]!
>> > It seems it probably needs some feedback from those who are familiar with
>> > the Rust workflow. I'm not an expert, but let me leave some quick feedback.
>> > This email is sent to the R-package-devel mailing list as well as to cran@~
>> > so that we can publicly discuss.
>> >
>> > It seems most of the concern is about how to make the build deterministic.
>> > In this regard, the policy should encourage including "Cargo.lock" file
>> > [2]. Cargo.lock is created on the first compile, and the resolved versions
>> > of dependencies are recorded. As long as this file exists, the dependency
>> > versions are locked to the ones in this file, except when the package
>> > author explicitly updates the versions.
>> >
>> > Cargo.lock also records the SHA256 checksums of the crates if they are from
>> > crates.io, Rust's official crate registry. If the checksums don't match,
>> > the build will fail with the following message:
>> >
>> >     error: checksum for `foo v0.1.2` changed between lock files
>> >
>> >     this could be indicative of a few possible errors:
>> >
>> >         * the lock file is corrupt
>> >         * a replacement source in use (e.g., a mirror) returned a
>> >           different checksum
>> >         * the source itself may be corrupt in one way or another
>> >
>> >     unable to verify that `foo v0.1.2` is the same as when the lockfile
>> >     was generated
>> >
>> > For dependencies from Git repositories, Cargo.lock records the commit
>> > hashes. So, the version of the source code (not the version of the crate)
>> > is uniquely determined. That said, unlike crates.io, it's possible that the
>> > commit or the Git repository itself has disappeared at the time of
>> > building, which makes the build fail. So, it might be reasonable the CRAN
>> > policy prohibits the use of Git dependency unless the source code is
>> > bundled. I have no strong opinion here.
>> >
>> > Accordingly, I believe this sentence
>> >
>> >> In practice maintainers have found it nigh-impossible to meet these
>> > conditions whilst downloading as they have too little control.
>> >
>> > is not quite true. More specifically, these things
>> >
>> >> The standard way to download a Rust ‘crate’ is by its version number, and
>> > these have been changed without changing their number.
>> 

Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-12 Thread Simon Urbanek



> On 13/07/2023, at 2:50 PM, Kevin Ushey  wrote:
> 
> Package authors could use 'cargo vendor' to include Rust crate sources 
> directly in their source R packages. Would that be acceptable?
> 


Yes, that is exactly what was suggested in the original thread.

Cheers,
Simon



> Presumably, the vendored sources would be built using the versions specified
> in an accompanying Cargo.lock as well.
> 
> https://doc.rust-lang.org/cargo/commands/cargo-vendor.html
> 
> 
> On Wed, Jul 12, 2023, 7:35 PM Simon Urbanek  
> wrote:
> Yutani,
> 
> I'm not quite sure your reading fully matches the intent of the policy. 
> Cargo.lock is not sufficient, it is expected that the package will provide 
> *all* the sources, it is not expected to use cargo to resolve them from 
> random (possibly inaccessible) places. So the package author is expected to 
> either include the sources in the package *or* (if prohibitive due to extreme 
> size) have a release tar ball available at a fixed, secure, reliable location 
> (I was recommending Zenodo.org for that reason - GitHub is neither fixed nor 
> reliable by definition).
> 
> Based on that, I'm not sure I fully understand the scope of your proposal for 
> improvement. Cargo.lock is certainly the first step that the package author
> should take in creating the distribution tar ball so you can fix the 
> versions, but it is not sufficient as the next step involves collecting the 
> related sources. We don't want R users to be involved in that can of worms 
> (especially since the lock file itself provides no guarantees of 
> accessibility of the components and we don't want to have to manually inspect 
> it), the package should be ready to be used which is why it has to do that 
> step first. Does that explain the intent better? (In general, the downloading 
> at install time is actually a problem, because it's not uncommon to use R in 
> environments that have no Internet access, but the download is a concession 
> for extreme cases where the tar balls may be too big to make it part of the 
> package, but it's yet another can of worms...).
> 
> Cheers,
> Simon
> 
> 
> 
> > On 13/07/2023, at 12:37 PM, Hiroaki Yutani  wrote:
> > 
> > Hi,
> > 
> > I'm glad to see CRAN now has its official policy about Rust [1]!
> > It seems it probably needs some feedback from those who are familiar with
> > the Rust workflow. I'm not an expert, but let me leave some quick feedback.
> > This email is sent to the R-package-devel mailing list as well as to cran@~
> > so that we can publicly discuss.
> > 
> > It seems most of the concern is about how to make the build deterministic.
> > In this regard, the policy should encourage including "Cargo.lock" file
> > [2]. Cargo.lock is created on the first compile, and the resolved versions
> > of dependencies are recorded. As long as this file exists, the dependency
> > versions are locked to the ones in this file, except when the package
> > author explicitly updates the versions.
> > 
> > Cargo.lock also records the SHA256 checksums of the crates if they are from
> > crates.io, Rust's official crate registry. If the checksums don't match,
> > the build will fail with the following message:
> > 
> >     error: checksum for `foo v0.1.2` changed between lock files
> >
> >     this could be indicative of a few possible errors:
> >
> >         * the lock file is corrupt
> >         * a replacement source in use (e.g., a mirror) returned a different
> >           checksum
> >         * the source itself may be corrupt in one way or another
> >
> >     unable to verify that `foo v0.1.2` is the same as when the lockfile was
> >     generated
> > 
> > For dependencies from Git repositories, Cargo.lock records the commit
> > hashes. So, the version of the source code (not the version of the crate)
> > is uniquely determined. That said, unlike crates.io, it's possible that the
> > commit or the Git repository itself has disappeared at the time of
> > building, which makes the build fail. So, it might be reasonable the CRAN
> > policy prohibits the use of Git dependency unless the source code is
> > bundled. I have no strong opinion here.
> > 
> > Accordingly, I believe this sentence
> > 
> >> In practice maintainers have found it nigh-impossible to meet these
> > conditions whilst downloading as they have too little control.
> > 
> > is not quite true. More specifically, these things
> > 
> >> The standard way to download a Rust ‘crate’ is by its version number, and
> > these have been changed without changing their number.
> >> Downloading a ‘crate’ normally entails downloading its dependencies, and
> > that is done without fixing their version numbers
> > 
> > won't happen if the R package does include Cargo.lock because
> > 
> > - if the crate is from crates.io, "the version can never be overwritten,
> > and the code cannot be deleted" there [3]
> > - if the crate is from a Git repository, the commit hash is unique in its
> > nature. The version of the crate might be the same 

Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-12 Thread Kevin Ushey
Package authors could use 'cargo vendor' to include Rust crate sources
directly in their source R packages. Would that be acceptable?

Presumably, the vendored sources would be built using the versions
specified in an accompanying Cargo.lock as well.

https://doc.rust-lang.org/cargo/commands/cargo-vendor.html
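To make the idea of building from vendored sources a little more concrete, here is a minimal Python sketch that re-checks one vendored crate against the per-file checksums stored next to it. It assumes each directory produced by 'cargo vendor' contains a `.cargo-checksum.json` file whose "files" object maps relative paths to hex SHA-256 digests; the function name is made up for this example, and Cargo performs its own verification regardless.

```python
import hashlib
import json
from pathlib import Path


def verify_vendored_crate(crate_dir):
    """Return the relative paths that are missing or whose SHA-256 digest
    differs from the one recorded in `.cargo-checksum.json`.

    Assumes the manifest has a "files" object mapping relative paths to
    hex SHA-256 digests (an assumption of this sketch).
    """
    crate_dir = Path(crate_dir)
    manifest = json.loads((crate_dir / ".cargo-checksum.json").read_text())
    mismatched = []
    for rel_path, expected in sorted(manifest["files"].items()):
        target = crate_dir / rel_path
        if not target.is_file():
            mismatched.append(rel_path)  # file listed but not shipped
            continue
        actual = hashlib.sha256(target.read_bytes()).hexdigest()
        if actual != expected:
            mismatched.append(rel_path)  # file was modified after vendoring
    return mismatched
```

An empty list means every file listed in the manifest is present and unmodified; running this over each subdirectory of vendor/ before building the source package would be one way to catch accidental edits.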


On Wed, Jul 12, 2023, 7:35 PM Simon Urbanek 
wrote:

> Yutani,
>
> I'm not quite sure your reading fully matches the intent of the policy.
> Cargo.lock is not sufficient, it is expected that the package will provide
> *all* the sources, it is not expected to use cargo to resolve them from
> random (possibly inaccessible) places. So the package author is expected to
> either include the sources in the package *or* (if prohibitive due to
> extreme size) have a release tar ball available at a fixed, secure,
> reliable location (I was recommending Zenodo.org for that reason - GitHub
> is neither fixed nor reliable by definition).
>
> Based on that, I'm not sure I fully understand the scope of your proposal
> for improvement. Cargo.lock is certainly the first step that the package
> author should take in creating the distribution tar ball so you can fix the
> versions, but it is not sufficient as the next step involves collecting the
> related sources. We don't want R users to be involved in that can of worms
> (especially since the lock file itself provides no guarantees of
> accessibility of the components and we don't want to have to manually
> inspect it), the package should be ready to be used which is why it has to
> do that step first. Does that explain the intent better? (In general, the
> downloading at install time is actually a problem, because it's not
> uncommon to use R in environments that have no Internet access, but the
> download is a concession for extreme cases where the tar balls may be too
> big to make it part of the package, but it's yet another can of worms...).
>
> Cheers,
> Simon
>
>
>
> > On 13/07/2023, at 12:37 PM, Hiroaki Yutani  wrote:
> >
> > Hi,
> >
> > I'm glad to see CRAN now has its official policy about Rust [1]!
> > It seems it probably needs some feedback from those who are familiar with
> > the Rust workflow. I'm not an expert, but let me leave some quick feedback.
> > This email is sent to the R-package-devel mailing list as well as to cran@~
> > so that we can publicly discuss.
> >
> > It seems most of the concern is about how to make the build deterministic.
> > In this regard, the policy should encourage including "Cargo.lock" file
> > [2]. Cargo.lock is created on the first compile, and the resolved versions
> > of dependencies are recorded. As long as this file exists, the dependency
> > versions are locked to the ones in this file, except when the package
> > author explicitly updates the versions.
> >
> > Cargo.lock also records the SHA256 checksums of the crates if they are from
> > crates.io, Rust's official crate registry. If the checksums don't match,
> > the build will fail with the following message:
> >
> >     error: checksum for `foo v0.1.2` changed between lock files
> >
> >     this could be indicative of a few possible errors:
> >
> >         * the lock file is corrupt
> >         * a replacement source in use (e.g., a mirror) returned a different
> >           checksum
> >         * the source itself may be corrupt in one way or another
> >
> >     unable to verify that `foo v0.1.2` is the same as when the lockfile was
> >     generated
> >
> > For dependencies from Git repositories, Cargo.lock records the commit
> > hashes. So, the version of the source code (not the version of the crate)
> > is uniquely determined. That said, unlike crates.io, it's possible that the
> > commit or the Git repository itself has disappeared at the time of
> > building, which makes the build fail. So, it might be reasonable the CRAN
> > policy prohibits the use of Git dependency unless the source code is
> > bundled. I have no strong opinion here.
> >
> > Accordingly, I believe this sentence
> >
> >> In practice maintainers have found it nigh-impossible to meet these
> > conditions whilst downloading as they have too little control.
> >
> > is not quite true. More specifically, these things
> >
> >> The standard way to download a Rust ‘crate’ is by its version number, and
> > these have been changed without changing their number.
> >> Downloading a ‘crate’ normally entails downloading its dependencies, and
> > that is done without fixing their version numbers
> >
> > won't happen if the R package does include Cargo.lock because
> >
> > - if the crate is from crates.io, "the version can never be overwritten,
> > and the code cannot be deleted" there [3]
> > - if the crate is from a Git repository, the commit hash is unique in its
> > nature. The version of the crate might be the same between commits, but a
> > git dependency is specified by the commit hash, not the version of the
> > crate.
> >
> > I'm keen to know what problems the CRAN maintainers have experienced 

Re: [R-pkg-devel] Feedback on "Using Rust in CRAN packages"

2023-07-12 Thread Simon Urbanek
Yutani,

I'm not quite sure your reading fully matches the intent of the policy. 
Cargo.lock is not sufficient, it is expected that the package will provide 
*all* the sources, it is not expected to use cargo to resolve them from random 
(possibly inaccessible) places. So the package author is expected to either 
include the sources in the package *or* (if prohibitive due to extreme size) 
have a release tar ball available at a fixed, secure, reliable location (I was 
recommending Zenodo.org for that reason - GitHub is neither fixed nor reliable 
by definition).

Based on that, I'm not sure I fully understand the scope of your proposal for 
improvement. Cargo.lock is certainly the first step that the package author
should take in creating the distribution tar ball so you can fix the versions, 
but it is not sufficient as the next step involves collecting the related 
sources. We don't want R users to be involved in that can of worms (especially 
since the lock file itself provides no guarantees of accessibility of the 
components and we don't want to have to manually inspect it), the package 
should be ready to be used which is why it has to do that step first. Does that 
explain the intent better? (In general, the downloading at install time is 
actually a problem, because it's not uncommon to use R in environments that 
have no Internet access, but the download is a concession for extreme cases 
where the tar balls may be too big to make it part of the package, but it's yet 
another can of worms...).
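Since manually inspecting a lock file is exactly the chore mentioned above, a package author could at least automate a first pass before collecting the sources. The Python sketch below is only an illustration under stated assumptions: it reads Cargo.lock with a deliberately simple line-based reader rather than a full TOML parser, it assumes the lock file format in which each [[package]] table carries its own "checksum" field, and it assumes crates.io packages are recorded with the source string held in CRATES_IO. The function names are hypothetical. It says nothing about whether the sources are actually reachable, which is why bundling them is still needed.

```python
import re

# Source string assumed to identify crates.io packages in Cargo.lock.
CRATES_IO = "registry+https://github.com/rust-lang/crates.io-index"
# Only the simple string fields of a [[package]] table are of interest here.
FIELD = re.compile(r'^(name|version|source|checksum) = "(.*)"$')


def parse_packages(lock_text):
    """Collect name/version/source/checksum for each [[package]] table."""
    packages = []
    current = None
    for line in lock_text.splitlines():
        line = line.strip()
        if line == "[[package]]":
            current = {}
            packages.append(current)
        elif line.startswith("["):
            current = None  # some other table, e.g. [metadata]
        elif current is not None:
            match = FIELD.match(line)
            if match:
                current[match.group(1)] = match.group(2)
    return packages


def audit_lockfile(lock_text):
    """Return human-readable notes on entries that need attention."""
    problems = []
    for pkg in parse_packages(lock_text):
        label = f"{pkg.get('name')} {pkg.get('version')}"
        source = pkg.get("source")
        if source is None:
            continue  # local (path) package, shipped with the R package
        if source.startswith("git+"):
            problems.append(f"{label}: git dependency")
        elif source != CRATES_IO:
            problems.append(f"{label}: unexpected source {source}")
        elif "checksum" not in pkg:
            problems.append(f"{label}: no checksum recorded")
    return problems
```

An empty result only means every dependency is pinned to crates.io with a checksum; the release tarball with all sources still has to be produced as a separate step.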

Cheers,
Simon



> On 13/07/2023, at 12:37 PM, Hiroaki Yutani  wrote:
> 
> Hi,
> 
> I'm glad to see CRAN now has its official policy about Rust [1]!
> It seems it probably needs some feedback from those who are familiar with
> the Rust workflow. I'm not an expert, but let me leave some quick feedback.
> This email is sent to the R-package-devel mailing list as well as to cran@~
> so that we can publicly discuss.
> 
> It seems most of the concern is about how to make the build deterministic.
> In this regard, the policy should encourage including "Cargo.lock" file
> [2]. Cargo.lock is created on the first compile, and the resolved versions
> of dependencies are recorded. As long as this file exists, the dependency
> versions are locked to the ones in this file, except when the package
> author explicitly updates the versions.
> 
> Cargo.lock also records the SHA256 checksums of the crates if they are from
> crates.io, Rust's official crate registry. If the checksums don't match,
> the build will fail with the following message:
> 
>     error: checksum for `foo v0.1.2` changed between lock files
>
>     this could be indicative of a few possible errors:
>
>         * the lock file is corrupt
>         * a replacement source in use (e.g., a mirror) returned a different
>           checksum
>         * the source itself may be corrupt in one way or another
>
>     unable to verify that `foo v0.1.2` is the same as when the lockfile was
>     generated
> 
> For dependencies from Git repositories, Cargo.lock records the commit
> hashes. So, the version of the source code (not the version of the crate)
> is uniquely determined. That said, unlike crates.io, it's possible that the
> commit or the Git repository itself has disappeared at the time of
> building, which makes the build fail. So, it might be reasonable the CRAN
> policy prohibits the use of Git dependency unless the source code is
> bundled. I have no strong opinion here.
> 
> Accordingly, I believe this sentence
> 
>> In practice maintainers have found it nigh-impossible to meet these
> conditions whilst downloading as they have too little control.
> 
> is not quite true. More specifically, these things
> 
>> The standard way to download a Rust ‘crate’ is by its version number, and
> these have been changed without changing their number.
>> Downloading a ‘crate’ normally entails downloading its dependencies, and
> that is done without fixing their version numbers
> 
> won't happen if the R package does include Cargo.lock because
> 
> - if the crate is from crates.io, "the version can never be overwritten,
> and the code cannot be deleted" there [3]
> - if the crate is from a Git repository, the commit hash is unique in its
> nature. The version of the crate might be the same between commits, but a
> git dependency is specified by the commit hash, not the version of the
> crate.
> 
> I'm keen to know what problems the CRAN maintainers have experienced that
> Cargo.lock cannot solve. I hope we can help somehow to improve the policy.
> 
> Best,
> Yutani
> 
> [1]: https://cran.r-project.org/web/packages/using_rust.html
> [2]: https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
> [3]: https://doc.rust-lang.org/cargo/reference/publishing.html
> 