Thanks for starting this discussion. Personally, I would prefer keeping them in the repo, even if they are partially outdated during development cycles. However, building SystemDS should always pack these binaries into the self-contained SystemDS.jar to avoid unnecessary friction and unexpected behavior. While we could download them from an external source during the build, that would require keeping the external download source up to date, plus robust build integration so that builds can still be done offline.
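
To illustrate what "robust build integration" for offline builds could look like, here is a minimal Python sketch, not an actual SystemDS build step. The function name `resolve_native_lib`, the cache directory, and the injected `fetch` callable are all hypothetical and chosen only for illustration. The idea is to try the external source first and fall back to a previously cached copy when the download fails.

```python
from pathlib import Path
from typing import Callable


def resolve_native_lib(name: str, cache_dir: Path,
                       fetch: Callable[[str], bytes]) -> Path:
    """Return a local path to the native library `name`.

    `fetch` is a hypothetical downloader supplied by the caller. It should
    return the file contents, or raise OSError when the source is not
    reachable (urllib.error.URLError is a subclass of OSError).
    """
    cached = cache_dir / name
    try:
        data = fetch(name)
    except OSError as exc:
        # Offline or source unavailable: reuse the last cached copy if any.
        if cached.exists():
            return cached
        raise FileNotFoundError(
            f"{name} is neither downloadable nor cached") from exc
    # Download succeeded: refresh the cache so later offline builds work.
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(data)
    return cached
```

A build would call this once per platform binary before packaging the jar. The very first build still needs network access, which is part of the friction mentioned above.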

Regards,
Matthias

On 2/25/2021 2:03 PM, Mark Dokter wrote:
Hi!

Frequently changing binaries in a git repo are a pain and a known problem
[1]. Every change I make to the native code would require updated
binaries. What's even more annoying is booting into Windows or Linux to
also provide the binaries for "the other" platform (depending on where I
work atm). The latter is actually a good thing, though, because it forces
me to frequently check that everything works on the other platform.

There are several methods/projects for dealing with binaries [2]. I
haven't investigated them thoroughly, but maybe we'll use something like
that in the future.

The requirement to have binaries is there because we want our users to
have easy access to native and GPU operations. By that reasoning, I'd
argue that users would take a binary release to start out with SystemDS.
This is why I suggest limiting the publishing of binary files to release
zips/tars.

With up-to-date documentation (mental note to myself), developers who
work with the sources directly can be expected to install the
dependencies and follow a few simple steps with cmake (yes, we can even
put that in a script).

rfc,
Mark


[1]
https://stackoverflow.com/questions/4697216/is-git-good-with-binary-files

[2]
https://stackoverflow.com/questions/540535/managing-large-binary-files-with-git/29530784#29530784
