Thank you for re-raising DVC. I hadn't looked seriously at it, but it
seems a nice balance between git LFS and git annex. I'm recommending
that our team look at it for potential inclusion in Gigantum.

While I'm at it, I think now is also a good time to pitch the
graphical tool we're building: https://gigantum.com

I'm a little hesitant, but I've checked in person with a few deeply
invested folks in the community and it seems reasonable to "advertise"
this on the list - our client is open source (and always will be) and
we are hoping we can build a sustainable model where scientists and
educators can use the thing for free (much like GitHub has done -
though I understand not everyone loves GitHub either).

We are still in beta, but we've put together something that we believe
does a good job of managing git and docker, along with a cloud
synchronization back-end. While this may rankle more experienced
coders, I think it can potentially empower folks who won't (or can't)
learn the whole set of data / software management skills. Part of
being in beta is that the feature set is still somewhat open, and I'd
truly love to get input from other Carpentries instructors on this. A
concrete idea I have is that you could save 1/4 of the time if you
didn't need to teach command-line git, and then perhaps you could get
more students to the point where they write good functions.

Triggered by this thread, the data management piece is of particular
interest. Currently, we have a relatively coarse set of options - each
project is scoped to a docker-bind-mounted folder on your host OS.
There are "input" and "output" directories there managed via git LFS
by default. However, you can also disable tracking of these
directories and then manage them however you like (you shouldn't
currently use a strategy that extends the existing git repo like doing
LFS yourself - though now that I think of it, manual git annex MIGHT
work... I'll have to check when I get some time). One of the major
hobgoblins is the Windows filesystem, of course... and we could
potentially eliminate that by shifting to docker volumes instead of
bind-mounts (but then you lose host OS access).
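
For anyone less familiar with the bind-mount vs. volume distinction,
here is a rough sketch in Python of the two ways a project folder could
be attached to a container. To be clear, this is NOT how the Gigantum
client does it internally; the image name, paths, volume name and the
docker_run_args helper are all made up for illustration. The only real
pieces are docker's own --mount type=bind / type=volume options.

```python
import shlex


def docker_run_args(image, container_path, host_dir=None, volume_name=None):
    """Build a `docker run` argument list for one of two storage strategies.

    host_dir    -> bind mount: files stay visible on the host OS.
    volume_name -> named volume: storage managed by docker, which sidesteps
                   host filesystem quirks but hides the files from the host.
    Hypothetical helper for illustration only.
    """
    if (host_dir is None) == (volume_name is None):
        raise ValueError("pass exactly one of host_dir or volume_name")
    if host_dir is not None:
        mount = f"type=bind,source={host_dir},target={container_path}"
    else:
        mount = f"type=volume,source={volume_name},target={container_path}"
    return ["docker", "run", "--rm", "--mount", mount, image]


# Made-up example values; nothing here is executed against docker.
bind_cmd = docker_run_args(
    "python:3", "/mnt/project", host_dir="/home/me/gigantum/my-project"
)
volume_cmd = docker_run_args(
    "python:3", "/mnt/project", volume_name="my-project-data"
)

print(shlex.join(bind_cmd))
print(shlex.join(volume_cmd))
```

The trade-off shows up in the source= part: with a bind mount it is a
path you can browse on your own machine, while with a volume it is just
a name that docker resolves for you.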

Kicking the tires is super easy via the demo server link, and all you
need to do to use it locally is download the Electron GUI or,
alternatively, install a pip package. It would be great to get input on
what would be valuable. If folks are interested in talking about using
this in workshops (or just developing resources around transparent and
open science strategies and tools), please let me know! I will of course
support anyone who is interested in working with Gigantum, and I would
love to run some workshops in partnership with other folks (we're
currently working on an open/reproducible neuro workshop with folks at
Stanford and Columbia - so if anyone is interested in that
specifically, please let me know soon!).

Best,
Dav

ps - To be clear, it's super-easy to walk away from Gigantum.
Everything is in a git repo, and the Dockerfile is usable (with a bit
of work - which I'd be happy to walk people through) outside of the
platform.

On Thu, Aug 2, 2018 at 11:10 AM <[email protected]> wrote:
>
> Since this thread was highlighted in yesterday's Carpentry Clippings, I'll 
> bet I'm not the last to jump in today, so I'll be brief.
>
> DVC was mentioned at the beginning, but I gather few here have given it a 
> try. I encourage you to take a look. The tool is still in alpha, but 
> developing quickly with a lot of potential. What I like about DVC:
>
> - Works in parallel to git and is similar to git LFS in
>   cloning/pushing/pulling references to data files
> - Data files are not tracked by git; your code repository remains just that
> - Supports external data sources (since 0.10.0); do you really want a copy of
>   your data *within* every repo that reads it?
> - Supports multiple cloud data sources (e.g. Amazon S3)
> - Does not default to "publishing" data on GitHub. GitHub is no Dataverse or
>   Figshare (... data discoverability, yada yada)
> - It's a makefile alternative too!

------------------------------------------
The Carpentries: discuss
Permalink: 
https://carpentries.topicbox.com/groups/discuss/Tb776978a905c0bf8-M6ec28f85bc59ba4ad6e66d6b
Delivery options: https://carpentries.topicbox.com/groups/discuss/subscription
