Mike and Denis,

I have also experienced the slow checkout and was shocked when I saw over 400 MB of data in the repo. I agree that hosting the *.nc files outside of the source code repo is the best solution to that issue, and the JPL internal PowerPoint files and docs really should be purged too.
If I can help out, let me know.

-Cam

On Mon, Mar 10, 2014 at 12:53 PM, Michael Joyce <[email protected]> wrote:

> I think that would be great Denis! I can go ahead and look at doing
> something similar for the other ocw/ocw-ui components as well. I'm sure
> this will help us out a good bit.
>
> Thanks!
>
> -- Joyce
>
> On Mon, Mar 10, 2014 at 11:20 AM, denis.nadeau <[email protected]> wrote:
>
> > Michael,
> >
> > I like the idea of having the NetCDF files in an external repository.
> >
> > I was thinking that it might be better to point the people to satellite
> > data at the different DAACs so that they can download the files directly.
> > That would work for the "obs4MIPs" program. I would feel better about
> > it as well, I have been worried to be told by some data providers (ECMWF)
> > that we are not authorized to distribute their original data. I initially
> > did not think about this when I checked in my original code.
> >
> > I just found out that ECMWF now allows people to download their data in
> > "NetCDF" instead of "GRIB" using Python [1]. I tried it before, but could
> > only retrieve GRIB data and did not want to mess with "Grads" ctl files and
> > CDMS2/CDAT package. So now, I could just create a script to download the
> > right files and rename them to the appropriate filenames for obs4MIPs
> > examples.
> >
> > I would feel much better about this. Let me know what you think.
> >
> > [1] https://software.ecmwf.int/wiki/display/WEBAPI/Accessing+ECMWF+data+servers+in+batch
> >
> > Denis
> >
> > On 3/10/14 1:06 PM, Michael Joyce wrote:
> >
> >> Hi guys,
> >>
> >> An unfortunate side effect of our export from SVN to Git is that we've
> >> ended up with a rather bloated repository. We've had a large number of
> >> binary files in our repo in the past and all of this has been rolled up
> >> into an obnoxious ~500 MB pack file. I've been completely unable to clone
> >> the repo on my home internet because it constantly times out and it's
> >> painfully slow on my faster work connection.
> >>
> >> To fix this problem I suggest we do the following:
> >> - Remove all binary files from our repo and host them externally. For
> >> example, NetCDF files can be downloaded when they're needed and cleaned up
> >> afterwards (for tests or examples).
> >> - Remove all the bloat from our pack file. I was digging through stuff
> >> earlier and found a number of very large and outdated files in our pack
> >> file (~300 MB NC file, internal JPL presentations/files from a long time
> >> ago, etc.). We should be able to use [1] to help automate this for us,
> >> although we can also take care of it on our own if need be.
> >>
> >> Let me know what you guys think the best course of action is. That being
> >> said, dealing with this sooner rather than later would be nice =D
> >>
> >> [1] https://github.com/cmaitchison/git_diet
> >>
> >> -- Joyce
> >
> > --
> > -----------------------------------------------------
> > Denis Nadeau, (CSC)
> > NCCS (NASA Center for Climate Simulation)
> > NASA Goddard Space Flight Center
> > Mailcode 606.2
> > 8800 Greenbelt Road
> > Greenbelt, MD 20771
> > Email: [email protected]
> > Phone: (301) 286-7286 Fax: 301.286.1634
> > -----------------------------------------------------
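
For reference, the "download when needed, clean up afterwards" idea from the thread could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name `fetched_file`, the `ocw-data-` directory prefix, and the example URL are all hypothetical, nothing here is taken from the ocw code base, and it relies only on the Python standard library.

```python
import contextlib
import shutil
import tempfile
import urllib.request
from pathlib import Path


@contextlib.contextmanager
def fetched_file(url, filename):
    """Download `url` into a fresh temporary directory and yield the local path.

    The directory and the downloaded file are removed when the `with`
    block exits, whether or not the body raised an exception.
    """
    # Hypothetical prefix; any name works.
    tmp_dir = Path(tempfile.mkdtemp(prefix="ocw-data-"))
    local_path = tmp_dir / filename
    try:
        # Stream the response to disk so large files are not held in memory.
        with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
            shutil.copyfileobj(response, out)
        yield local_path
    finally:
        shutil.rmtree(tmp_dir, ignore_errors=True)


# Example use in a test (the URL is a placeholder, not a real data location):
#
# with fetched_file("https://example.org/data/sample.nc", "sample.nc") as path:
#     run_checks_on(path)   # hypothetical test body
```

Passing the target filename separately also covers the renaming step Denis mentions: the file can be saved under whatever name the examples expect, regardless of what the remote server calls it.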
