Mike and Denis,

I have also experienced the slow checkout and was shocked when I saw over
400MB of data in the repo.  I agree that hosting the *.nc files outside of
the source code repo is the best solution to that issue, and the JPL
internal PowerPoint presentations and docs really should be purged too.
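
For what it's worth, here is a rough sketch of the "download when needed,
clean up afterwards" idea for tests and examples. It assumes the files end up
at some URL we can reach; the function name, the temp-directory prefix, and
the example URL are placeholders I made up, not anything in the repo today.

```python
import contextlib
import os
import shutil
import tempfile
import urllib.request


@contextlib.contextmanager
def fetched_dataset(url, filename):
    """Download `url` into a temporary directory, yield the local path,
    and remove the directory again when the caller is done.

    `fetched_dataset` is a hypothetical helper name, not an existing OCW API.
    """
    workdir = tempfile.mkdtemp(prefix="ocw-data-")  # prefix is arbitrary
    local_path = os.path.join(workdir, filename)
    try:
        # Stream the remote file to disk rather than holding it in memory.
        with urllib.request.urlopen(url) as response, \
                open(local_path, "wb") as out:
            shutil.copyfileobj(response, out)
        yield local_path
    finally:
        # Clean up even if the download or the caller's code fails.
        shutil.rmtree(workdir, ignore_errors=True)
```

A test would then wrap its body in something like
`with fetched_dataset("https://example.org/data/sample.nc", "sample.nc") as path:`
(example.org is only a stand-in), and nothing binary would need to live in
the repo.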

If I can help out, let me know.


-Cam



On Mon, Mar 10, 2014 at 12:53 PM, Michael Joyce <[email protected]> wrote:

> I think that would be great Denis! I can go ahead and look at doing
> something similar for the other ocw/ocw-ui components as well. I'm sure
> this will help us out a good bit.
>
> Thanks!
>
>
> -- Joyce
>
>
> On Mon, Mar 10, 2014 at 11:20 AM, denis.nadeau <[email protected]> wrote:
>
> > Michael,
> >
> > I like the idea of having the NetCDF files in an external repository.
> >
> > I was thinking that it might be better to point people to the satellite
> > data at the different DAACs so that they can download the files directly.
> > That would work for the "obs4MIPs" program. I would feel better about
> > it as well, since I have been worried about being told by some data
> > providers (ECMWF) that we are not authorized to distribute their
> > original data. I did not think about this initially when I checked in
> > my original code.
> >
> > I just found out that ECMWF now allows people to download their data in
> > "NetCDF" instead of "GRIB" using Python [1]. I tried it before, but could
> > only retrieve GRIB data and did not want to mess with GrADS ctl files and
> > the CDMS2/CDAT package. So now I could just create a script to download
> > the right files and rename them to the appropriate filenames for the
> > obs4MIPs examples.
> >
> > I would feel much better about this.   Let me know what you think.
> >
> > [1] https://software.ecmwf.int/wiki/display/WEBAPI/Accessing+ECMWF+data+servers+in+batch
> >
> > Denis
> >
> > On 3/10/14 1:06 PM, Michael Joyce wrote:
> >
> >> Hi guys,
> >>
> >> An unfortunate side effect of our export from SVN to Git is that we've
> >> ended up with a rather bloated repository. We've had a large number of
> >> binary files in our repo in the past and all of this has been rolled up
> >> into an obnoxious ~500 MB pack file. I've been completely unable to clone
> >> the repo on my home internet because it constantly times out and it's
> >> painfully slow on my faster work connection.
> >>
> >> To fix this problem I suggest we do the following:
> >> - Remove all binary files from our repo and host them externally. For
> >> example, NetCDF files can be downloaded when they're needed and cleaned
> >> up afterwards (for tests or examples).
> >> - Remove all the bloat from our pack file. I was digging through stuff
> >> earlier and found a number of very large and outdated files in our pack
> >> file (~300 MB NC file, internal JPL presentations/files from a long time
> >> ago, etc.). We should be able to use [1] to help automate this for us,
> >> although we can also take care of it on our own if need be.
> >>
> >> Let me know what you guys think the best course of action is. That being
> >> said, dealing with this sooner rather than later would be nice =D
> >>
> >> [1] https://github.com/cmaitchison/git_diet
> >>
> >> -- Joyce
> >>
> >>
> >
> > --
> > -----------------------------------------------------
> > Denis Nadeau, (CSC)
> > NCCS (NASA Center for Climate Simulation)
> > NASA Goddard Space Flight Center
> > Mailcode 606.2
> > 8800 Greenbelt Road
> > Greenbelt, MD 20771
> > Email: [email protected]
> > Phone: (301) 286-7286           Fax: 301.286.1634
> > -----------------------------------------------------
> >
> >
>
