To avoid creating folders recursively, follow the steps below:

1. Create a folder on your local drive. I created "/home/sujit/Desktop/Data/".
2. Create the script below and run it:

for i in {1901..2012}
do
cd /home/sujit/Desktop/Data/
wget -r --no-parent --reject "index.html*" http://ftp3.ncdc.noaa.gov/pub/data/noaa/$i/
done

On Fri, Nov 16, 2012 at 1:01 PM, Sujit Dhamale <sujitdhamal...@gmail.com> wrote:

> Hi,
> If needed, you can run the script below to store the data on your local system:
>
> for i in {1901..2012}
> do
> cd /home/ubuntu/work/
> wget -r -np -nH -R index.html http://ftp3.ncdc.noaa.gov/pub/data/noaa/$i/
> cd pub/data/noaa/$i/
> cp *.gz /home/ubuntu/work/files
> cd /home/ubuntu/work/
> rm -r pub/
> done
>
> On Mon, Feb 13, 2012 at 3:43 PM, Andy Doddington <a...@doddington.net> wrote:
>
>> OK, well for starters, I think you can safely ignore the PDF data; to
>> paraphrase Star Wars: “that isn’t the data in which you are interested”.
>>
>> Page 16 of the book describes the data format and refers to a data store
>> that contains directories for each year from 1901 to 2001. It also shows
>> the naming of .gz files within a sample directory (1990). The files in
>> this directory have names "010010-99999-1990.gz", "010014-99999-1990.gz",
>> "010015-99999-1990.gz", and so on…
>>
>> Going back to the NCDC web site (http://www.ncdc.noaa.gov) and clicking
>> on the ‘Free Data’ link on the left-hand side of the screen brings up a
>> new screen, as shown below:
>>
>>
>> Clicking again on the ‘Free Data’ link in the middle section of this page
>> brings up another page, listing the available data sets:
>>
>>
>> As this page notes, although some of this data needs to be paid for,
>> there is at least one ‘free’ option within each section. For simplicity,
>> I went for the first one - the one labelled “3505 FTP data access” -
>> which the comment says is free. I used anonymous FTP and found that this
>> site contained directories for each year from 1901 to 2012.
>> I expect the additional directories reflect the fact that time has moved
>> on since the book was written :-)
>>
>> There are also several text or PDF files that provide further information
>> on the contents of the site. I suggest you read some of these to get more
>> details. One of these is called "ish-format-document.pdf" and it seems to
>> describe the document format in some detail. If you open this, you can
>> check whether it matches the format expected by the Hadoop sample code.
>> There is also a ‘software’ directory, which contains various bits of code
>> that might prove useful.
>>
>> On drilling down into the directory for 1990, I get the following list of
>> files:
>>
>>
>> Which looks close enough to the file names in the Hadoop book - I’d guess
>> that these are the correct files.
>>
>> Given the passage of time, it is still possible that the file format has
>> changed to make it incompatible with the Hadoop code. However, it
>> shouldn’t be that difficult to modify the code to suit the new format
>> (which is very well documented, as already noted).
>>
>> Good luck!
>>
>> Andy
>>
>> ----------
>>
>> On 12 Feb 2012, at 08:50, Bing Li wrote:
>>
>> Andy,
>>
>> Since there is a lot of data in the free data section of the site, I
>> cannot figure out which one is the one talked about in the book. Any
>> format differences might cause the source code to throw exceptions. Some
>> data is even in PDF format!
>>
>> Thanks so much!
>> Bing
>>
>> On Sun, Feb 12, 2012 at 4:35 PM, Andy Doddington <a...@doddington.net> wrote:
>>
>> >> According to Page 15 of the book, this data is available from the US
>> >> National Climatic Data Center, at http://www.ncdc.noaa.gov. Once you
>> >> get to this site, there is a menu of links on the left-hand side of the
>> >> page, listed under the heading ‘Data & Products’.
>> >> I suspect that the entry labelled ‘Free Data’ is the most likely area
>> >> you need to investigate :-)
>> >>
>> >> Good Luck
>> >>
>> >> Andy D
>> >>
>> >> ----------
>> >>
>> >> On 12 Feb 2012, at 07:14, Bing Li wrote:
>> >>
>> >> Dear all,
>> >>
>> >> I am following the book Hadoop: The Definitive Guide. However, I got
>> >> stuck because I could not get the NCDC weather data that is used by the
>> >> source code in the book. Appendix C told me I could follow some
>> >> instructions at www.hadoopbook.com, but I could not find the
>> >> instructions there. Could you give me a hand?
>> >>
>> >> Thanks so much!
>> >>
>> >> Best regards,
>> >> Bing
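
For anyone scripting the per-year download loop quoted in this thread, here is a minimal sketch with a dry-run switch, so the commands can be inspected before anything is fetched. The base URL is the one given in the thread. The function names, the DEST directory, and the DRY_RUN variable are hypothetical, and the wget options -nd, -A and -P are a substitution for the cd/cp/rm steps, not something the posters used, so treat this as an assumption to check against your own wget version.

```shell
#!/bin/sh
# Sketch only: BASE_URL comes from the thread; DEST and DRY_RUN are
# hypothetical settings introduced for this example.
BASE_URL="http://ftp3.ncdc.noaa.gov/pub/data/noaa"
DEST="${DEST:-$HOME/ncdc-data}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the wget commands

# Print the directory URL for one year.
year_url() {
  printf '%s/%s/\n' "$BASE_URL" "$1"
}

# Print (dry run) or execute the download for one year.
# -r recurse, -np never ascend to the parent directory,
# -nd do not recreate the remote directory tree locally,
# -A keep only files matching the pattern, -P local target directory.
fetch_year() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "wget -r -np -nd -A '*.gz' -P $DEST/$1 $(year_url "$1")"
  else
    wget -r -np -nd -A '*.gz' -P "$DEST/$1" "$(year_url "$1")"
  fi
}

# Two sample years; widen the list once the printed commands look right.
for year in 1901 1902; do
  fetch_year "$year"
done
```

Run it as is to see the commands, then set DRY_RUN=0 to download. Each year ends up in its own flat directory under DEST, which avoids the nested pub/data/noaa/ tree and the copy-and-delete steps.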