Unfortunately, this does not cover all the websites. There are more that are not part of the directory. We should probably start crowdsourcing the others.

Regards,
Srinivas Kodali
www.lostprogrammer.com

On Wed, May 17, 2017 at 1:40 AM, konark modi <modi.kon...@gmail.com> wrote:

> Hi All,
>
> I am always looking for a comprehensive list of GOI websites in a
> consumable format for various projects. Hence I decided to scrape
> http://goidirectory.nic.in/index.php. (YES! There is no HTTPS for this
> link).
>
> I have dumped a list of websites:
> https://raw.githubusercontent.com/konarkmodi/DigitalIndia/master/data/list_domains.json
>
> *Number of Websites:* 10741
>
> Suffix     Count
> .gov.in    4805
> .nic.in    2766
> .org       855
> .com       566
> .ac.in     499
> .in        485
> .co.in     209
> .org.in    176
> .res.in    158
> .edu.in    110
> .net       37
> .edu       26
> .net_in    9
> .info      7
> .aero      2
> .gen_in    1
> .coop      1
>
> Hope this list is useful for quite a few projects / studies.
>
> Please feel free to add missing domains or other relevant information.
> The working repo is: https://github.com/konarkmodi/DigitalIndia
>
> -Konark
> @konarkmodi
>
> --
> Datameet is a community of Data Science enthusiasts in India. Know more
> about us by visiting http://datameet.org
> ---
> You received this message because you are subscribed to the Google Groups
> "datameet" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to datameet+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
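
For anyone who wants to recompute a suffix breakdown like the one quoted above after adding crowdsourced domains, here is a rough sketch. It assumes the entries are plain URL or hostname strings; the actual structure of list_domains.json is not described in this thread, so the loading step is left out. The names `TWO_LABEL_SUFFIXES`, `domain_suffix` and `suffix_counts` are made up for illustration, and the suffix set is a hand-picked assumption, not an official list.

```python
from collections import Counter
from urllib.parse import urlparse

# Two-label suffixes checked before falling back to the last label.
# Hand-picked for illustration (assumption); extend as needed.
TWO_LABEL_SUFFIXES = {
    "gov.in", "nic.in", "ac.in", "co.in", "org.in",
    "res.in", "edu.in", "net.in", "gen.in",
}


def domain_suffix(entry):
    """Return the suffix (e.g. '.gov.in') of a URL or bare hostname."""
    entry = entry.strip().lower()
    # urlparse only fills in the hostname when a scheme is present.
    if "://" not in entry:
        entry = "http://" + entry
    host = (urlparse(entry).hostname or "").rstrip(".")
    labels = host.split(".")
    last_two = ".".join(labels[-2:])
    if len(labels) >= 2 and last_two in TWO_LABEL_SUFFIXES:
        return "." + last_two
    return "." + labels[-1]


def suffix_counts(entries):
    """Tally how many entries fall under each suffix."""
    return Counter(domain_suffix(e) for e in entries)


if __name__ == "__main__":
    # Tiny made-up sample; replace with the real list once loaded.
    sample = [
        "http://goidirectory.nic.in/index.php",
        "datameet.org",
        "example.gov.in",
    ]
    for suffix, count in suffix_counts(sample).most_common():
        print(suffix, count)
```

Checking the two-label suffixes first is what keeps `.gov.in` and `.nic.in` from being lumped under `.in`; anything not in the set falls back to its last label.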