Re: [datameet] Comprehensive list of GOI online services

2017-05-17 Thread konark modi
Thank you @srinivas and @vaishnavi for your feedback.

I have also uploaded a .tsv file:
https://raw.githubusercontent.com/konarkmodi/DigitalIndia/master/data/list_domains.tsv
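
For anyone who wants to recompute the suffix breakdown quoted further down
this thread, here is a minimal Python sketch. It assumes the domains are
already available as plain strings (the column layout of the TSV is not
described in this thread, so loading is left out). The SUFFIXES list and the
helper names are illustrative, not taken from the repo, and the
underscore-style entries in the breakdown (.net_in, .gen_in) are left out,
so such domains would be counted under .in here.

```python
from collections import Counter

# Illustrative suffix list based on the breakdown quoted in this thread.
SUFFIXES = [".gov.in", ".nic.in", ".ac.in", ".co.in", ".org.in", ".res.in",
            ".edu.in", ".org", ".com", ".in", ".net", ".edu", ".info",
            ".aero", ".coop"]

def suffix_of(domain):
    """Return the longest known suffix the domain ends with, else 'other'."""
    host = domain.strip().lower().rstrip(".")
    # Try longer suffixes first so "x.gov.in" is not counted under ".in".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if host.endswith(suffix):
            return suffix
    return "other"

def count_suffixes(domains):
    """Tally non-empty domain strings by suffix."""
    return Counter(suffix_of(d) for d in domains if d.strip())

# Sample hostnames that appear in this thread, used only for illustration.
sample = ["sci.gov.in", "goidirectory.nic.in", "araiindia.com", "datameet.org"]
for suffix, count in count_suffixes(sample).most_common():
    print(suffix, count)
```

A fixed suffix list keeps the sketch short; a fuller version would probably
want a maintained list of public suffixes instead.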



I really like the idea of crowdsourcing. How do you want to proceed?

1. Issue a PR with changes to the TSV, or open an issue?
2. If there is a source that needs to be scraped, open an issue for it?
3. Use this repo as the main source, or move it somewhere more open, maybe
the datameet repo?
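
Whichever route is picked, submissions will probably need to be checked
against the existing list first. A rough Python sketch of that check follows.
The function names are hypothetical, the sample entries are illustrative
rather than the actual contents of the list, and treating "www.example.in"
and "example.in" as the same entry is an assumption that may not suit every
site.

```python
from urllib.parse import urlsplit

def normalize(entry):
    """Reduce a URL or bare hostname to a lowercase hostname without 'www.'."""
    text = entry.strip().lower()
    if "://" not in text:
        text = "//" + text  # lets urlsplit treat the text as a network location
    host = urlsplit(text).hostname or ""
    return host[4:] if host.startswith("www.") else host

def new_domains(existing, proposed):
    """Return proposed domains not already listed, de-duplicated and sorted."""
    known = {normalize(e) for e in existing}
    return sorted({normalize(p) for p in proposed} - known - {""})

# Illustrative data only, built from hostnames mentioned in this thread.
existing = ["http://goidirectory.nic.in/index.php"]
proposed = ["araiindia.com", "www.araiindia.com",
            "https://SCI.gov.in/", "goidirectory.nic.in"]
print(new_domains(existing, proposed))
```

With the sample data above, only araiindia.com and sci.gov.in are reported
as new, since the other entries collapse to hostnames already present.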

If you know of any other resources, let me know and I will pull them in.

-Konark
@konarkmodi

On Wed, May 17, 2017 at 5:26 AM, Vaishnavi Jayakumar (Inclusive India) <
vaishnavi.jayaku...@inclusiveindia.info> wrote:

> Yes please to the crowdsourcing!
>
> Mammoth task - this itself is 10741. (And more popping up all the time.)
>
> Old one that's missing, e.g. = araiindia.com
> New one that's not been updated = sci.gov.in
>
> When, oh when, are they going to be updated to reflect the gov.in default?
> When, oh when, will we stop seeing gmail ids for government work by govt
> officials?
>
> ---
> *VAISHNAVI JAYAKUMAR*
> http://about.me/vjayakumar
>
> On Wed, May 17, 2017 at 7:35 AM, srinivas kodali wrote:
>
>> Unfortunately, this does not cover all the websites. There are more that
>> are not part of the directory. We should probably start crowdsourcing the
>> others.
>>
>> Regards,
>> Srinivas Kodali
>> www.lostprogrammer.com
>>
>> On Wed, May 17, 2017 at 1:40 AM, konark modi wrote:
>>
>>> Hi All,
>>>
>>> I am always looking for a comprehensive list of GOI websites in a
>>> consumable format for various projects. Hence I decided to scrape
>>> http://goidirectory.nic.in/index.php. (YES! There is no HTTPS for this
>>> link.)
>>>
>>> I have dumped a list of websites:
>>> https://raw.githubusercontent.com/konarkmodi/DigitalIndia/master/data/list_domains.json
>>>
>>> *Number of Websites:* 10741
>>>
>>> Suffix    Count
>>> .gov.in   4805
>>> .nic.in   2766
>>> .org      855
>>> .com      566
>>> .ac.in    499
>>> .in       485
>>> .co.in    209
>>> .org.in   176
>>> .res.in   158
>>> .edu.in   110
>>> .net      37
>>> .edu      26
>>> .net_in   9
>>> .info     7
>>> .aero     2
>>> .gen_in   1
>>> .coop     1
>>>
>>>
>>> I hope this list is useful for a number of projects and studies.
>>>
>>> Please feel free to add missing domains or other relevant information.
>>> The working repo is: https://github.com/konarkmodi/DigitalIndia
>>>
>>>
>>> -Konark
>>> @konarkmodi
>>>
>>> --
>>> Datameet is a community of Data Science enthusiasts in India. Know more
>>> about us by visiting http://datameet.org
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "datameet" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to datameet+unsubscr...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>


Re: [datameet] Comprehensive list of GOI online services

2017-05-16 Thread Vaishnavi Jayakumar (Inclusive India)
Yes please to the crowdsourcing!

Mammoth task - this itself is 10741. (And more popping up all the time.)

Old one that's missing, e.g. = araiindia.com
New one that's not been updated = sci.gov.in

When, oh when, are they going to be updated to reflect the gov.in default?
When, oh when, will we stop seeing gmail ids for government work by govt
officials?

---
*VAISHNAVI JAYAKUMAR*
http://about.me/vjayakumar



Re: [datameet] Comprehensive list of GOI online services

2017-05-16 Thread srinivas kodali
Unfortunately, this does not cover all the websites. There are more that are
not part of the directory. We should probably start crowdsourcing the others.

Regards,
Srinivas Kodali
www.lostprogrammer.com
