Data from the Antyodaya mission is already available on the Indiadataportal 
site. I am not sure whether it is complete, but you can check.

On Friday, 11 February 2022 at 05:34:36 UTC+5:30 [email protected] wrote:

> Hello Seniors,
> Where can I get the city shapefiles for Tarapur, Aurangabad, and Nashik?
> All of these cities are in Maharashtra State.
>
>
> Uzair
>
> On Mon, Feb 7, 2022 at 12:42 PM Piyush Kumar <[email protected]> wrote:
>
>> Thank you Sanjay and Nikhil. I think these are good starting points to 
>> try and figure out how to get this done and I am sure with some time and 
>> effort, it is possible.
>>
>> Piyush
>>
>> On Sun, 6 Feb 2022 at 17:48, Nikhil VJ <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I don't think Selenium is required - this looks like it can be done by 
>>> just varying the request payload of a single POST API call.
>>> The POST call goes to this URL: 
>>> https://missionantyodaya.nic.in/preloginVillageInfrastructureReports2020.html
>>> The request content type is application/x-www-form-urlencoded.
>>>
>>> At *state level*, the request payload looks like:
>>> stateCode: 27
>>> stateName: MAHARASHTRA
>>> districtCode: 
>>> districtName: 
>>> blockCode: 
>>> blockName: 
>>> gpCode: 
>>> gpName: 
>>>
>>> At *district level* it becomes:
>>> stateCode: 27
>>> stateName: MAHARASHTRA
>>> districtCode: 469
>>> districtName: AURANGABAD
>>> blockCode: 
>>> blockName: 
>>> gpCode: 
>>> gpName: 
>>>
>>> then *block level*:
>>> stateCode: 27
>>> stateName: MAHARASHTRA
>>> districtCode: 469
>>> districtName: AURANGABAD
>>> blockCode: 4315
>>> blockName: KHULTABAD
>>> gpCode: 
>>> gpName: 
>>>
>>> then *GP level*:
>>> stateCode: 27
>>> stateName: MAHARASHTRA
>>> districtCode: 469
>>> districtName: AURANGABAD
>>> blockCode: 4315
>>> blockName: KHULTABAD
>>> gpCode: 170584
>>> gpName: BODKHA
>>>
>>> In Python, one can use BeautifulSoup to capture the table data as 
>>> well as the (code, name) pairs for the next level.
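
The four payloads above differ only in how many levels are filled in, so a small helper can build all of them. Here is a minimal sketch using only the Python standard library. The names `build_payload` and `build_request` are made up for illustration, and I have not checked whether the server also expects cookies or other hidden fields.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL quoted in the message above.
REPORT_URL = "https://missionantyodaya.nic.in/preloginVillageInfrastructureReports2020.html"

# Field-name prefixes, outermost level first.
LEVELS = ("state", "district", "block", "gp")


def build_payload(*selected):
    """Build the form fields from (code, name) pairs, outermost level first.

    Levels that are not selected are sent as empty strings, matching the
    payloads listed above.
    """
    if len(selected) > len(LEVELS):
        raise ValueError("at most four levels: state, district, block, gp")
    payload = {}
    for i, level in enumerate(LEVELS):
        code, name = selected[i] if i < len(selected) else ("", "")
        payload[level + "Code"] = str(code)
        payload[level + "Name"] = name
    return payload


def build_request(payload, url=REPORT_URL):
    """Wrap the payload in a form-encoded POST request (nothing is sent here)."""
    body = urlencode(payload).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


# Block-level payload from the example above.
block_level = build_payload(
    (27, "MAHARASHTRA"), (469, "AURANGABAD"), (4315, "KHULTABAD")
)
request = build_request(block_level)
```

Passing `request` to `urllib.request.urlopen` would actually send it. Keeping the payload construction separate from the network call makes it easy to check the fields before hitting the site.
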
>>>
>>> --
>>> Cheers,
>>> Nikhil VJ
>>> https://nikhilvj.co.in
>>>
>>>
>>> On Fri, Feb 4, 2022 at 1:42 PM Sanjay Bhangar <[email protected]> 
>>> wrote:
>>>
>>>> Piyush -
>>>>
>>>> You could write a Python (or your preferred language) script that just 
>>>> requests the HTML, parses it, and follows the hierarchy, without using 
>>>> Selenium. This could be a fair amount of work because the site doesn't 
>>>> use regular links with GET requests. Instead, when you click on a state 
>>>> in the table, it uses JavaScript to fill hidden form fields with the 
>>>> state code, etc., and then submits the form, causing a POST request to 
>>>> be made with those values.
>>>>
>>>> For example, the links in the table have an onClick handler like 
>>>> "selectState(2,'HIMACHAL PRADESH','preloginDistrictInfrastructureReports2020.html')".
>>>>
>>>> Then, in the javascript, you can see the selectState function defined 
>>>> like so:
>>>>
>>>> function selectState(stateCode, stateName, action) {
>>>>    $("#stateCode").val(stateCode);
>>>>    $("#stateName").val(stateName);
>>>>    $("#reportForm").attr('action', action);
>>>>    $("#reportForm").submit();
>>>> }
>>>>
>>>> In this JS file: 
>>>> https://missionantyodaya.nic.in/resources/antyodaya/js/custom/prelogin/reports/preloginReport.js
>>>>
>>>> So this will make a POST request to 
>>>> preloginDistrictInfrastructureReports2020.html
>>>> with stateCode=2, stateName=HIMACHAL PRADESH
>>>>
>>>> Similarly, there are different onClick handlers defined for selecting 
>>>> districts, etc. that you can trace to see which URLs they call with 
>>>> which parameters. In theory, you could write some HTML parsing code 
>>>> and some regex to go through the items in each table, parse out the 
>>>> parameters and URLs to call, and follow the hierarchy down.
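
The regex part could look roughly like this. It is only a sketch: `parse_handler` is a made-up name, and it assumes the other handlers follow the same shape as `selectState`, i.e. a function name followed by integer and single-quoted string arguments.

```python
import re

# Matches a call such as: selectState(2,'HIMACHAL PRADESH','page.html')
CALL_RE = re.compile(r"\s*(\w+)\s*\((.*)\)", re.DOTALL)
# An argument is either a single-quoted string or a bare integer.
ARG_RE = re.compile(r"'([^']*)'|(\d+)")


def parse_handler(onclick):
    """Split an onClick value into (function name, list of arguments).

    Returns None when the text does not look like a function call.
    """
    match = CALL_RE.match(onclick)
    if match is None:
        return None
    name, raw_args = match.groups()
    args = []
    for quoted, number in ARG_RE.findall(raw_args):
        args.append(int(number) if number else quoted)
    return name, args


handler = (
    "selectState(2,'HIMACHAL PRADESH',"
    "'preloginDistrictInfrastructureReports2020.html')"
)
name, args = parse_handler(handler)
```

For `selectState`, the last argument is the page to POST to, and the first two are the values that end up in the `stateCode` and `stateName` form fields.
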
>>>>
>>>> So, in theory you could write this without mucking around with 
>>>> selenium, but it also seems like a lot more work than if the site was 
>>>> structured "normally" with unique URLs and GET requests. 
>>>>
>>>> For the page numbering, this seems okay: the HTML outputs all the items 
>>>> across all the pages, and then the actual pagination on the page is purely 
>>>> client-side javascript - so if you were to read the HTML on the page via 
>>>> python or so, you would just get all the items in the table without having 
>>>> to worry about pagination.
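
Since every row is already in the HTML, pulling the table out is a single pass over the page. Below is a minimal sketch that uses the standard library's `html.parser` instead of a third-party parser. The class name `TableRows` and the sample markup are invented for illustration; the real pages will have different columns.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the pieces and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Made-up sample standing in for a downloaded report page.
SAMPLE = """
<table>
  <tr><th>S.No.</th><th>State</th></tr>
  <tr><td>1</td><td><a href="#">HIMACHAL PRADESH</a></td></tr>
  <tr><td>2</td><td><a href="#">MAHARASHTRA</a></td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE)
rows = parser.rows
```

Here `rows[0]` holds the header cells and the remaining entries hold the data rows, regardless of how the page splits them up for display.
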
>>>>
>>>> Unfortunately, this does seem like a lot of work and I don't really 
>>>> have the time to do anything, but it seemed like an interesting problem 
>>>> and I was curious, so I took a look. Hope it helps a bit.
>>>>
>>>> All the best,
>>>> Sanjay
>>>>
>>>> On Fri, Feb 4, 2022 at 1:03 PM Piyush Kumar <[email protected]> 
>>>> wrote:
>>>>
>>>>> Could folks here suggest how to go about this? 
>>>>>
>>>>>
>>>>> https://missionantyodaya.nic.in/preloginStateInfrastructureReports2020.html
>>>>>
>>>>> When we click this link, we get data on village-level infrastructure 
>>>>> spread over multiple HTML tables across many pages (separated into 
>>>>> state, district, block, etc.).
>>>>>
>>>>> Suppose I want to scrape data up to the village level for a particular 
>>>>> state. Is there any way I can get it done without too much back and 
>>>>> forth over Selenium WebDriver? Please note that to access village-level 
>>>>> data you have to go through a nested hierarchy of links (gram panchayat 
>>>>> within block, which is within a district, and so on). To make matters 
>>>>> more complicated, the pages have also not been numbered.
>>>>>
>>>>> Can someone in the know help me figure this out?
>>>>>
>>>>> Thanks in advance
>>>>> Piyush
>>>>>
>>>>> -- 
>>>>> Datameet is a community of Data Science enthusiasts in India. Know 
>>>>> more about us by visiting http://datameet.org
>>>>> --- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "datameet" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/datameet/CAFtOtdujRhq36O4SW%3Dtie%2BSDH_6Pq1R87B6nVerzU4giQVka%3Dw%40mail.gmail.com
>>>>>  
>>>>> <https://groups.google.com/d/msgid/datameet/CAFtOtdujRhq36O4SW%3Dtie%2BSDH_6Pq1R87B6nVerzU4giQVka%3Dw%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
