I looked at this site some months back. Notes from my investigation then:

1. The content (JSON) is served by a commercial GIS operated by BSNL
2. The content comes back split across many JSON files (50-60) with encoded
URLs
3. The API appeared to be stateful (the URLs kept changing)

From this I concluded that scraping it needs
(a) a phantom-like programmable browser control
(b) a network proxy/sniffer to dump all intercepted content
(c) wrangling to relate data across all the JSON files
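
Step (c) could look roughly like the sketch below. It is only an illustration: the field names (school_id, name, lat, lon) and the assumption that every fragment carries a shared key are made up here, not something observed on the site.

```python
import json
from collections import defaultdict


def merge_fragments(fragments, key="school_id"):
    """Combine records from many JSON fragments into one dict per key value.

    `fragments` is an iterable of JSON strings, each decoding to a list of
    objects. The key name is a placeholder; the real payloads may differ.
    """
    merged = defaultdict(dict)
    for text in fragments:
        for record in json.loads(text):
            if key not in record:
                continue  # skip records that cannot be related to anything
            merged[record[key]].update(record)
    return dict(merged)


# Two made-up fragments describing the same school from different responses.
fragments = [
    '[{"school_id": "0000001", "name": "Example School"}]',
    '[{"school_id": "0000001", "lat": 18.52, "lon": 73.85}]',
]
schools = merge_fragments(fragments)
print(schools["0000001"])
```

If the fragments turn out not to share a key, the join would have to go through whatever the encoded URLs reveal, which this sketch does not attempt.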

I also believe there are licensing issues.

On May 21, 2016 3:51 PM, "Raphael Susewind" <[email protected]>
wrote:

> Hi Nikhil,
>
> most likely the flash application loads something like a JSON (or CSV,
> if they are bad programmers ;-) ) from a specified API address. Use a
> network sniffer to intercept the traffic that the flashplayer generates,
> and see whether you can replicate the API.
>
> If you are lucky, you will see HTTP requests to a URL along the lines
> of http://schoolgis.nic.in/state_x/data.json?school=0000001 to
> 000014986. In that case, you can then manually scrape the JSON files (if
> need be by emulating a flashplayer's HTTP headers, though I doubt that
> they check for this).
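
A minimal sketch of the "lucky" case, assuming the URL pattern really is as simple as the example above. The template, the 7-digit zero padding, and the User-Agent header are all assumptions for illustration; none of them is a confirmed detail of the server.

```python
import urllib.request

# Hypothetical pattern copied from the example URL above; not a verified endpoint.
URL_TEMPLATE = "http://schoolgis.nic.in/state_x/data.json?school={school_id}"


def school_urls(first, last, width=7):
    """Yield one URL per school id, zero-padded to `width` digits (assumed)."""
    for number in range(first, last + 1):
        yield URL_TEMPLATE.format(school_id=str(number).zfill(width))


def fetch(url, timeout=30):
    """Download one URL and return the raw response body as bytes."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Only build the URLs here; call fetch(url) on each once the pattern is confirmed.
urls = list(school_urls(1, 3))
print(urls[0])
```

For the full range one would iterate school_urls(1, 14986) and pause between requests so as not to hammer the server.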
>
> If you are unlucky, it's a more complex API - some stateful frontends for
> SQL databases can be very nasty to replicate, for instance. One brute
> force kind of solution in such cases would be to write a custom proxy
> server (there are python/perl/... modules for this) - i.e. a kind of
> customized sniffer - and route your browser traffic through this, then
> automate the browser (again, there are plugins for firefox and chrome
> that have corresponding python or perl interfaces), and intercept the
> traffic generated. That's the solution I found to scrape polling station
> localities from the ECI server (before they put a bold copyright
> disclaimer on it - now this kind of scraping would probably be illegal -
> so do check these issues as well).
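
The dumping half of the proxy approach can be sketched without committing to any particular proxy library. The function below is a hypothetical hook that a proxy (custom, or an existing tool such as mitmproxy) would call once per intercepted response; the name dump_if_json and its parameters are invented for this example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def dump_if_json(url, content_type, body, out_dir):
    """Save a response body to out_dir if it looks like JSON.

    Returns the path written, or None when the response was skipped.
    """
    if "json" not in content_type.lower():
        try:
            json.loads(body)  # some servers mislabel JSON as text/plain
        except ValueError:
            return None
    # URLs are encoded and keep changing, so name files by a hash of the URL.
    name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".json"
    path = Path(out_dir) / name
    path.write_bytes(body)
    return path


# Made-up responses standing in for what a proxy would hand over.
out_dir = tempfile.mkdtemp()
saved = dump_if_json("http://example.invalid/a?x=1", "application/json", b'{"ok": true}', out_dir)
skipped = dump_if_json("http://example.invalid/logo.png", "image/png", b"\x89PNG", out_dir)
print(saved, skipped)
```

Hashing the URL keeps file names stable and filesystem-safe; keeping a separate index of URL to file name would help later when relating the dumped files to each other.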
>
> Let us know what you find out about the API,
>
> Best of luck,
> Raphael
>
> On 21.05.2016 11:43, Nikhil VJ wrote:
>
> > Hi friends,
> >
> > is there any way to extract data from such a flash player platform...as
> > follows...
> >
> > schoolgis.nic.in <http://schoolgis.nic.in/>
> >
> > --regards,
> > Nikhil VJ
> > Pune
> >
> > --
> > Datameet is a community of Data Science enthusiasts in India. Know more
> > about us by visiting http://datameet.org
> > ---
> > You received this message because you are subscribed to the Google
> > Groups "datameet" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to [email protected]
> > <mailto:[email protected]>.
> > For more options, visit https://groups.google.com/d/optout.
>
> --
> Dr Raphael Susewind | Associate, Contemporary South Asia Studies, Oxford
>          Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
>       Web & Twitter | https://www.raphael-susewind.de | @RaphaelSusewind
>              Impact | https://impactstory.org/raphael-susewind
>
> Please consider https://www.gnupg.org for encryption (key id 10AEE42F)
>
