Hi,
As Peter said, it looks difficult to scrape: there's a reCAPTCHA.
Regards,
Sanjay

On Tue, 10 Nov 2020 at 07:43 [email protected] <[email protected]> wrote:

> Hi Nikhil,
> Here's a pdf I put together a while ago which gives some indication of the
> deeply nested nature of the e-courts website (it's very similar to what
> Sanjay has already posted, but dives a bit deeper). If one opens a case in
> a new window, the original search is still available to return to, but
> otherwise, each case entails a fresh search.
> And yes, I did mean KrutiDev. I found a very useful site which converts to
> Unicode: https://www.fontconverter.in/hindi.php?q=Krutidev-to-Unicode
> best wishes,
> Peter
>
> On Monday, November 9, 2020 at 6:50:14 PM UTC+10:30 [email protected]
> wrote:
>
>> Hi Peter,
>>
>> Can you share a sample instruction (click this -> click that) or link on
>> how to reach a place on the website where we can see a listing under the
>> IPC code?
>>
>> <digressing>
>> About KritiDev - do you mean KrutiDev?
>>
>> There are converters available now to convert from legacy ASCII fonts
>> (where a custom font makes A's glyph look like one akshar, B's look like
>> another akshar, and so on) to Unicode (where different languages have
>> their own character codes and can co-exist).
>>
>> Searching online for "hindi to unicode converter" turns up various
>> websites, but there's also this open source collection of HTML files
>> containing JavaScript that I have worked with earlier:
>> https://sites.google.com/site/hindifontconverters/files. It has simple
>> web page files with JavaScript to do the conversions.
>>
>> A budget document I was working with 5 yrs back had its own version of
>> legacy font - I hacked into one javascript here, added in new mappings and
>> customised my own converter.
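
[Editorial sketch, not from the thread: the legacy-font conversion described above comes down to a lookup table from ASCII characters to Unicode characters, plus a reordering step for signs that a legacy font types before the consonant but Unicode stores after it. The table below is a small illustrative subset and has not been checked against a complete KrutiDev table. The names `LEGACY_TO_UNICODE`, `PRE_BASE_SIGNS`, `reorder_pre_base` and `legacy_to_unicode` are hypothetical. Conjuncts, half-forms and the reph are ignored, so this is a starting point under those assumptions rather than a working converter.]

```python
# Toy legacy-font-to-Unicode converter.
# ASSUMPTION: the mappings below are an illustrative subset only; check them
# against a real converter table before using them on actual documents.
LEGACY_TO_UNICODE = {
    "d": "\u0915",  # ka
    "e": "\u092e",  # ma
    "u": "\u0928",  # na
    "k": "\u093e",  # aa matra
    "f": "\u093f",  # short i matra (typed BEFORE the consonant in legacy text)
}

# Legacy characters that appear before their consonant on the page but must
# follow it in Unicode's logical order.
PRE_BASE_SIGNS = {"f"}


def reorder_pre_base(text):
    """Move each pre-base sign to just after the character that follows it."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in PRE_BASE_SIGNS:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair we just swapped
        else:
            i += 1
    return "".join(chars)


def legacy_to_unicode(text, table=LEGACY_TO_UNICODE):
    """Reorder pre-base signs, then substitute character by character.

    Characters missing from the table pass through unchanged.
    """
    reordered = reorder_pre_base(text)
    return "".join(table.get(ch, ch) for ch in reordered)


# Adding mappings for a document-specific legacy font is just a matter of
# extending the table (this extra entry is also illustrative).
custom_table = dict(LEGACY_TO_UNICODE, x="\u0917")  # ga
```

With this table, `legacy_to_unicode("dke")` gives the three code points for ka, aa matra and ma, and passing `custom_table` as the second argument picks up the extra mapping.
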
>>
>> Sorry to digress but just sharing in case the legacy font thing was being
>> a blocker to anyone. Also if someone wants to build a full solution out of
>> this that takes say word docs and converts to unicode without losing
>> formatting and can bring in some resources - let me know. I didn't have the
>> skills to programmatically work with office docs 5 yrs ago; I do now.
>>
>> And there was one surprise finding related to this: I've found that
>> legacy fonts survive the journey through pdfs better than unicode. So if an
>> institution insists on sharing documents as pdf, I'd rather have them stick
>> to their old legacy fonts and use one of these converter tools at my end to
>> get the text out into unicode.
>>
>> --
>> Cheers,
>> Nikhil VJ
>> https://nikhilvj.co.in
>>
>> On Mon, Nov 9, 2020 at 12:41 PM [email protected] <[email protected]>
>> wrote:
>>
>>> Hi Lovish,
>>> My experience (only for district courts in MP) is that scraping is *not*
>>> possible. It *is* possible to look for all cases in which a specific
>>> IPC offence is involved (e.g. 376(D) Gang rape). But to find out what
>>> happened in each case, you must go to each *seriatim*, check what
>>> decision was made by the court and--if you're lucky--access the judgement
>>> made in the case. In MP, those judgements are in Hindi, rendered in
>>> KritiDev.
>>> I've written a paper looking at some rape cases. Feel free to contact me
>>> directly.
>>> best wishes,
>>> Peter Mayer
>>> On Monday, November 9, 2020 at 1:35:09 PM UTC+10:30 Lovish Sharma wrote:
>>>
>>>> Hi,
>>>>
>>>> I am working as an associate for an NGO working in the field of crimes
>>>> against women. Currently I am doing research on crimes against women in
>>>> prominent cities. For that, I need to scrape the data from the e-courts
>>>> website, https://districts.ecourts.gov.in/ .
>>>>
>>>> Kindly help me with that.
>>>>
>>> --
>>>
>>> Datameet is a community of Data Science enthusiasts in India. Know more
>>> about us by visiting http://datameet.org
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "datameet" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>>
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/datameet/c9029026-4bf0-464b-ada8-6c4964911afen%40googlegroups.com
>>> <https://groups.google.com/d/msgid/datameet/c9029026-4bf0-464b-ada8-6c4964911afen%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>

