Is the text in your PDFs actual text, or is it a scanned image?

If text: http://tabula.technology/

If image: http://www.i2ocr.com/free-online-kannada-ocr

If image with handwritten text : interns :P
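
A rough way to answer the text-or-image question for a whole batch of files is sketched below. It is only a heuristic under stated assumptions: text-based PDFs normally declare at least one font, while pure scans usually contain only image objects. The function names `has_text_layer` and `suggest_route` are made up for illustration, and a PDF that stores its objects in compressed streams can hide its fonts from this check, so treat "image" as "probably image".

```python
import sys
from pathlib import Path


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the raw PDF bytes mention a font.

    Assumption: a page with selectable text references a /Font resource,
    while an image-only scan does not. Fonts stored inside compressed
    object streams are not visible here, so False is not conclusive.
    """
    return b"/Font" in pdf_bytes


def suggest_route(pdf_bytes: bytes) -> str:
    """Map the heuristic onto the two routes: table extraction or OCR."""
    return "text" if has_text_layer(pdf_bytes) else "image"


if __name__ == "__main__":
    # Usage: python check_pdfs.py file1.pdf file2.pdf ...
    for name in sys.argv[1:]:
        print(name, suggest_route(Path(name).read_bytes()))
```

Files reported as "text" are candidates for Tabula; the rest would need OCR first.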

--
Cheers,
Nikhil
+91-966-583-1250
Pune, India
Self-designed learner at Swaraj University <http://www.swarajuniversity.org>
http://nikhilsheth.blogspot.in

On Wed, Aug 19, 2015 at 9:40 AM, Shree D N <shre...@oorvani.in> wrote:

> We downloaded all of them manually, but we have been unable to merge them.
> The text is in Kannada and may be unclear even if we manage to extract it.
> In any case, we have manually merged some data from these 190+ files into
> the data we had already prepared.
> Digitizing everything from the affidavit filing stage onward would have
> helped us greatly, but sadly neither BBMP, the EC, nor the SEC has such a
> system.
>
> On 18 August 2015 at 22:03, Bhanu Kamapantula <talk2k...@gmail.com> wrote:
>
>> Hi Shree,
>>
>> You would want to write a script that scrapes the data from the
>> website. This can be automated in Python with the mechanize library (the
>> script needs to handle the doPostBack calls used on these webpages).
>>
>> Once downloaded, the PDFs can be combined using PDFtk (one of several
>> options). Xpdf might then be useful for retrieving text from the
>> combined PDF.
>>
>> best,
>> Bhanu
>>
>> On Tue, Aug 18, 2015 at 4:01 AM, Shree D N <shre...@oorvani.in> wrote:
>>
>>> This link has Form 7 (list of candidates) for all contesting candidates
>>> in the BBMP polls 2015. They are typically PDFs, scanned and uploaded.
>>> Language: Kannada
>>> http://117.247.176.82/
>>> Is there a way to download them all and merge them into one document or
>>> a table that represents the list of all candidates for all wards?
>>> We are trying to put this together because this consolidated list is not
>>> available anywhere as far as I know. Has anyone else seen it?
>>> --
>>> -------
>>> Cheers,
>>>
>>> *Shree | Associate Editor | *
>>> *Oorvani Foundation**Citizen Matters
>>> <http://bangalore.citizenmatters.in> - Bangalore's own online news magazine*
>>> Bangalore | Tel: +91-80-4173 7584 | Mobile: +91-95909 35559
>>> Follow us on Twitter <https://twitter.com/citizenmatters> | Follow us
>>> on Facebook <https://www.facebook.com/citizenmatters>
>>>
>>> --
>>> Datameet is a community of Data Science enthusiasts in India. Know more
>>> about us by visiting http://datameet.org
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "datameet" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to datameet+unsubscr...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> --
>> Bhanu
>>
>>
>
>
>
>

