On Tue, Feb 18, 2014 at 2:59 PM, Raphael Susewind <[email protected]> wrote:
> Hey everybody,
>
> I am working on PDF electoral rolls, but struggle with unicode
> conversion issues (a Crystal Reports bug in the version the ECI
> currently uses, at least in some states such as UP or Gujarat, which
> leads to a corrupted ToUnicodeCMap, which means you cannot properly copy
> and paste from the PDF, or otherwise extract proper UTF8). If your 'free
> the pdf event' finds a way around this, do let me know - likewise I
> shall send any progress from my side...
>

To generate the list of polling booths, I gave up on parsing the Kannada PDFs and used the polling booth names given in Kannada on the website instead. I transliterated the names using the unidecode Python library and then replaced some common words.

For example:
http://ge2014.anandology.com/KA/AC001

Anand
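A minimal sketch of the transliterate-then-replace step described above, not the actual script used for the site. It assumes the Unidecode package is installed (`pip install Unidecode`). The function name `romanize_booth_name` and the entries in `COMMON_WORDS` are hypothetical: the message does not say which words were replaced, so the table has to be built by inspecting what unidecode actually emits for the booth names.

```python
import re

# Hypothetical replacement table (illustrative entries only). Keys are
# lowercase romanized words as they might come out of unidecode; the
# real list of replaced words is not given in the message.
COMMON_WORDS = {
    "sarakari": "Government",
    "shale": "School",
}


def _unidecode(text):
    # Imported lazily so the word-replacement logic can be tried out
    # even where the third-party Unidecode package is not installed.
    from unidecode import unidecode
    return unidecode(text)


def romanize_booth_name(name, transliterate=_unidecode,
                        replacements=COMMON_WORDS):
    """Transliterate a booth name to ASCII, then swap known words."""
    ascii_name = transliterate(name)

    def swap(match):
        word = match.group(0)
        # Case-insensitive lookup; unknown words pass through unchanged.
        return replacements.get(word.lower(), word)

    # Only runs of ASCII letters are treated as words, so punctuation
    # and digits (booth numbers, commas) are left exactly as they are.
    return re.sub(r"[A-Za-z]+", swap, ascii_name)
```

Calling `romanize_booth_name(kannada_name)` on a name scraped from the website returns an ASCII string with the listed words swapped. unidecode works character by character and knows nothing about Kannada spelling conventions, so the output is only a rough romanization, which is why a hand-made table of common words is still useful.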
