Great find, Yalamber. I looked at the source; it is in JavaScript, and I think it 
should be easy to modify it to eliminate much of the manual correction. I can 
map व to 'wa' and ी (long i, dirgha ikar) to 'i', remove the '.' for ं 
(anusvara), and lowercase everything. The modified version is attached.
 
Please suggest what to use for 
ङ (currently ~N; would 'ng' do?) and ञ (currently ~n; 'yn' is near enough). 

With the case lowered and the '.' removed from the anusvara, न, ँ, ं and ण 
all map to 'n', which is okay, I guess. 
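
For anyone who wants to try the rules without the attachment, here is a rough 
sketch in Python (not the converter's own JavaScript, and not the attached 
code). It assumes the tool emits ITRANS-style text with '~N', '~n', '.n'/'.N', 
and 'ee', 'ii' or 'I' for the long i; the function name and the 'ng'/'yn' 
choices are only the proposals from this mail, not a standard.

```python
import re

# Ordered rewrite rules, applied to raw ITRANS-style text BEFORE lowercasing,
# because ITRANS uses case to tell letters apart (e.g. ~N vs ~n, I vs i).
RULES = [
    (r"~N", "ng"),       # velar nasal: 'ng' is a proposal, not a standard
    (r"~n", "yn"),       # palatal nasal: 'yn' is a proposal, not a standard
    (r"\.[nN]", "n"),    # anusvara / chandrabindu: drop the period (assumed output form)
    (r"ee|ii|I", "i"),   # long i written the same as short i
    (r"v", "w"),         # va written as wa
]


def simplify_itrans(text: str) -> str:
    """Apply the Nepali-specific simplifications, then lowercase everything."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    # Lowercasing last also folds the retroflex N into plain 'n'.
    return text.lower()


if __name__ == "__main__":
    for word in ["pa~nchakanyaa", "bhavaanI", "si.nha", "gaNesh"]:
        print(word, "->", simplify_itrans(word))
```

The rules run in order and lowercasing happens last, since lowercasing first 
would make '~N' and '~n' indistinguishable. Title-casing the names, as in the 
Dhanusa sheet, would be a separate step after this.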

On a different topic (maybe it would be better to start a separate thread 
for this): I have been working on transliterated input for Nepali 
(basically the reverse of this). Some usage examples are 
here<https://github.com/sapradhan/ne-rom-translit/wiki/Usage-examples>. I 
only know how to implement this for Linux, and an early implementation is 
here<http://nepalitankan.blogspot.com/>. 
Can you provide some feedback and suggestions on the usage patterns?

thanks 
santosh 

On Friday, November 15, 2013 9:24:41 PM UTC+5:45, ytamot wrote:
>
> And... there you have it for Dhanusa... Polling Center Eng is in 
> Roman(only ASCII chars) - using the conversion-to-ITRANS tool I mentioned 
> previously, plus manual editing a bit. I did not bother changing double 
> a(s) "aa" to single, as I find distinction between अ(a) and आ(aa) necessary 
> to make names unambiguous in many cases.
>
> Basically, manual-editing part consisted of replacing "ee" with "i", "vaa" 
> with "wa", sometimes end of the word consonants had अ(a) suppressed but 
> needed it, the tool added a period in words with anuswara - don't need it, 
> ITRANS uses capital letters for some consonants - lower cased it, and small 
> things like that. Finally, title cased the entire names. RegExp replace 
> could perhaps be scripted for many of these rules - but human eye still may 
> be needed.
>
> All in all, not too bad for a combination of programmatic conversion and 
> human editing which otherwise would have taken quite a lot of time doing it 
> manually.
>
> yālu
>
>
> On Fri, Nov 15, 2013 at 7:00 PM, Yalamber Tamot <[email protected]> wrote:
>
>> Hi Santosh,
>>
>> I agree IAST takes getting used to - it is too academic for the masses. 
>> While not entirely ideal, ITRANS scheme may be better. It turns out there 
>> is a utility to convert Devanagari into ITRANS - it runs on the browser and 
>> can be downloaded 
>> here<https://docs.google.com/uc?id=0B3QLKzA0EHYWYTg4MTExYWItM2JhZC00YzQyLTkyOTEtNjhkMWE3MjFiODYz&export=download&hl=en>.
>>  
>> With a little modification with rules specific to Nepali language, it could 
>> work wonders transliterating Devanagari back to Roman.
>>
>> yālu
>>
>>
>> On Thu, Nov 14, 2013 at 11:25 PM, sapradhan <[email protected]> wrote:
>>
>>> yalu,
>>> I found IAST a bit difficult to read; maybe it takes some time to get 
>>> used to. Besides, it uses characters not present on normal keyboards, so I 
>>> would not vouch for it. ITRANS should be easier for most of us to 
>>> understand and adopt. Perhaps there is something that translates Devanagari 
>>> to ITRANS too? 
>>>
>>> anjesh,
>>> i am waiting on your call on whether we are repeating the 
>>> conversion/scrubbing process again with PCS mapping 
>>> OR resume manually correcting the entries.
>>>
>>> I have compiled required mappings PCS to unicode, somebody should be 
>>> able to plug this into 2utf8 so that conversion is done correctly.
>>>
>>>
>>> On Thursday, November 14, 2013 10:34:33 PM UTC+5:45, ytamot wrote:
>>>
>>>> Hi all,
>>>>
>>>> Attempted Bhojpur and Dhanusa.
>>>>
>>>> Duplicate rows are identifiable as well.
>>>>
>>>> Polling Center Eng is a straightforward Devanagari to 
>>>> IAST<http://en.wikipedia.org/wiki/International_Alphabet_of_Sanskrit_Transliteration> conversion 
>>>> - not sure if that's going to work out for this. If you are 
>>>> wondering about the Devanagari to IAST conversion tool, 
>>>> this<http://devtransliteration.appspot.com/translit> one is pretty 
>>>> accurate. IAST is a scheme to romanize Devanagari, popular 
>>>> among Sanskrit academics worldwide.
>>>>
>>>> Please ignore the ward no.s in Bhojpur - I did my own scraping off pdf 
>>>> and, ward no.s seemed important.
>>>>
>>>> thanks,
>>>>
>>>> yālu
>>>>
>>>>
>>>> On Thu, Nov 14, 2013 at 9:26 AM, sapradhan <[email protected]> wrote:
>>>>
>>>>> anjesh, 
>>>>> it is turning out to be a lot more work than originally thought. 
>>>>> Most of the issues are due to PCS Nepali being used. I have created a 
>>>>> sed script to automate the PCS to Preeti conversion, which is attached. 
>>>>> This script is NOT foolproof and messes up any numerals if present 
>>>>> (e.g. in Kathmandu there are ward numbers in polling center names). 
>>>>> There are also conflicts between ञ and ङ (PCS has ङ at ~). 
>>>>> I did Baglung-Morang with this script; the results are better but still 
>>>>> need manual verification. Please have a look. 
>>>>>
>>>>> Few characters that need to be corrected manually are
>>>>> ज्ञ missing
>>>>> Bara/Salyan appearing in roman
>>>>> ह्ये as in गुह्येश्वरी
>>>>>
>>>>> thanks
>>>>>
>>>>>
>>>>> On Wednesday, November 13, 2013 11:16:42 PM UTC+5:45, anjesh wrote:
>>>>>
>>>>>> Santosh, 
>>>>>> That's correct. Those names must have been missed during the scraping 
>>>>>> process and our eyes have also missed those. The number and name of 
>>>>>> booths 
>>>>>> are also maintained in a database. Currently we are more concerned with 
>>>>>> the 
>>>>>> incorrect polling-center names only, we will fix those in the database 
>>>>>> (that's why the id is there in the first column) and share the corrected 
>>>>>> ones, along with the number of booths in opendatanepal.org. 
>>>>>>
>>>>>> Thanks
>>>>>> Anjesh.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 13 November 2013 23:05, sapradhan <[email protected]> wrote:
>>>>>>
>>>>>>> anjan, 
>>>>>>> I am assuming that you have maintained the number of booths in each 
>>>>>>> center somewhere.
>>>>>>>  
>>>>>>> There are some issues, such as: 
>>>>>>> in Ilam, 
>>>>>>> only फिक्कल in place of सडक कार्यालय फिक्कल, 
>>>>>>> only पञ्चकन्या in place of भवनी प्रा वि पञ्चकन्या; 
>>>>>>> similar cases in a few places in Panchthar too. 
>>>>>>> I have added the full names there. Can you verify that I am doing 
>>>>>>> this correctly? 
>>>>>>>
>>>>>>> thanks
>>>>>>>
>>>>>>> On Wednesday, November 13, 2013 10:35:28 PM UTC+5:45, anjesh wrote:
>>>>>>>
>>>>>>>> No we are not merging them. नेसुम क, ख, ग, घ are polling booths 
>>>>>>>> and नेसुम is polling center. Once we correct the polling center, 
>>>>>>>> booth names could be corrected easily. Booths are listed under center 
>>>>>>>> name 
>>>>>>>> e.g. http://election.opennepal.net:8000/#/constituency/11 like
>>>>>>>>
>>>>>>>> धूर्काे६ गा.वि.स. भवन, देउराली
>>>>>>>>
>>>>>>>>    - "धूर्काे६ गा.वि.स. भवन, देउराली(क)" 
>>>>>>>>    - "धूर्काे६ गा.वि.स. भवन, देउराली(ख)"
>>>>>>>>    - "धूर्काे६ गा.वि.स. भवन, देउराली(ग)"
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 13 November 2013 22:30, sapradhan <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> One point of confusion: are we merging क, ख, ग into one?
>>>>>>>>> E.g. in Taplejung 2 there are four: नेसुम क, ख, ग, घ. Are we merging 
>>>>>>>>> them? 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wednesday, November 13, 2013 10:17:09 PM UTC+5:45, anjesh wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Santosh. Write access is enabled now - missed that :)
>>>>>>>>>>
>>>>>>>>>> ttf-2-unicode looks useful. Perhaps someone from the community 
>>>>>>>>>> would like to peek into it. 
>>>>>>>>>>
>>>>>>>>>> Anjesh
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 13 November 2013 21:32, Santa Basnet <[email protected]>wrote:
>>>>>>>>>>
>>>>>>>>>>> This link could be useful for your data conversion.
>>>>>>>>>>>
>>>>>>>>>>> http://nepalinlp.blogspot.com/2010/09/few-years-back-ttf-to-unicode.html
>>>>>>>>>>>
>>>>>>>>>>> In,
>>>>>>>>>>> Santa Basnet
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Nov 13, 2013 at 9:05 PM, sapradhan 
>>>>>>>>>>> <[email protected]>wrote:
>>>>>>>>>>>
>>>>>>>>>>>> nice initiative, 
>>>>>>>>>>>>
>>>>>>>>>>>> I looked into the PDF, and it seems that two fonts are in 
>>>>>>>>>>>> use: Preeti and PCS Nepali. It turns out PCS Nepali has a 
>>>>>>>>>>>> different key mapping from Preeti, specifically in the numeric 
>>>>>>>>>>>> layer (i.e. ^=ट, &=ठ, *=ड and so forth), which is causing quite 
>>>>>>>>>>>> a few errors. If it is feasible to change the mapping based on 
>>>>>>>>>>>> the font being used, the conversion should be better. 
>>>>>>>>>>>>
>>>>>>>>>>>> If it would be quicker to do this manually, I can help. I don't 
>>>>>>>>>>>> have write access to the Google Docs; please do the needful.
>>>>>>>>>>>> thanks
>>>>>>>>>>>> santosh
>>>>>>>>>>>>
>>>>>>>>>>>> On Wednesday, November 13, 2013 6:19:09 PM UTC+5:45, prawesh 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you look at this constituency, 
>>>>>>>>>>>>> http://election.opennepal.net:8000/#/constituency/39, there are 
>>>>>>>>>>>>> lots of issues with unicode, which would not have occurred had 
>>>>>>>>>>>>> the data been available in a proper format. We scraped the data 
>>>>>>>>>>>>> (in Nepali Preeti font) from 
>>>>>>>>>>>>> http://www.election.gov.np/oldecn/NP/pollinglist/dist_const_list.html. 
>>>>>>>>>>>>> The task was not easy, 
>>>>>>>>>>>>> https://github.com/foss-np/2utf8 was used to convert Preeti 
>>>>>>>>>>>>> to Unicode. There could be problems in either conversion as well 
>>>>>>>>>>>>> as 
>>>>>>>>>>>>> scraping. We think that it might be quick to get help from the 
>>>>>>>>>>>>> community to 
>>>>>>>>>>>>> resolve these issues. For ease, we have maintained all the 
>>>>>>>>>>>>> scraped polling 
>>>>>>>>>>>>> centers district-wise in the google docs 
>>>>>>>>>>>>> here<https://docs.google.com/a/yipl.com.np/spreadsheet/ccc?key=0AhWLpToTogBwdDR4WWNuQkZMa0I0S0dXQjlCT3pISlE&usp=drive_web#gid=77>.
>>>>>>>>>>>>>  
>>>>>>>>>>>>> We plan to release these data in opendatanepal.org as well 
>>>>>>>>>>>>> but after resolving those issues, for which we seek your help. 
>>>>>>>>>>>>>
>>>>>>>>>>>>> District-file 
>>>>>>>>>>>>> sheet<https://docs.google.com/a/yipl.com.np/spreadsheet/ccc?key=0AhWLpToTogBwdDR4WWNuQkZMa0I0S0dXQjlCT3pISlE&usp=drive_web#gid=78>
>>>>>>>>>>>>>  lists 
>>>>>>>>>>>>> the districts and appropriate polling-list pdf file (from which 
>>>>>>>>>>>>> we 
>>>>>>>>>>>>> scraped). And the corresponding district page has all the polling 
>>>>>>>>>>>>> centers 
>>>>>>>>>>>>> (with issues, there might be repetitions as well). We have 
>>>>>>>>>>>>> created columns 
>>>>>>>>>>>>> for corrected center name and romanized center name. So you could 
>>>>>>>>>>>>> correct 
>>>>>>>>>>>>> the center names and add in case of missing centers. 
>>>>>>>>>>>>> Summary<https://docs.google.com/a/yipl.com.np/spreadsheet/ccc?key=0AhWLpToTogBwdDR4WWNuQkZMa0I0S0dXQjlCT3pISlE&usp=drive_web#gid=77>
>>>>>>>>>>>>>  shows 
>>>>>>>>>>>>> the list of districts with issues and corrected name - the google 
>>>>>>>>>>>>> script 
>>>>>>>>>>>>> will run and update the numbers there. For example, the Taplejung 
>>>>>>>>>>>>> district 
>>>>>>>>>>>>> page<https://docs.google.com/a/yipl.com.np/spreadsheet/ccc?key=0AhWLpToTogBwdDR4WWNuQkZMa0I0S0dXQjlCT3pISlE&usp=drive_web#gid=2> 
>>>>>>>>>>>>> seems to have 3-4 issues in names. Thank you so much for your help.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- 
>>>>>>>>>>>>> With Regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Prawesh Shrestha 
>>>>>>>>>>>>>
>>>>>>>>>>>>  -- 
>>>>>>>>>>>> -- 
>>>>>>>>>>>> FOSS Nepal mailing list: [email protected]
>>>>>>>>>>>> http://groups.google.com/group/foss-nepal
>>>>>>>>>>>> To unsubscribe, e-mail: [email protected]
>>>>>>>>>>>>
>>>>>>>>>>>>  
>>>>>>>>>>>> Mailing List Guidelines: http://wiki.fossnepal.org/index.php?title=Mailing_List_Guidelines
>>>>>>>>>>>> Community website: http://www.fossnepal.org/
>>>>>>>>>>>>  
>>>>>>>>>>>> --- 
>>>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>>>> Google Groups "FOSS Nepal" group.
>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from 
>>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>>>
>>>>>>>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -- 
>>>>>>>>>>> Santa B. Basnet
>>>>>>>>>>> Department of Computer Science & Engineering
>>>>>>>>>>> Nepal Engineering College
>>>>>>>>>>> Changunarayan, Bhaktapur, Nepal
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>  -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "opendatanepal" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to [email protected].
>>>>>
>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>
>>>>
>>>
>>
>>
>


<<< text/html; charset=UTF-16LE; name="Devanagari to iTrans Converter_02.htm": Unrecognized >>>
