RE: IranL10nInfo

2004-04-29 Thread C Bobroff

On Thu, 29 Apr 2004, Linguasoft wrote:

> It's very easy to type Tajik using a "Phonetic" (i.e., mnemonic) Cyrillic
> keyboard.

With which font though? I could only find hacked fonts.

-Connie
___
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing


RE: IranL10nInfo

2004-04-29 Thread Linguasoft
Dear Connie (et al),

It's very easy to type Tajik using a "Phonetic" (i.e., mnemonic) Cyrillic
keyboard. I wrote a Keyman keyboard driver for Kazakh that should include
all those Cyrillic fancy characters needed for Tajik. Want to try it?

Best regards,

Peter E. Hauer
Linguasoft



RE: IranL10nInfo

2004-04-29 Thread C Bobroff
On Thu, 29 Apr 2004, Behdad Esfahbod wrote:

> Perhaps we should add Tajik vs. Tajiki to the list of wars ;).

Good idea!
Merriam-Webster even has "Irani" as an English word in case you need more
suggestions for your list.

I'm sticking with the Oxford English Dictionary...

-Connie


RE: IranL10nInfo

2004-04-29 Thread Behdad Esfahbod
On Thu, 29 Apr 2004, C Bobroff wrote:

> On Thu, 29 Apr 2004, Roozbeh Pournader wrote:
>
> > For example, Tajiki is written in the Cyrillic alphabet instead of
> > Arabic. ;)
>
> [1] The English word is Tajik (and sometimes Tadzhik) but not Tajiki.  (I
> also only found this out recently!)

I guess Tajik is more correct.  Tajik is what Merriam-Webster
lists at m-w.com, although in their Indo-European languages
chart they have named it Tajiki:

http://m-w.com/mw/table/indoeuro.htm

Perhaps we should add Tajik vs. Tajiki to the list of wars ;).

> -Connie

--behdad
  behdad.org


Re: Days of the Week abbreviated

2004-04-29 Thread Behdad Esfahbod

The main problems with the fonts right now are:

  * They lack line height data.
  * The size does not match that of the MS fonts (and so does not
match the Latin fonts either).
  * A few of fonts have bitmaps added.  Those bitmaps should be
removed.
  * There's a known problem on LCDs, but that's another story
perhaps.

So, as you can see, the third item is trivial to fix, the fourth
is not that important, and the first two are easy to fix.  After
that we can talk about mark positioning and other fancy
characteristics.
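
To illustrate why the bitmap item counts as trivial: a minimal sketch,
assuming the third-party fontTools library is installed and that the
bitmaps live in the standard embedded-bitmap tables (EBDT, EBLC, EBSC).
The function names and file paths are hypothetical, and this is not
necessarily how the FarsiWeb fonts were actually fixed.

```python
# Embedded-bitmap tables defined by the TrueType/OpenType specifications.
BITMAP_TABLES = ("EBDT", "EBLC", "EBSC")


def strip_bitmap_tables(font):
    """Delete embedded-bitmap tables from a mapping keyed by table tag.

    Accepts a fontTools TTFont or any dict-like object and returns the
    list of tags that were removed.
    """
    removed = [tag for tag in BITMAP_TABLES if tag in font]
    for tag in removed:
        del font[tag]
    return removed


def strip_bitmaps_from_file(src_path, dst_path):
    """Load a font with fontTools, drop its bitmap tables, save a copy."""
    from fontTools.ttLib import TTFont  # third-party: pip install fonttools

    font = TTFont(src_path)
    removed = strip_bitmap_tables(font)
    font.save(dst_path)
    return removed
```

Usage would be along the lines of
strip_bitmaps_from_file("in.ttf", "out.ttf"); the outlines themselves are
left untouched.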

behdad

PS, Behnam:  So this was the list of bugs in the fonts you asked
me to list.  Waiting for the fix.

On Thu, 29 Apr 2004, C Bobroff wrote:

> On Thu, 29 Apr 2004, Roozbeh Pournader wrote:
>
> > Not much has happened with the fonts since last year (1382), and the
> > latest version is 0.4. BTW, we need volunteers for tracking bugs in the
> > fonts.
>
> Sorry to hear that. Can you release the latest if there have been any
> improvements?  Maybe I could post them on my website and say the "price"
> is one bug report per download!
> On the other hand, one does not like to distribute a lot of beta
> fonts into the system which could result in chaos. That's why I usually
> just send people to Borna still.
> -Connie

--behdad
  behdad.org


RE: IranL10nInfo

2004-04-29 Thread Omid K. Rad


Hello everybody, especially my friends at FarsiWeb,

I'm trying to point out some things here (even though you might already
know) about .NET and our project.

For your information:

The .NET Common Language Infrastructure (CLI) and the C# programming
language were submitted to ECMA and ISO/IEC International
standardization organizations a couple of years ago. The submissions
were ratified as standards after thorough investigations as:

Standard ECMA-334 (C#)
http://www.ecma-international.org/publications/standards/Ecma-334.htm

Standard ECMA-335 (CLI)
http://www.ecma-international.org/publications/standards/Ecma-335.htm

Standard ISO/IEC 23270 (C#)
http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=36768

Standard ISO/IEC 23271 (CLI)
http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=36769

This has given rise to many new open source efforts around .NET in the
ICT community, among them three major third-party projects that intend
to implement versions of the .NET Framework conforming to the base
implementations that Microsoft has released or has under way. Those
are:

Ximian's Mono Project, sponsored by Novell
http://www.go-mono.com

Free Software Foundation's Portable .NET
http://www.dotgnu.org/pnet.html

Microsoft's Rotor (SSCLI), ported to FreeBSD with Corel's help
http://msdn.microsoft.com/net/sscli


Mono and Portable.NET are published under free software licenses, while
Rotor comes under Microsoft's noncommercial shared-source license. This
means we will have .NET applications running on a vast number of
platforms quite soon, to name a handful: Linux, Windows, Solaris,
FreeBSD, HP-UX, and Mac OS X. We also have more than 20 programming
languages to choose from: APL, COBOL, Component Pascal, Eiffel, Fortran,
Haskell, JScript .NET, Mercury, Oberon, Pascal, Perl, Python, Smalltalk,
Visual Basic .NET, C#, Managed C++, etc.

To make applications more interoperable between different platforms, all
of the CLI implementations aim to implement the fundamental namespaces
of the .NET Framework Class Library so that they closely mirror what
Microsoft releases. These don't include namespaces such as Microsoft.*,
but do include those referred to as pure .NET namespaces, of which the
System.Globalization namespace is one.

The System.Globalization is also available in .NET Compact Framework - a
lighter version of the framework that installs on handheld devices.

In the "Iran Localization Info for Microsoft .NET" project (IranL10nInfo
for short) we have selected to work only on those parts of .NET that are
in the System.Globalization namespace (pure .NET). Any changes that
Microsoft makes to them are indirectly ported to every non-Microsoft
implementation of the Class Library.

Moreover, this project will automatically produce a good layout of
information fields that we can simply use for other languages like Tajik
and Afghan.
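
As a rough illustration of what a reusable "layout of information
fields" could look like, here is a minimal Python sketch. The class
name, the field names, and the helper are hypothetical and are not the
project's actual layout; the point is only that one record type can be
filled in for Iran first and then reused for another locale, with
fields left empty until someone supplies the data.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class LocaleInfo:
    """Hypothetical record of locale fields; None means 'not collected yet'."""
    language: str                                 # ISO 639 language code
    region: str                                   # ISO 3166 region code
    script: str                                   # ISO 15924 script code
    first_day_of_week: Optional[str] = None
    day_names: Optional[Tuple[str, ...]] = None   # starting from Saturday


FA_IR = LocaleInfo(
    language="fa",
    region="IR",
    script="Arab",
    first_day_of_week="Saturday",
    day_names=("شنبه", "یکشنبه", "دوشنبه", "سه‌شنبه",
               "چهارشنبه", "پنجشنبه", "جمعه"),
)

# Same layout for Tajik: Cyrillic script, remaining fields still to be filled.
TG_TJ = replace(FA_IR, language="tg", region="TJ", script="Cyrl",
                first_day_of_week=None, day_names=None)


def missing_fields(info):
    """Return the names of fields that still need data for this locale."""
    return [f.name for f in fields(info) if getattr(info, f.name) is None]


print(missing_fields(TG_TJ))
```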


So, we are trying to resolve some locale issues far beyond Microsoft - a
big name.



All the best,
Omid
__
  Iran Localization Info for Microsoft .NET
http://www.idevcenter.com/projects/iranl10ninfo/draft/


Other Open Source developments over ECMA CLI:

Intel Labs' OCL (Open CLI Library)
http://sourceforge.net/projects/ocl/

Platform.NET
http://sourceforge.net/projects/platformdotnet/



Articles:

Linux World - Bringing the CLI to Open Source (Article)
http://www.linuxworld.com/story/39216.htm?DE=1

Devx - Peeking under the Lid of Open Source .NET CLI Implementations
http://www.devx.com/devx/article/9725



Microsoft Open Source:

MSDN - ECMA Standardization
http://msdn.microsoft.com/net/ecma/

MSDN - The Common Language Infrastructure (CLI)
http://msdn.microsoft.com/netframework/using/understanding/cli

Microsoft Shared Source Home Page:
http://www.microsoft.com/sharedsource/




RE: IranL10nInfo

2004-04-29 Thread Omid K. Rad
Dear Behdad, Roozbeh, Connie,
 
Thanks for your replies and explanations. First of all, I'm sorry if you found my
last post antagonistic in any way. I'm not expecting anything more from FarsiWeb than
what they are doing (nor do I see myself in a position to).

All I wanted to say is: don't avoid something just because you guess it might not be
to your taste (or that may simply be my impression). I am not stressing work on the
Microsoft platform at all. I am specifically referring to .NET as a technology which
is a world standard right now, and we are noticing mistakes in it pertaining to
Persian and Iran.

*Please go on to my next post for my explanations.*


And thank you very much for the locale info you provided to us. I'm sure I could never 
find people anywhere else as useful for our work as those I'm finding here. I hope we 
can do a good job with your help.

Regards,
Omid




RE: Farsi Stemming Algorithm

2004-04-29 Thread Jon D.
--- Ehsan Akhgari <[EMAIL PROTECTED]> wrote:

> I downloaded this package, and looked into it.  It seems to be
> useful for my job.  However, this is the first time I'm hearing
> of PC-Kimmo, so I was kind of lost when trying to figure out the
> whole thing.  I was wondering if you can provide me with some
> additional info (or URLs; didn't find any myself) about this
> software,

It's a two-level morphology engine, so basically it
resolves a surface form to a lexical form, or lexical
to surface form.  For example, if I give it a
newspaper word like 'nmiAim' (نميايم -- I am
not coming), it will resolve to 'n+mi+A+m', taking
into account any morpheme boundary changes (like the
yeh here).  More documentation is found here [1].
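
To make the surface/lexical idea concrete, here is a toy Python sketch.
It is not PC-Kimmo's rule format or API, and the tiny lexicon and the
single glide rule are assumptions made up for this one example. Real
two-level morphology runs finite-state rules in parallel; this sketch
simply generates every lexical candidate and keeps those whose surface
form matches.

```python
from itertools import product

# Toy lexicon in the message's transliteration (illustrative only).
NEGATION = ["", "n"]        # optional negative prefix
ASPECT = ["", "mi"]         # optional imperfective prefix
STEMS = ["A"]               # present stem of "to come"
ENDINGS = ["", "m"]         # optional first-person singular ending
GLIDE = "i"                 # the yeh that appears at the stem/ending boundary


def generate(lexical):
    """Lexical -> surface: drop '+' and insert the glide after a stem-final 'A'."""
    morphemes = lexical.split("+")
    surface = []
    for position, morpheme in enumerate(morphemes):
        surface.append(morpheme)
        is_last = position == len(morphemes) - 1
        if morpheme in STEMS and morpheme.endswith("A") and not is_last:
            surface.append(GLIDE)
    return "".join(surface)


def analyze(surface):
    """Surface -> lexical: keep every lexicon combination that generates `surface`."""
    analyses = []
    for parts in product(NEGATION, ASPECT, STEMS, ENDINGS):
        lexical = "+".join(p for p in parts if p)
        if generate(lexical) == surface:
            analyses.append(lexical)
    return analyses


print(analyze("nmiAim"))   # ['n+mi+A+m']
```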

> especially how can it be used
> on Linux in batch mode.
> Does PC-Kimmo come with any callable C interface?

One of the things that drives me nuts about the
software is that it claims to run on Solaris/Sparc,
Win/x86, MacOS, or BSD, but apparently not Linux (I
have a Sparc box, so I'm lucky :-).  The source code
is downloadable, but it currently doesn't seem to
compile on Linux/x86.  It does have a callable C
interface, as documented in the kimmolib.txt in this
file [2].  In fact, I'm working on an AI program that
calls PC-Kimmo to do morphology.  Batch mode is used
via the 'take' command, and using a .tak file.

Don't be too disappointed about version 0.5 of the
Persian implementation -- it was released 2 years ago
;-)  I've reworked almost every aspect of it since
then, so hopefully it will work better.
Have fun.

-Jon D.


[1] http://www.sil.org/pckimmo/
[2] ftp://ftp.sil.org/software/unix/pc-parse-doc-20030321.tgz






Re: BBC Persian on Internet and the Persian Language

2004-04-29 Thread C Bobroff

On Thu, 29 Apr 2004, Roozbeh Pournader wrote:

> http://www.bbc.co.uk/persian/interactivity/debate/story/2004/04/040428_mf_bt_weblanguage.shtml

Can you give an example of "haa-ye havvaz instead of kasra." I can't
think how that situation could come up although I'm sure it's obvious.

-Connie


RE: IranL10nInfo

2004-04-29 Thread C Bobroff
On Thu, 29 Apr 2004, Roozbeh Pournader wrote:

> For example, Tajiki is written in the Cyrillic alphabet instead of
> Arabic. ;)

Yeah, well, since I found out you can't actually type it unless you
buy those stand-alone programs (without the source code!), I'm
going to cite the Tajik [1] example every time people suggest Persian
script should be "reformed" and written in Latin chars because it's fewer
headaches. How easy or hard it is to implement seems to depend only on how
much interest there is, not on technical hurdles.
(Yes, I just read the BBC article Roozbeh mentioned!)

[1] The English word is Tajik (and sometimes Tadzhik) but not Tajiki.  (I
also only found this out recently!)

-Connie


Re: PersianComputing Digest, Vol 11, Issue 15

2004-04-29 Thread C Bobroff

On Thu, 29 Apr 2004, Masoud Sharbiani wrote:

> Not only that, but also you are screwing my mailer's threaded mail reading.
> Please don't do that.

I'm sure it was merely a subconscious attempt to seek out the
perfect abbreviation for Dushanbe, *Monday* Bazaar and capital of
Tajikistan :)

-Connie


Re: Days of the Week abbreviated

2004-04-29 Thread C Bobroff
On Thu, 29 Apr 2004, Roozbeh Pournader wrote:

> Not much has happened with the fonts since last year (1382), and the
> latest version is 0.4. BTW, we need volunteers for tracking bugs in the
> fonts.

Sorry to hear that. Can you release the latest if there have been any
improvements?  Maybe I could post them on my website and say the "price"
is one bug report per download!
On the other hand, one does not like to distribute a lot of beta
fonts into the system which could result in chaos. That's why I usually
just send people to Borna still.
-Connie


Abbreviations et al.

2004-04-29 Thread Roozbeh Pournader
Nice examples of abbreviations/shorthands/whatever:

* The first page of Mosahab Persian Encyclopedia (first published in
1345/1966), about the abbreviations used in the encyclopedia, showing
different methods of Persian abbreviation (127 KiB):

http://www.farsiweb.info/misc/mosahab-abbr.png

* A month table from a "sar-resid-naame" (I don't know the English term)
published in Iran in 1383/2004, showing the one-letter day headings (37
KiB):

http://www.farsiweb.info/misc/calendar-abbr.png

roozbeh




Re: PersianComputing Digest, Vol 11, Issue 15

2004-04-29 Thread Masoud Sharbiani
On Wed, Apr 28, 2004 at 07:00:31AM -0700, C Bobroff wrote:
> The problem here is that you're receiving the Daily Digest form of the
> list so you're mixing and matching two different topics.  Possibly
> three with the Outlook question that also crept in.

Not only that, but also you are screwing my mailer's threaded mail reading.
Please don't do that.
Masoud



Re: Days of the Week abbreviated

2004-04-29 Thread Roozbeh Pournader
On Wed, 2004-04-28 at 09:06, C Bobroff wrote:
> OK, but kindly don't involve Roozbeh in any  flamefests until AFTER he's
> done with the fonts.

Not much has happened with the fonts since last year (1382), and the
latest version is 0.4. BTW, we need volunteers for tracking bugs in the
fonts.

As for me, I've been busy with the Academy stuff, specifications for
Persian locale information and collation, and committee work for the
FarsiLinux Technical Committee.

roozbeh




BBC Persian on Internet and the Persian Language

2004-04-29 Thread Roozbeh Pournader
There is a debate story by BBC Persian on the Internet and the Persian
Language here:

http://www.bbc.co.uk/persian/interactivity/debate/story/2004/04/040428_mf_bt_weblanguage.shtml

roozbeh




RE: IranL10nInfo

2004-04-29 Thread Roozbeh Pournader
On Wed, 2004-04-28 at 20:05, C Bobroff wrote:
> > About your suggestion, however, we (i.e. our team) have no idea about
> > Afghan and Tajik languages.
> It's all one language, different conventions.

For example, Tajiki is written in the Cyrillic alphabet instead of
Arabic. ;)

roozbeh




FarsiWeb and its mission (was RE: IranL10nInfo)

2004-04-29 Thread Roozbeh Pournader
On Wed, 2004-04-28 at 11:40, Omid K. Rad wrote:
> I was rather disappointed when I was told that FarsiWeb is
> not interested in Microsoft .NET technology at all. Even though I value
> all the great achievements that FarsiWeb has found, I personally believe
> that resolving Persian computing issues should not be selective,
> especially for a group that has nationally accepted this mission

The FarsiWeb Project is a research project funded by Sharif FarsiWeb,
Inc (a private company) and a few sponsors [1], with a very, very limited
budget and personnel. Why is it that you think it should resolve *all*
Persian computing issues?

Individual members of FarsiWeb also represent the High Council of
Informatics of Iran in the Unicode Consortium and are active in a few
other national and international organizations. But the group has
never been assigned any responsibility apart from its specific, limited
contracts. In other words, we have not nationally accepted any mission,
and we do not even get any funds from the Iranian government for
continuing to represent them in the Unicode Consortium and
ISO/IEC JTC1/SC2.

That aside, we would love to contribute to proper implementation of
Persian and Iranian requirements in any piece of software, which is the
reason we are active on the PersianComputing mailing list. We have
already shared an internal document with Omid on what we consider
requirements of Iran's Persian, and we will try to review his final
document and provide comments to him. He has suggested that we even
support his final proposal, which we may decide to do in the end. But we
have lots of other work to do, and we can't take responsibility for
everything, especially any software that doesn't come with source code.

Roozbeh Pournader
Technical Manager of the FarsiWeb Project
President of Sharif FarsiWeb, Inc.

[1] Current sponsors are Sharif University of Technology and Cyber7 Inc.
Previous sponsors included Science and Arts Foundation, and High Council
of Informatics of Iran. FarsiWeb welcomes other sponsors or contractors.




RE: Farsi Stemming Algorithm

2004-04-29 Thread Ehsan Akhgari
Thanks a lot, Jon, for your reply.

> The only one that I'm aware of is found here [1].  But it
> seems hard to get any other information about this stemmer.

Yes, it definitely seems so.  The only Farsi stemmer I've been aware of
myself is http://www.isri.unlv.edu/publications/isripub/Taghva2003-02.pdf .
I had contacted Dr. Taghva some time ago about his stemmer, but didn't hear
back from him at all.

> While the aim is a little different from a stemmer, a Persian
> morphological engine is being developed.  The one available
> for download [2] is a couple versions behind current
> development, but it still yields decent results.  Version 0.5
> is public domain, and newer versions will be under the
> General Public License.  A new version will be released in a
> couple of months.

I downloaded this package, and looked into it.  It seems to be useful for my
job.  However, this is the first time I'm hearing of PC-Kimmo, so I was kind
of lost when trying to figure out the whole thing.  I was wondering if you
can provide me with some additional info (or URLs; didn't find any myself)
about this software, especially how it can be used on Linux in batch mode.
Does PC-Kimmo come with any callable C interface?

Thanks a lot!
-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]


