Re: [PHP] Japanese character validation

2003-11-08 Thread - Edwin -
I know this is becoming off-topic but just for the curious...

On Fri, 7 Nov 2003 13:43:06 -0600
Eugene Lee [EMAIL PROTECTED] wrote:

 On Sat, Nov 08, 2003 at 01:35:40AM +0900, - Edwin - wrote:
 : 
 : On 2003.11.7, at 18:37 Asia/Tokyo, Marek Kilimajer wrote:
 : 
 : ...[snip]...
 : 
 : Are Kanji and Kana character sets?
 : 
 :   Kan - Chinese + ji - character
 : 
 :   kana:  (quoted from the American Heritage Dictionary)
 : 1. Japanese syllabic writing. The characters are
 : simplified kanji
 
 Actually, kana are not simplified kanji because it is not
 the case that kana can replace kanji while preserving the
 exact same meaning. In fact, most kana by themselves have
 no meaning.

Well, I'm sure there's a very good reason why the dictionary
I quoted called it simplified kanji. In fact, there's a
very good reason why many, if not all, of the books that talk
about the subject describe it the same way.

Japanese didn't have a native system of writing, so they
borrowed Chinese characters. Those Chinese characters
were used *phonetically* and the meanings were ignored. In
other words, one can say that, during that time, /even/
kanji did NOT have any meaning for the Japanese (person)
since the characters were just used phonetically.

Since each Japanese word had to employ several Chinese
characters, which required a large number of strokes, they
decided to simplify this bothersome process by using a
cursive, simplified style of kanji.

Then, (just to make the story short) during the Heian period
(794-1185), the simplified characters underwent a further
simplification. Thus, hiragana (and a little later,
katakana) were born.

Actually, just by observing how the kana are written,
you'll notice that:

  * The hiragana な and katakana ナ for "na"
    came from the kanji 奈, the "na" in Nara
    (Nara Prefecture).
  * The hiragana ゆ and katakana ユ for "yu"
    came from the kanji 由, which means
    reason, cause, etc.
  * Every hiragana and katakana has a
    corresponding kanji from which
    it is derived.

Now, back to the future...

- E -

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Japanese character validation

2003-11-08 Thread - Edwin -
On Fri, 7 Nov 2003 13:36:35 -0600
Eugene Lee [EMAIL PROTECTED] wrote:

 On Sat, Nov 08, 2003 at 02:20:00AM +0900, - Edwin - wrote:
 : 
 : Besides, there are some issues (for example with Shift_JIS) that
 : bothers (with no easy solution) even members of the Japanese PHP
 : Group ML.  (Like the recent thread [PHP-users 18803] on
 : http://www.php.gr.jp/ or
 : http://ns1.php.gr.jp/mailman/listinfo/php-users mentioned about the
 : SJIS trouble.)
 
 Force the end-user not to use Shift-JIS.  

Um, you don't have to do that since YOU as the programmer
decide what to use.

 It's a brain-dead
 format used only for internal processing purposes and not
 meant as a for-the-public encoding method.  Stick with
 something nice like normal JIS or Unicode.

Brain-dead format compared to JIS? Hehe, maybe you're
confused ;) Besides, I guess, more than half of Japanese
sites are written in shift_jis.

Result of a quick Google search:

  http://www.io.com/~kazushi/encoding/

Anyway, the easiest way (I find for now) when working with
PHP and Databases (MySQL, etc.) is to use euc-jp. There are
times though that you are forced to use shift_jis e.g.
when working with sites for i-mode's browsers. If that's the
case, just use the mb_* functions to convert from euc-jp to
shift_jis...
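For the curious, here is a rough sketch of that conversion step. It assumes a current PHP build with mbstring enabled (type declarations like these did not exist in PHP4), and to_sjis() is just a hypothetical helper name for illustration, not something that ships with PHP.

```php
<?php
// Sketch only: convert text held as EUC-JP into Shift_JIS for output
// (e.g. to a client that expects Shift_JIS).
// to_sjis() is a hypothetical helper, not part of PHP or mbstring.
function to_sjis(string $eucjp): string
{
    // mb_convert_encoding(string, to_encoding, from_encoding)
    return mb_convert_encoding($eucjp, 'SJIS', 'EUC-JP');
}

// Build an EUC-JP sample from a UTF-8 literal so this file can stay UTF-8.
$eucjp = mb_convert_encoding('日本語', 'EUC-JP', 'UTF-8');
$sjis  = to_sjis($eucjp);

// Show the raw bytes of each encoding as hex.
echo bin2hex($eucjp), "\n";
echo bin2hex($sjis), "\n";
```

Keeping one internal encoding and converting only at the output boundary means the rest of the script never has to care which encoding the client wants.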

- E -



Re: [PHP] Japanese character validation

2003-11-08 Thread Eugene Lee
On Sat, Nov 08, 2003 at 06:26:39PM +0900, - Edwin - wrote:
: 
: On Fri, 7 Nov 2003 13:43:06 -0600 Eugene wrote:
:  
:  Actually, kana are not simplified kanji because it is not
:  the case that kana can replace kanji while preserving the
:  exact same meaning. In fact, most kana by themselves have
:  no meaning.
: 
: Well, I'm sure there's a very good reason why the dictionary
: I quoted called it simplified kanji.

I disagree with the term simplified kanji.  The kana may have been
derived from kanji and evolved over the centuries, but they are no
longer kanji in the sense that they carry any intrinsic meaning by
themselves.  Nor can they replace kanji in meaning and function.  They
are just phonetic alphabets.




RE: [PHP] Japanese character validation

2003-11-08 Thread Dave G

In hopes of bringing the kanji character validation issue back
on topic, can I point out that it doesn't matter *why* someone would
want to do this, or what the origins of kanji and kana are? The
motivations of the original poster shouldn't be in question. Everyone
has their own situations and goals, and what's not important to one
person is important to others. Either what they are asking for is
possible or not, and if it is, it would be enlightening to know how.
I for one am also very interested in hearing possible solutions.
I can think of multiple situations in which checking to see whether a
user inputted kanji or kana would be very useful indeed. And I hope to
learn more by further discussion of the PHP coding required. It would be
a shame if that potential learning was obscured or lost in off topic
theorizing about the origins of the Japanese language.

Optimistically looking forward to seeing more technical discussion on
how to accomplish this.

-- 
Cheers!
Dave G
[EMAIL PROTECTED]




Re: [PHP] Japanese character validation

2003-11-08 Thread - Edwin -
On 2003.11.8, at 20:32 Asia/Tokyo, Eugene Lee wrote:

On Sat, Nov 08, 2003 at 06:26:39PM +0900, - Edwin - wrote:
:
: On Fri, 7 Nov 2003 13:43:06 -0600 Eugene wrote:
: 
:  Actually, kana are not simplified kanji because it is not
:  the case that kana can replace kanji while preserving the
:  exact same meaning. In fact, most kana by themselves have
:  no meaning.
:
: Well, I'm sure there's a very good reason why the dictionary
: I quoted called it simplified kanji.
I disagree with the term simplified kanji.
  regex - regular expressions

  Um, what's so regular about it again?

  The kana may have been
derived from kanji and evolved over the centuries, but they are no
longer kanji in the sense that they carry any intrinsic meaning by
themselves.
?? Who said that they are kanji?

Kanji are Chinese characters whereas kana are Japanese characters.

  Nor can they replace kanji in meaning and function.  They
are just phonetic alphabets.
Did I say otherwise?

- E -

PS
Maybe, you can complain here:
  http://dictionary.reference.com/search?q=kana



Re: [PHP] Japanese character validation

2003-11-08 Thread - Edwin -
On 2003.11.8, at 21:51 Asia/Tokyo, Dave G wrote:

In hopes of bringing the kanji character validation issue back
on topic, can I point out that it doesn't matter *why* someone would
want to do this, or what the origins of kanji and kana are? The
motivations of the original poster shouldn't be in question. Everyone
has their own situations and goals, and what's not important to one
person is important to others. Either what they are asking for is
possible or not, and if it is, it would be enlightening to know how.
I think it'd be scary if there's a certain doctor who'd give you a
medicine just because you said you have a headache. I'm sure a
good doctor would ask questions to diagnose what may be causing
it. He might even send you home without giving you any medicine.
Don't be surprised if your headache is gone the next morning.
I for one am also very interested in hearing possible solutions.
I can think of multiple situations in which checking to see whether a
user inputted kanji or kana would be very useful indeed.
Like for example?

 And I hope to
learn more by further discussion of the PHP coding required. It would be
a shame if that potential learning was obscured or lost in off topic
theorizing about the origins of the Japanese language.
First, I'm not theorizing. They're written in history books.

Optimistically looking forward to seeing more technical discussion on
how to accomplish this.
Secondly, didn't I mention that this was recently brought up in the
Japanese PHP Group ML? And that there's really NO easy solution for this?
  http://ns1.php.gr.jp/pipermail/php-users/2003-October/019236.html

In fact, that problem is already an FAQ in that ML--and other MLs as
well since the need is not limited to PHP programmers only.
- E -



Re: [PHP] Japanese character validation

2003-11-08 Thread Eugene Lee
On Sat, Nov 08, 2003 at 11:20:27PM +0900, - Edwin - wrote:
: 
: On 2003.11.8, at 20:32 Asia/Tokyo, Eugene Lee wrote:
: 
: On Sat, Nov 08, 2003 at 06:26:39PM +0900, - Edwin - wrote:
: :
: : Well, I'm sure there's a very good reason why the dictionary
: : I quoted called it simplified kanji.
: 
: I disagree with the term simplified kanji.
: 
[...]
: 
:   The kana may have been
: derived from kanji and evolved over the centuries, but they are no
: longer kanji in the sense that they carry any intrinsic meaning by
: themselves.
: 
: ?? Who said that they are kanji?

Edwin, you quoted the American Heritage Dictionary:

 kana:  (quoted from the American Heritage Dictionary)
 1. Japanese syllabic writing. The characters are simplified kanji

http://www.phparch.com/mailinglists/msg.php?a=729121s=japanese+characterp=g=

I am simply explaining that part of the dictionary definition is incorrect.
The statement above, The characters are simplified kanji, is equal to
the statement, kana are simplified kanji.  The ISA relationship between
kana and kanji is false and does not exist.

However, I do agree without question that kana evolved from kanji.




[PHP] Japanese character validation

2003-11-07 Thread umesh
Hi Gurus,

I am new to PHP. I am using PHP4 on Linux.

I have to accept input from the user and check whether the input consists of
Japanese characters only.
For example: if a name is entered, I need to check whether it is any of
Hiragana, Katakana or Kanji.

I have enabled multibyte support while compiling PHP.

As there is jcode.pl in Perl, I want to know if there is something similar in PHP.
Can anybody help me in this regard?

Thanking you in anticipation.

Regards,

Umesh.

Re: [PHP] Japanese character validation

2003-11-07 Thread - Edwin -
Hi,

On Fri, 7 Nov 2003 12:58:51 +0530
umesh [EMAIL PROTECTED] wrote:

 Hi Gurus,
 
 I am new to PHP. I am using PHP4 on Linux.
 
 I have accept input from the user and check if the input is
 japanese character only,
 for example : If name is accepted , I need to check if its
 any of the Hiragana, Katakana or Kanji.

Hmm... why would you like to do that? I've never really seen
the need for that.

I'm sure there will be a lot of issues if you want to write
one yourself...

 I have enabled multibyte support while compiling PHP.
 
 As there is jcode.pl in perl, I want to know if there is
 something in PHP. Can anybody help me in this regard.

That's a library for character code conversion.

PHP can do that as well:

  http://www.php.net/mbstring
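As a rough illustration of that kind of conversion: the sketch below guesses the encoding of some incoming bytes and normalizes them to EUC-JP. It assumes a current PHP with mbstring; normalize_to_eucjp() is a hypothetical name, and the candidate list is only an assumption (detection is a heuristic and can guess wrong on short or ambiguous input).

```php
<?php
// Sketch only: guess the encoding of incoming bytes, then convert to EUC-JP.
// normalize_to_eucjp() is a hypothetical helper, not part of mbstring.
function normalize_to_eucjp(string $input): ?string
{
    // Candidate encodings are an assumption; strict mode (third argument)
    // rejects candidates in which the bytes are not valid.
    $candidates = ['ASCII', 'JIS', 'UTF-8', 'EUC-JP', 'SJIS'];
    $from = mb_detect_encoding($input, $candidates, true);
    if ($from === false) {
        return null; // not valid in any of the candidate encodings
    }
    return mb_convert_encoding($input, 'EUC-JP', $from);
}

// This source file is UTF-8, so the literal below arrives as UTF-8 bytes.
$eucjp = normalize_to_eucjp('日本語');
echo bin2hex($eucjp), "\n";
```

If you already know which encoding the form uses, it is safer to skip detection and pass that encoding to mb_convert_encoding() directly.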

- E -



RE: [PHP] Japanese character validation

2003-11-07 Thread umesh
Hi Edwin,

-----Original Message-----
From: - Edwin - [mailto:[EMAIL PROTECTED]
Sent: Friday, November 07, 2003 2:02 PM
To: umesh
Cc: [EMAIL PROTECTED]
Subject: Re: [PHP] Japanese character validation

Hi,

On Fri, 7 Nov 2003 12:58:51 +0530
umesh [EMAIL PROTECTED] wrote:

 Hi Gurus,

 I am new to PHP. I am using PHP4 on Linux.

 I have accept input from the user and check if the input is
 japanese character only,
 for example : If name is accepted , I need to check if its
 any of the Hiragana, Katakana or Kanji.

Hmm... why would you like to do that? I've never really seen
the need for that.

Because there are fields called "Last & First Name (Kanji)" and "Last & First Name (Kana)" on
my forms. It's a requirement of the application.
It would be great if I could do that.

I'm sure there will be a lot of issues if you want to write
one yourself...
Actually, I have tried to do it, but it won't understand some characters. Like ①.

 I have enabled multibyte support while compiling PHP.

 As there is jcode.pl in perl, I want to know if there is
 something in PHP. Can anybody help me in this regard.

That's a library for character code conversion.

PHP can do that as well:

  http://www.php.net/mbstring

- E -




Re: [PHP] Japanese character validation

2003-11-07 Thread Marek Kilimajer
umesh wrote:
As there are fields called  Last  First Name (Kanji) and Last  First name (Kana) on 
my forms. Its the need of the application.
It would be great, If I could do that.
Are Kanji and Kana character sets? A form cannot use more than one charset. The charset
 that your script receives will be the same as the charset the page uses.

Marek



RE: [PHP] Japanese character validation

2003-11-07 Thread umesh
No, Kanji & Kana are not charsets.
The form uses the EUC-JP charset.
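
Since the form is EUC-JP, one possible first step on the server is to confirm that the submitted bytes really are valid EUC-JP before looking at which characters they contain. The sketch below assumes a current PHP with mbstring; read_eucjp_field() is a hypothetical helper, and converting to UTF-8 for further checks is only one option.

```php
<?php
// Sketch only: accept a raw form value that should be EUC-JP and
// return it as UTF-8, or null if the bytes are not valid EUC-JP.
// read_eucjp_field() is a hypothetical helper name.
function read_eucjp_field(string $raw): ?string
{
    if (!mb_check_encoding($raw, 'EUC-JP')) {
        return null; // malformed or differently encoded input
    }
    return mb_convert_encoding($raw, 'UTF-8', 'EUC-JP');
}

// Simulate a submitted EUC-JP value from a UTF-8 literal.
$submitted = mb_convert_encoding('山田', 'EUC-JP', 'UTF-8');
var_dump(read_eucjp_field($submitted) !== null);
```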

Umesh.

-----Original Message-----
From: Marek Kilimajer [mailto:[EMAIL PROTECTED]
Sent: Friday, November 07, 2003 3:08 PM
To: umesh
Cc: [EMAIL PROTECTED]
Subject: Re: [PHP] Japanese character validation


umesh wrote:
 As there are fields called  Last  First Name (Kanji) and Last  First
name (Kana) on my forms. Its the need of the application.
 It would be great, If I could do that.


Are Kanji and Kana character sets? A form cannot use more than one charset. The charset
  that your script receives will be the same as the charset the page uses.

Marek




Re: [PHP] Japanese character validation

2003-11-07 Thread Lew Mark-Andrews
Howdy,

 I have accept input from the user and check if the input is
 japanese character only,
 for example : If name is accepted , I need to check if its
 any of the Hiragana, Katakana or Kanji.

Hmm... why would you like to do that? I've never really seen
the need for that.

It's actually quite common for forms here in Japan (both online and paper)
where the user is requested to give the phonetic reading in Hiragana or
Katakana for their Kanji name. (Especially the first name where there can
be a lot of latitude on the reading even if the Kanji used has other more
conventional readings. Likewise for some uncommon place names.) Online, it
can also be handy where you want to disallow the use of and catch and
filter numbers or symbols that may also have double-byte equivalents, and
instead use only the Hankaku ASCII forms, though these can usually be
validated more directly by an A - Z, 0 - 9, etc. regular expression.
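
A minimal sketch of that Hankaku idea on the PHP side, assuming a current PHP with mbstring and input that is already UTF-8. Both function names are hypothetical. Instead of rejecting double-byte (Zenkaku) letters and digits outright, this version folds them to their ASCII forms first and then applies the plain regular expression.

```php
<?php
// Sketch only. Both helpers are hypothetical names, not PHP built-ins.

// Fold full-width (Zenkaku) alphanumerics and spaces to half-width ASCII.
// Mode 'a' = full-width alphanumerics to half-width,
// mode 's' = full-width space to half-width space.
function to_hankaku_alnum(string $utf8): string
{
    return mb_convert_kana($utf8, 'as', 'UTF-8');
}

// True only if the string is one or more ASCII letters or digits.
function is_ascii_alnum(string $s): bool
{
    return preg_match('/\A[A-Za-z0-9]+\z/', $s) === 1;
}

$input = 'ＡＢＣ１２３';                 // full-width letters and digits
var_dump(is_ascii_alnum($input));                   // before folding
var_dump(is_ascii_alnum(to_hankaku_alnum($input))); // after folding
```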

We currently do this to validate our form inputs on the client side using
Javascript functions before accepting the inputs serverside. When I have
time, I'll look into what's possible with the mb_ereg/mb_regex functions.
Umesh, if you'd like a copy of our Javascript functions, please contact me
offlist.
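
In the meantime, here is one way the server-side check could look. This is only a sketch under several assumptions: a current PHP whose PCRE has Unicode property support, input already converted to UTF-8, and hypothetical function names. It uses preg_match() with Unicode script properties rather than the mb_ereg functions. Note that \p{Han} matches any CJK ideograph, including ones not used in Japanese, and that the long vowel mark ー is added by hand because it is not classified under the Hiragana or Katakana scripts.

```php
<?php
// Sketch only: all three helpers are hypothetical names.
// Input is assumed to be UTF-8; the /u flag makes PCRE treat it as such.

// One or more hiragana (plus the long vowel mark).
function is_hiragana(string $s): bool
{
    return preg_match('/\A[\p{Hiragana}ー]+\z/u', $s) === 1;
}

// One or more katakana (plus the long vowel mark).
function is_katakana(string $s): bool
{
    return preg_match('/\A[\p{Katakana}ー]+\z/u', $s) === 1;
}

// One or more hiragana, katakana, or Han ideographs (kanji).
function is_japanese(string $s): bool
{
    return preg_match('/\A[\p{Hiragana}\p{Katakana}\p{Han}ー]+\z/u', $s) === 1;
}

var_dump(is_hiragana('やまだ'));
var_dump(is_katakana('ヤマダ'));
var_dump(is_japanese('山田太郎'));
```

A check like this says nothing about whether a name is real or correctly read; it only restricts which scripts a field may contain.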

Hope this helps,
Lew




Re: [PHP] Japanese character validation

2003-11-07 Thread - Edwin -
Hi,

On 2003.11.7, at 18:20 Asia/Tokyo, umesh wrote:

...[snip]...

 As there are fields called "Last & First Name (Kanji)" and "Last & First
 name (Kana)" on my forms. Its the need of the application.
 It would be great, If I could do that.

Let's say you can do that. What happens if my name doesn't have a
Kanji? You'll just reject it? (Since you're validating that a certain
field *should* just be Kanji.)

 Actually, I have tried to do it, but it wont understand some
 characters. Like ①.

Well, that's "one of the issues" I was trying to tell you...

- E -

Re: [PHP] Japanese character validation

2003-11-07 Thread - Edwin -
On 2003.11.7, at 18:37 Asia/Tokyo, Marek Kilimajer wrote:

...[snip]...

Are Kanji and Kana character sets?
  Kan - Chinese + ji - character

  kana:  (quoted from the American Heritage Dictionary)
1. Japanese syllabic writing. The characters are simplified kanji
and are usually used with kanji primarily to write inflections,
particles, and function words and to show the pronunciations
of some kanji and of all foreign words.
So, no, they're not the character sets *we* usually refer to
(i.e. euc-jp, shift_jis, etc.)
--

- E -



Re: [PHP] Japanese character validation

2003-11-07 Thread - Edwin -
Hi,

On 2003.11.8, at 00:42 Asia/Tokyo, Lew Mark-Andrews wrote:

Howdy,

I have accept input from the user and check if the input is
japanese character only,
for example : If name is accepted , I need to check if its
any of the Hiragana, Katakana or Kanji.
Hmm... why would you like to do that? I've never really seen
the need for that.
It's actually quite common for forms here in Japan (both online and paper)
where the user is requested to give the phonetic reading in Hiragana or
Katakana for their Kanji name. (Especially the first name where there can
be a lot of latitude on the reading even if the Kanji used has other more
conventional readings. Likewise for some uncommon place names.)
True. But...

 Online, it
can also be handy where you want to disallow the use of and catch and
filter numbers or symbols that may also have double-byte equivalents, and
instead use only the Hankaku ASCII forms, though these can usually be
validated more directly by an A - Z, 0 - 9, etc. regular expression.

Well, just like you said, Hankaku can easily be validated. But how and
WHY would you want to validate whether a certain field is Kanji, katakana, or
hiragana? And limit certain fields to each of those?

Think what would happen if you accept only Kanji in a certain field.
Here are some issues you'd face:

  1. Not all Japanese have Kanji in their names.
  2. Not all who can read Japanese have Japanese names. ;)

We currently do this to validate our form inputs on the client side using
Javascript functions before accepting the inputs serverside. When I have
time, I'll look into what's possible with the mb_ereg/mb_regex functions.
Umesh, if you'd like a copy of our Javascript functions, please contact me
offlist.

Hmm... interesting but whether you validate with Javascript and/or PHP,
you'll face the same problem I mentioned above.

Besides, there are some issues (for example with Shift_JIS) that bother
(with no easy solution) even members of the Japanese PHP Group ML.
(Like the recent thread [PHP-users 18803] on http://www.php.gr.jp/ or
  http://ns1.php.gr.jp/mailman/listinfo/php-users mentioned about the SJIS
  trouble.)

There was actually a similar topic on validation (in the Japanese ML) but
there is no easy solution, esp. for those platform-dependent (read: Windows)
characters. This was also mentioned by the OP (the problem with the
"numbers inside circles" characters).
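
To make the circled-number case concrete, here is a small sketch that only finds such characters so a script can decide what to do with them. It assumes a current PHP, input already converted to UTF-8, and covers just the Unicode circled numbers 1 to 20 (U+2460 to U+2473); other platform-dependent characters are not handled. find_circled_numbers() is a hypothetical name.

```php
<?php
// Sketch only: list the circled numbers found in a UTF-8 string.
// find_circled_numbers() is a hypothetical helper name.
function find_circled_numbers(string $utf8): array
{
    // U+2460..U+2473 are the circled numbers 1 to 20 in Unicode.
    preg_match_all('/[\x{2460}-\x{2473}]/u', $utf8, $matches);
    return $matches[0];
}

$found = find_circled_numbers('第①章と第②章');
echo count($found), "\n";
```

This does not solve the conversion problem itself; it just lets a form handler reject or replace these characters before they reach an encoding that cannot represent them consistently.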

Well, I'm satisfied (for now) with the mb_ functions...

- E -



Re: [PHP] Japanese character validation

2003-11-07 Thread Eugene Lee
On Sat, Nov 08, 2003 at 02:20:00AM +0900, - Edwin - wrote:
: 
: Besides, there are some issues (for example with Shift_JIS) that
: bothers (with no easy solution) even members of the Japanese PHP
: Group ML.  (Like the recent thread [PHP-users 18803] on
: http://www.php.gr.jp/ or
: http://ns1.php.gr.jp/mailman/listinfo/php-users mentioned about the 
: SJIS trouble.)

Force the end-user not to use Shift-JIS.  It's a brain-dead format used
only for internal processing purposes and not meant as a for-the-public
encoding method.  Stick with something nice like normal JIS or Unicode.




Re: [PHP] Japanese character validation

2003-11-07 Thread Eugene Lee
On Sat, Nov 08, 2003 at 01:35:40AM +0900, - Edwin - wrote:
: 
: On 2003.11.7, at 18:37 Asia/Tokyo, Marek Kilimajer wrote:
: 
: ...[snip]...
: 
: Are Kanji and Kana character sets?
: 
:   Kan - Chinese + ji - character
: 
:   kana:  (quoted from the American Heritage Dictionary)
: 1. Japanese syllabic writing. The characters are simplified kanji

Actually, kana are not simplified kanji because it is not the case
that kana can replace kanji while preserving the exact same meaning.
In fact, most kana by themselves have no meaning.
