Re: [PHP] Japanese character validation
I know this is becoming off-topic, but just for the curious...

On Fri, 7 Nov 2003 13:43:06 -0600 Eugene Lee [EMAIL PROTECTED] wrote:

> On Sat, Nov 08, 2003 at 01:35:40AM +0900, - Edwin - wrote:
> :
> : On 2003.11.7, at 18:37 Asia/Tokyo, Marek Kilimajer wrote:
> :
> : ...[snip]...
> :
> : > Are Kanji and Kana character sets?
> :
> : Kan - Chinese + ji - character
> :
> : kana: (quoted from the American Heritage Dictionary)
> : 1. Japanese syllabic writing. The characters are simplified kanji
>
> Actually, kana are not simplified kanji because it is not the case
> that kana can replace kanji while preserving the exact same meaning.
> In fact, most kana by themselves have no meaning.

Well, I'm sure there's a very good reason why the dictionary I quoted called it "simplified kanji". In fact, there's a very good reason why many, if not all, of the books that talk about the subject describe it the same way.

Japanese didn't have a native system of writing, so it borrowed Chinese characters. Those Chinese characters were used *phonetically* and their meanings were ignored. In other words, one can say that during that time /even/ kanji did NOT have any meaning for a Japanese person, since the characters were just used phonetically.

Since each Japanese word had to employ several Chinese characters, which required a large number of strokes, they decided to simplify this bothersome process by using a cursive, simplified style of kanji.

Then (just to make the story short), during the Heian period (794-1185), the simplified characters underwent a further simplification. Thus hiragana (and, a little later, katakana) was born.

Actually, just by observing how the kana are written, you'll notice that:

* The hiragana and katakana for "na" came from the kanji "na" in Nara (Nara Prefecture).
* The hiragana and katakana for "yu" came from the kanji "yu", which means reason, cause, etc.
* Every hiragana and katakana has a corresponding kanji from which it is derived.

Now, back to the future...

- E -

__
Do You Yahoo!?
Yahoo! BB is Broadband by Yahoo! http://bb.yahoo.co.jp/

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Japanese character validation
On Fri, 7 Nov 2003 13:36:35 -0600 Eugene Lee [EMAIL PROTECTED] wrote:

> On Sat, Nov 08, 2003 at 02:20:00AM +0900, - Edwin - wrote:
> :
> : Besides, there are some issues (for example with Shift_JIS) that
> : bother (with no easy solution) even members of the Japanese PHP
> : Group ML. (Like the recent thread [PHP-users 18803] on
> : http://www.php.gr.jp/ or
> : http://ns1.php.gr.jp/mailman/listinfo/php-users mentioned about the
> : SJIS trouble.)
>
> Force the end-user not to use Shift-JIS.

Um, you don't have to do that, since YOU as the programmer decide what to use.

> It's a brain-dead format used only for internal processing purposes
> and not meant as a for-the-public encoding method. Stick with
> something nice like normal JIS or Unicode.

"Brain-dead format" compared to JIS? Hehe, maybe you're confused ;) Besides, I guess more than half of Japanese sites are written in shift_jis.

Result of a quick Google search: http://www.io.com/~kazushi/encoding/

Anyway, the easiest way (that I've found so far) when working with PHP and databases (MySQL, etc.) is to use euc-jp. There are times, though, when you are forced to use shift_jis, e.g. when working on sites for i-mode browsers. If that's the case, just use the mb_* functions to convert from euc-jp to shift_jis...

- E -
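The conversion mentioned above (keep the data in euc-jp, convert to shift_jis only on output) might look roughly like this in a current PHP release with mbstring enabled. This is a minimal sketch, not code from the thread: the helper name `eucjp_to_sjis` and the sample string are made up for illustration, and the type declarations would not run on the PHP 4 of the time.

```php
<?php
// Hypothetical helper: convert text stored as EUC-JP to Shift_JIS for output
// (for example, to an i-mode handset). Requires the mbstring extension.
function eucjp_to_sjis(string $eucText): string
{
    // Argument order: string, target encoding, source encoding.
    return mb_convert_encoding($eucText, 'SJIS', 'EUC-JP');
}

// Build an EUC-JP sample from a UTF-8 literal (this file is assumed to be UTF-8).
$utf8 = 'カナ';
$euc  = mb_convert_encoding($utf8, 'EUC-JP', 'UTF-8');
$sjis = eucjp_to_sjis($euc);

// Both legacy encodings use two bytes per full-width katakana character,
// but the byte values differ between them.
echo bin2hex($euc), ' ', bin2hex($sjis), "\n";
```

Converting only at the output boundary keeps a single internal encoding, so the rest of the application never has to guess which one a string is in.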
Re: [PHP] Japanese character validation
On Sat, Nov 08, 2003 at 06:26:39PM +0900, - Edwin - wrote:
:
: On Fri, 7 Nov 2003 13:43:06 -0600 Eugene wrote:
:
: > Actually, kana are not simplified kanji because it is not
: > the case that kana can replace kanji while preserving the
: > exact same meaning. In fact, most kana by themselves have
: > no meaning.
:
: Well, I'm sure there's a very good reason why the dictionary
: I quoted called it simplified kanji.

I disagree with the term "simplified kanji". The kana may have been derived from kanji and evolved over the centuries, but they are no longer kanji in the sense that they carry any intrinsic meaning by themselves. Nor can they replace kanji in meaning and function. They are just phonetic alphabets.
RE: [PHP] Japanese character validation
In hopes of bringing the kanji character validation issue back on topic, can I point out that it doesn't matter *why* someone would want to do this, or what the origins of kanji and kana are?

The motivations of the original poster shouldn't be in question. Everyone has their own situations and goals, and what's not important to one person is important to others. Either what they are asking for is possible or not, and if it is, it would be enlightening to know how.

I for one am also very interested in hearing possible solutions. I can think of multiple situations in which checking to see whether a user inputted kanji or kana would be very useful indeed. And I hope to learn more by further discussion of the PHP coding required. It would be a shame if that potential learning was obscured or lost in off-topic theorizing about the origins of the Japanese language.

Optimistically looking forward to seeing more technical discussion on how to accomplish this.

--
Cheers!
Dave G
[EMAIL PROTECTED]
Re: [PHP] Japanese character validation
On 2003.11.8, at 20:32 Asia/Tokyo, Eugene Lee wrote:

> On Sat, Nov 08, 2003 at 06:26:39PM +0900, - Edwin - wrote:
> :
> : On Fri, 7 Nov 2003 13:43:06 -0600 Eugene wrote:
> :
> : > Actually, kana are not simplified kanji because it is not
> : > the case that kana can replace kanji while preserving the
> : > exact same meaning. In fact, most kana by themselves have
> : > no meaning.
> :
> : Well, I'm sure there's a very good reason why the dictionary
> : I quoted called it simplified kanji.
>
> I disagree with the term simplified kanji.

regex - regular expressions

Um, what's so regular about it again?

> The kana may have been derived from kanji and evolved over the
> centuries, but they are no longer kanji in the sense that they carry
> any intrinsic meaning by themselves.

?? Who said that they are kanji? Kanji are Chinese characters whereas kana are Japanese characters.

> Nor can they replace kanji in meaning and function. They are just
> phonetic alphabets.

Did I say otherwise?

- E -

PS Maybe you can complain here: http://dictionary.reference.com/search?q=kana
Re: [PHP] Japanese character validation
On 2003.11.8, at 21:51 Asia/Tokyo, Dave G wrote:

> In hopes of bringing the kanji character validation issue back on
> topic, can I point out that it doesn't matter *why* someone would
> want to do this, or what the origins of kanji and kana are?
> The motivations of the original poster shouldn't be in question.
> Everyone has their own situations and goals, and what's not
> important to one person is important to others. Either what they
> are asking for is possible or not, and if it is, it would be
> enlightening to know how.

I think it'd be scary if a doctor gave you medicine just because you said you had a headache. I'm sure a good doctor would ask questions to diagnose what may be causing it. He might even send you home without giving you any medicine. Don't be surprised if your headache is gone the next morning.

> I for one am also very interested in hearing possible solutions. I
> can think of multiple situations in which checking to see whether a
> user inputted kanji or kana would be very useful indeed.

Like, for example?

> And I hope to learn more by further discussion of the PHP coding
> required. It would be a shame if that potential learning was
> obscured or lost in off topic theorizing about the origins of the
> Japanese language.

First, I'm not theorizing. This is written in history books.

> Optimistically looking forward to seeing more technical discussion
> on how to accomplish this.

Secondly, didn't I mention that this was recently brought up in the Japanese PHP Group ML? And that there's really NO easy solution for it?

http://ns1.php.gr.jp/pipermail/php-users/2003-October/019236.html

In fact, that problem is already an FAQ in that ML, and in other MLs as well, since the need is not limited to PHP programmers.

- E -
Re: [PHP] Japanese character validation
On Sat, Nov 08, 2003 at 11:20:27PM +0900, - Edwin - wrote:
:
: On 2003.11.8, at 20:32 Asia/Tokyo, Eugene Lee wrote:
:
: > On Sat, Nov 08, 2003 at 06:26:39PM +0900, - Edwin - wrote:
: > :
: > : Well, I'm sure there's a very good reason why the dictionary
: > : I quoted called it simplified kanji.
: >
: > I disagree with the term simplified kanji.
: [...]
: > The kana may have been
: > derived from kanji and evolved over the centuries, but they are no
: > longer kanji in the sense that they carry any intrinsic meaning by
: > themselves.
:
: ?? Who said that they are kanji?

Edwin, you quoted the American Heritage Dictionary:

    kana: (quoted from the American Heritage Dictionary)
    1. Japanese syllabic writing. The characters are simplified kanji

http://www.phparch.com/mailinglists/msg.php?a=729121s=japanese+characterp=g=

I am simply explaining that part of the dictionary definition is incorrect. The statement above, "The characters are simplified kanji", is equal to the statement "kana are simplified kanji". The ISA relationship between kana and kanji is false and does not exist.

However, I do agree without question that kana evolved from kanji.
Re: [PHP] Japanese character validation
Hi,

On Fri, 7 Nov 2003 12:58:51 +0530 umesh [EMAIL PROTECTED] wrote:

> Hi Gurus,
>
> I am new to PHP. I am using PHP4 on Linux.
> I have to accept input from the user and check that the input
> consists of Japanese characters only. For example, if a name is
> entered, I need to check that it is made up of Hiragana, Katakana
> or Kanji.

Hmm... why would you like to do that? I've never really seen the need for that. I'm sure there will be a lot of issues if you want to write one yourself...

> I have enabled multibyte support while compiling PHP.
> As there is jcode.pl in Perl, I want to know if there is something
> similar in PHP.
> Can anybody help me in this regard?

That's a library for character code conversion. PHP can do that as well:

http://www.php.net/mbstring

- E -
RE: [PHP] Japanese character validation
Hi Edwin,

-----Original Message-----
From: - Edwin - [mailto:[EMAIL PROTECTED]
Sent: Friday, November 07, 2003 2:02 PM
To: umesh
Cc: [EMAIL PROTECTED]
Subject: Re: [PHP] Japanese character validation

> Hi,
>
> On Fri, 7 Nov 2003 12:58:51 +0530 umesh [EMAIL PROTECTED] wrote:
>
> > Hi Gurus,
> >
> > I am new to PHP. I am using PHP4 on Linux.
> > I have to accept input from the user and check that the input
> > consists of Japanese characters only. For example, if a name is
> > entered, I need to check that it is made up of Hiragana, Katakana
> > or Kanji.
>
> Hmm... why would you like to do that? I've never really seen the
> need for that.

My forms have fields called "Last, First Name (Kanji)" and "Last, First Name (Kana)". It's a requirement of the application. It would be great if I could do that.

> I'm sure there will be a lot of issues if you want to write one
> yourself...

Actually, I have tried to do it, but it won't understand some characters. Like .

> > I have enabled multibyte support while compiling PHP.
> > As there is jcode.pl in Perl, I want to know if there is something
> > similar in PHP.
> > Can anybody help me in this regard?
>
> That's a library for character code conversion. PHP can do that as
> well:
>
> http://www.php.net/mbstring
>
> - E -
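For readers wondering what the check umesh describes could look like, one possible approach is to convert the input to UTF-8 first and then use PCRE's Unicode script properties. This is only a sketch under those assumptions, for a current PHP release (Unicode properties were not available to PHP 4's PCRE at the time); the function names are invented for illustration, and nobody in the thread proposed this code.

```php
<?php
// Assumption: $value has already been converted to UTF-8.

// True when every character is a kanji (Unicode script "Han").
function is_kanji_only(string $value): bool
{
    return preg_match('/\A\p{Han}+\z/u', $value) === 1;
}

// True when every character is hiragana, katakana, or the long-vowel
// mark U+30FC (which Unicode classifies as "Common", so it is listed
// explicitly).
function is_kana_only(string $value): bool
{
    return preg_match('/\A[\p{Hiragana}\p{Katakana}\x{30FC}]+\z/u', $value) === 1;
}

var_dump(is_kanji_only('山田'));   // kanji field
var_dump(is_kana_only('やまだ'));  // kana reading field
```

A strict `is_kanji_only` check on a name field would reject names written in kana or in Latin letters, so it may be wiser to treat a failed check as a warning rather than a hard error.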
Re: [PHP] Japanese character validation
umesh wrote:

> My forms have fields called "Last, First Name (Kanji)" and
> "Last, First Name (Kana)". It's a requirement of the application.
> It would be great if I could do that.

Are Kanji and Kana character sets? A form cannot use more than one charset. The charset that your script receives will be the same as the charset the page uses.

Marek
RE: [PHP] Japanese character validation
No, Kanji and Kana are not charsets. The form uses the EUC-JP charset.

Umesh.

-----Original Message-----
From: Marek Kilimajer [mailto:[EMAIL PROTECTED]
Sent: Friday, November 07, 2003 3:08 PM
To: umesh
Cc: [EMAIL PROTECTED]
Subject: Re: [PHP] Japanese character validation

> umesh wrote:
>
> > My forms have fields called "Last, First Name (Kanji)" and
> > "Last, First Name (Kana)". It's a requirement of the application.
> > It would be great if I could do that.
>
> Are Kanji and Kana character sets? A form cannot use more than one
> charset. The charset that your script receives will be the same as
> the charset the page uses.
>
> Marek
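Since the script receives bytes in the page's charset (EUC-JP here), a first step before any character-class check could be to confirm the bytes are valid in that charset and convert them to one internal encoding. A minimal sketch for a current PHP release with mbstring; the helper name `form_value_to_utf8` is hypothetical and not something from the thread.

```php
<?php
// Hypothetical helper: return the submitted value as UTF-8, or null if the
// bytes are not valid in the charset the form page was served with.
function form_value_to_utf8(string $raw, string $pageCharset = 'EUC-JP'): ?string
{
    if (!mb_check_encoding($raw, $pageCharset)) {
        return null; // malformed input, or sent in a different charset
    }
    return mb_convert_encoding($raw, 'UTF-8', $pageCharset);
}

// Simulate a value posted from an EUC-JP page (this file is assumed UTF-8).
$posted = mb_convert_encoding('山田', 'EUC-JP', 'UTF-8');
var_dump(form_value_to_utf8($posted));
```

Rejecting invalid byte sequences up front means later regular-expression checks never run against half-decoded characters.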
Re: [PHP] Japanese character validation
Howdy,

> > I have to accept input from the user and check that the input
> > consists of Japanese characters only. For example, if a name is
> > entered, I need to check that it is made up of Hiragana, Katakana
> > or Kanji.
>
> Hmm... why would you like to do that? I've never really seen the
> need for that.

It's actually quite common for forms here in Japan (both online and paper) to ask the user for the phonetic reading, in Hiragana or Katakana, of their Kanji name. (Especially for first names, where there can be a lot of latitude in the reading even if the Kanji used have other, more conventional readings. Likewise for some uncommon place names.)

Online, it can also be handy when you want to catch and filter out numbers or symbols entered in their double-byte forms and accept only the Hankaku (half-width) ASCII forms, though these can usually be validated more directly with an A-Z, 0-9, etc. regular expression.

We currently validate our form inputs on the client side using Javascript functions before accepting them server-side. When I have time, I'll look into what's possible with the mb_ereg/mb_regex functions.

Umesh, if you'd like a copy of our Javascript functions, please contact me offlist.

Hope this helps,
Lew
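The Hankaku check described above can be sketched on the server side as well. The example below assumes UTF-8 input and a current PHP release with mbstring; the function names are invented for illustration, and this is not the Javascript Lew refers to. It first folds Zenkaku (full-width) letters and digits to their Hankaku forms, then applies the plain A-Z, 0-9 pattern.

```php
<?php
// Assumption: input is UTF-8.

// Fold full-width (zenkaku) letters and digits to half-width (hankaku).
function normalize_alnum(string $value): string
{
    // Option 'a' converts zenkaku alphanumerics to hankaku.
    return mb_convert_kana($value, 'a', 'UTF-8');
}

// True when the value consists only of ASCII letters and digits.
function is_hankaku_alnum(string $value): bool
{
    return preg_match('/\A[A-Za-z0-9]+\z/', $value) === 1;
}

$input = 'ＡＢＣ１２３';                 // typed with a Japanese IME in zenkaku
var_dump(is_hankaku_alnum($input));                   // rejected as typed
var_dump(is_hankaku_alnum(normalize_alnum($input)));  // accepted after folding
```

Whether to fold silently or to reject and ask the user to retype is a design choice; folding tends to be friendlier for users of a Japanese input method.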
Re: [PHP] Japanese character validation
Hi,

On 2003.11.7, at 18:20 Asia/Tokyo, umesh wrote:

...[snip]...

> As there are fields called Last First Name (Kanji) and Last First
> name (Kana) on my forms. It's the need of the application.
> It would be great, If I could do that.

Let's say you can do that. What happens if my name doesn't have a Kanji? You'll just reject it? (Since you're validating that a certain field *should* just be Kanji.)

> Actually, I have tried to do it, but it won't understand some
> characters. Like $B)!(B.

Well, that's "one of the issues" I was trying to tell you about...

- E -
Re: [PHP] Japanese character validation
On 2003.11.7, at 18:37 Asia/Tokyo, Marek Kilimajer wrote:

...[snip]...

> Are Kanji and Kana character sets?

Kan - Chinese + ji - character

kana: (quoted from the American Heritage Dictionary)

1. Japanese syllabic writing. The characters are simplified kanji and are usually used with kanji primarily to write inflections, particles, and function words and to show the pronunciations of some kanji and of all foreign words.

So, no, they're not the character sets *we* usually refer to (i.e. euc-jp, shift_jis, etc.)

--

- E -
Re: [PHP] Japanese character validation
Hi,

On 2003.11.8, at 00:42 Asia/Tokyo, Lew Mark-Andrews wrote:

> Howdy,
>
> > > I have to accept input from the user and check that the input
> > > consists of Japanese characters only. For example, if a name is
> > > entered, I need to check that it is made up of Hiragana,
> > > Katakana or Kanji.
> >
> > Hmm... why would you like to do that? I've never really seen the
> > need for that.
>
> It's actually quite common for forms here in Japan (both online and
> paper) to ask the user for the phonetic reading, in Hiragana or
> Katakana, of their Kanji name. (Especially for first names, where
> there can be a lot of latitude in the reading even if the Kanji
> used have other, more conventional readings. Likewise for some
> uncommon place names.)

True. But...

> Online, it can also be handy when you want to catch and filter out
> numbers or symbols entered in their double-byte forms and accept
> only the Hankaku ASCII forms, though these can usually be validated
> more directly with an A-Z, 0-9, etc. regular expression.

Well, just like you said, Hankaku can easily be validated. But how and WHY would you want to validate whether a certain field is Kanji, katakana, or hiragana? And limit certain fields to each of those?

Think what would happen if you accepted only Kanji in a certain field. Here are some issues you'd face:

1. Not all Japanese have Kanji in their names.
2. Not all who can read Japanese have Japanese names. ;)

> We currently validate our form inputs on the client side using
> Javascript functions before accepting them server-side. When I have
> time, I'll look into what's possible with the mb_ereg/mb_regex
> functions.
> Umesh, if you'd like a copy of our Javascript functions, please
> contact me offlist.

Hmm... interesting, but whether you validate with Javascript and/or PHP, you'll face the same problem I mentioned above.

Besides, there are some issues (for example with Shift_JIS) that bother (with no easy solution) even members of the Japanese PHP Group ML. (Like the recent thread [PHP-users 18803] on http://www.php.gr.jp/ or http://ns1.php.gr.jp/mailman/listinfo/php-users that mentioned the SJIS trouble.)

There was actually a similar topic on validation (in the Japanese ML), but there is no easy solution, especially for those platform-dependent (read: Windows) characters. This was also mentioned by the OP (the "numbers inside circles" character problem).

Well, I'm satisfied (for now) with the mb_ functions...

- E -
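The "numbers inside circles" are a typical case of platform-dependent characters: they come from vendor extensions rather than from the base JIS X 0208 set, so a validator built around the standard tables may not recognise them. One hedged way to handle them, assuming the input has already reached the script as UTF-8, is to detect them explicitly and decide what to do. The function name below is hypothetical, and the range covers only the circled numbers 1 to 20 (U+2460 to U+2473), not every vendor-specific character.

```php
<?php
// Assumption: $value is UTF-8. Returns true if it contains a circled
// number in the range U+2460 (circled 1) to U+2473 (circled 20).
function has_circled_number(string $value): bool
{
    return preg_match('/[\x{2460}-\x{2473}]/u', $value) === 1;
}

var_dump(has_circled_number('第①章'));
var_dump(has_circled_number('第1章'));
```

Flagging such characters separately lets the application show a specific message ("please use plain digits") instead of a generic validation failure.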
Re: [PHP] Japanese character validation
On Sat, Nov 08, 2003 at 02:20:00AM +0900, - Edwin - wrote:
:
: Besides, there are some issues (for example with Shift_JIS) that
: bother (with no easy solution) even members of the Japanese PHP
: Group ML. (Like the recent thread [PHP-users 18803] on
: http://www.php.gr.jp/ or
: http://ns1.php.gr.jp/mailman/listinfo/php-users mentioned about the
: SJIS trouble.)

Force the end-user not to use Shift-JIS. It's a brain-dead format used only for internal processing purposes and not meant as a for-the-public encoding method. Stick with something nice like normal JIS or Unicode.
Re: [PHP] Japanese character validation
On Sat, Nov 08, 2003 at 01:35:40AM +0900, - Edwin - wrote:
:
: On 2003.11.7, at 18:37 Asia/Tokyo, Marek Kilimajer wrote:
:
: ...[snip]...
:
: > Are Kanji and Kana character sets?
:
: Kan - Chinese + ji - character
:
: kana: (quoted from the American Heritage Dictionary)
: 1. Japanese syllabic writing. The characters are simplified kanji

Actually, kana are not simplified kanji because it is not the case that kana can replace kanji while preserving the exact same meaning. In fact, most kana by themselves have no meaning.