My Dear Perl-Friends!

I tried to get help with this on IRC's #perl channel, but none of the hints I was given helped, probably because I didn't explain it well enough, so I am trying here. I have a corpus of over 110 MB written mostly in three languages: Japanese, Chinese and English. I'd like to reduce it to a Japanese-only corpus.

Let's say that Japanese characters are represented by the letters ABC and Chinese ones by XYZ. My problem also lies in the fact that neither of these languages uses spaces, so I decided to concentrate on the Japanese period, exclamation mark and question mark characters (let's say o, p and q). So the basic corpus.txt looks like this:

    ...
    XYZZYXYXYZYZYX This is a pen. XYZYZ XYZ ZYXYZ This is a cat. ABCCBAoCABACBq XYZYX \n
    XYZZYABCCBAoXYXYZYZYX This is a pen. XYZYZ XYZ ZYXYZ This is a cat. ABCCBApCABACBo XYZYX \n
    ...

What I need is a japanesesentences.txt containing only:

    ...
    ABCCBAo
    CABACBq
    ABCCBAo
    ABCCBAp
    CABACBo
    ...

Telling Perl-san that A-C is Japanese and X-Z is Chinese was much too high for my beginner level, so I tried many times to tell Mr Input_Record_Separator to be $/="." or "!" or "?" foreach (<>), but something must be wrong with my grammar, which basically looks like this:

    open(FILEIN, "C:\\corpus.txt");
    open(FILEOUT, "C:\\japanesesentences.txt");
    $/="o" || "p" || "q"; #three of them written originally in Japanese - other scripts have no problems with it...
    foreach (<>) {
        print FILEOUT $_;
    }
    close(FILEIN);
    close(FILEOUT);
    exit;

Because it showed no sign of working properly, I got so desperate that I decided just to "press Enter" after every ".", "?" and "!":

    foreach (<FILEIN>) {
        $_=~ s/"o" || "p" || "q"/\n/;
        print FILEOUT $_;
    }

...but it didn't work either... am I too silly for Powerfully Erotic Randal's Language?

Alaca

PS: Sorry to make it so long, but when I explain in two sentences people advise me to use grep :-)

PS: Many times when I read "Programming Perl" I have problems with "dry explanations" (meaning without many examples). Is there any website showing some little scripts, or do I have to buy the thick "Perl Cookbook"?
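
For illustration only, here is a minimal sketch of the extraction described above, using the same placeholder letters as the post (A-C for Japanese characters, o/p/q for the Japanese sentence-ending marks). The helper name japanese_sentences and the two in-memory sample lines are hypothetical. On the real corpus the two character classes would have to be replaced with the actual Japanese characters and marks, which is an assumption not worked out here.

```perl
use strict;
use warnings;

# Placeholder alphabets from the post (assumptions, not real Japanese):
# A-C stand for Japanese characters, o/p/q for the sentence-ending marks.
my $japanese   = qr/[ABC]/;
my $terminator = qr/[opq]/;

# Hypothetical helper: return every run of "Japanese" characters
# that is closed by one of the terminators, in order of appearance.
sub japanese_sentences {
    my ($text) = @_;
    return $text =~ /($japanese+$terminator)/g;
}

# The two sample lines from the post, standing in for corpus.txt.
my @corpus = (
    "XYZZYXYXYZYZYX This is a pen. XYZYZ XYZ ZYXYZ This is a cat. ABCCBAoCABACBq XYZYX\n",
    "XYZZYABCCBAoXYXYZYZYX This is a pen. XYZYZ XYZ ZYXYZ This is a cat. ABCCBApCABACBo XYZYX\n",
);

# Collect the matches from every line and print one sentence per line.
my @sentences = map { japanese_sentences($_) } @corpus;
print "$_\n" for @sentences;
```

Matching with a regular expression in list context sidesteps $/, which holds a single literal string rather than a set of alternatives. To run this over a file, the loop would read from a filehandle opened for reading and print to one opened for writing (with ">").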