On Sun, Aug 19, 2001 at 04:49:30AM +0900, Pakhun Alhaca wrote:
> open(FILEIN, "C:\\corpus.txt"); 
> open(FILEOUT, "C:\\japanesesentences.txt");
> $/="o" || "p" || "q";  #three of them written originally

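This assigns "o" to $/, not a set of three separators.  For one, you can't
specify multiple things to split on in $/, because it holds a single literal
string rather than a pattern; for another, your syntax is incorrect: the
expression '"o" || "p" || "q"' evaluates to the first true value in the
sequence, which is "o".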
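
To see what the || chain and $/ actually do, here is a minimal sketch.  The
sample string and the in-memory filehandle are my own assumptions for
illustration (they need perl 5.8 or later); they are not part of your script.

```perl
use strict;
use warnings;

# || returns its first true operand, so the chain collapses to one string.
my $sep = "o" || "p" || "q";
print "separator: $sep\n";    # separator: o

# $/ holds one literal string, never a pattern.  Reading from an
# in-memory filehandle (hypothetical sample data) shows the effect.
my $data = "fooparq";
open(my $in, '<', \$data) or die "cannot open in-memory handle: $!";
my @records;
{
    local $/ = $sep;          # keep the change confined to this block
    @records = <$in>;
}
close($in);

# Each record ends at an "o"; "p" and "q" are treated as ordinary text.
print "[$_]\n" for @records;
```

Only "o" ends a record here, which is why the other two terminators never
take effect.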


> in Japanese - other scripts have no problems with it...
> foreach (<>) {  print FILEOUT $_; } 
> close(FILEIN); close(FILEOUT); exit;
> 
> foreach (<FILEIN>) { 
> $_=~ s/"o" || "p" || "q"/\n/;

You almost have your solution here, but you're using the wrong syntax; what
you were trying for is s/o|p|q/\n/.  However, with that you lose your
punctuation, so you should probably use something along the lines of
s/([opq])/$1\n/g.
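
As a rough sketch of that substitution, assuming "o", "p" and "q" stand in
for your real terminators and using a made-up sample line:

```perl
use strict;
use warnings;

# Hypothetical sample line; "o", "p" and "q" play the role of terminators.
my $line = "abcodefpghiq";

# The character class matches any one terminator, the capture puts it
# back in front of the newline, and /g repeats this along the whole line.
(my $broken = $line) =~ s/([opq])/$1\n/g;

print $broken;
```

Without the parentheses and $1 the terminator is discarded, and without /g
only the first match on the line is replaced.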

Unfortunately, that doesn't solve your problem; it's just a method for
testing.  If I understand you correctly and you want the third element of
each line, based on the punctuation split, you should be using:

    $japanese = (split(/[opq]/, $_))[2];
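
A small sketch of how that list slice behaves, again with a made-up sample
line (the variable names other than $japanese are only for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample line with four fields separated by o, p and q.
my $line = "AAAoBBBpCCCqDDD";

# split returns the pieces between the terminators; the terminators
# themselves are dropped.
my @fields = split /[opq]/, $line;

# Indexing the list with [2] picks the third piece (indices start at 0).
my $japanese = (split /[opq]/, $line)[2];

print "fields: @fields\n";
print "third:  $japanese\n";
```

If a line has fewer than three pieces, the slice yields undef, so you may
want to check for that before printing.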


> print FILEOUT $_;
> } 


Judging from your question and your attempted solutions, you don't have, or
haven't read, decent basic learning material on Perl.  You should consider
purchasing and/or reading _Beginning Perl_ by Simon Cozens, _Learning Perl_
by Randal Schwartz, and possibly _Mastering Regular Expressions_ by Jeffrey
Friedl.

At the very least, consult perldoc perlre, perldoc perlop, and perldoc -f
split for documentation on regexes, s///, and split, respectively.


Michael
--
Administrator                      www.shoebox.net
Programmer, System Administrator   www.gallanttech.com
--
