On Sun, Aug 19, 2001 at 04:49:30AM +0900, Pakhun Alhaca wrote:
> open(FILEIN, "C:\\corpus.txt");
> open(FILEOUT, "C:\\japanesesentences.txt");
> $/="o" || "p" || "q"; #three of them written originally
This assigns "o" to $/. For one, you can't have multiple things to split on
in $/; for another, your syntax is incorrect: the expression
"o" || "p" || "q" evaluates to the first true value in the sequence, which
is "o".
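A quick way to check both points yourself is a sketch like the one below. The sample string is made up, and o, p and q merely stand in for your punctuation marks; the in-memory filehandle is only there so the script runs without your corpus file.

```perl
use strict;
use warnings;

# || returns the first operand that is true, so the chain collapses to "o".
my $separator = "o" || "p" || "q";
print "separator: $separator\n";

# $/ is one literal string, not a pattern and not a list of alternatives.
my $text = "aaobbpccqdd";    # hypothetical sample data
open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = $separator;   # restored automatically at the end of the block
    @records = <$in>;
}
close($in);

# Only "o" ends a record; "p" and "q" are left inside the second record.
print "record: $_\n" for @records;
```

Here the input is cut after the "o" only, which is why setting $/ can't do what you want when there are several marks.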
> in Japanese - other scripts have no problems with it...
> foreach (<>) { print FILEOUT $_; }
> close(FILEIN); close(FILEOUT); exit;
>
> foreach (<FILEIN>) {
> $_=~ s/"o" || "p" || "q"/\n/;
You almost have your solution here, but you're using the wrong syntax; what
you were trying for is s/o|p|q/\n/. However, with that you lose your
punctuation, so you should probably use something along the lines of
s/([opq])/$1\n/g.
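To see the difference between the two substitutions, here is a small sketch on a made-up line (again, o, p and q are placeholders for the real punctuation, and the variable names are mine):

```perl
use strict;
use warnings;

my $line = "aaobbpccqdd";    # hypothetical sample data

# No capture and no /g: only the first mark is replaced, and it is lost.
(my $first_only = $line) =~ s/o|p|q/\n/;

# Capture plus /g: every mark is kept and followed by a newline.
(my $all_marks = $line) =~ s/([opq])/$1\n/g;

print "first only:\n$first_only\n\n";
print "all marks:\n$all_marks\n";
```

The character class [opq] matches the same things as o|p|q for single characters; the parentheses are what let you put the mark back with $1.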
Unfortunately, that doesn't solve your problem; it's just a method for
testing. If you want the third element of every line, based on the
punctuation split, you should be using:
$japanese = (split(/[opq]/, $_))[2];
If I understand you correctly.
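As a rough illustration of the split approach, assuming each line holds several pieces separated by one of the marks (the sample lines and the @third array are invented for the example):

```perl
use strict;
use warnings;

my @lines = ("AAoBBpCCqDD", "EEpFFqGGoHH");    # hypothetical sample data

my @third;
for my $line (@lines) {
    # split returns a list; the [2] slice picks its third field.
    my $japanese = (split(/[opq]/, $line))[2];

    # Lines with fewer than three fields give undef, so skip those.
    push @third, $japanese if defined $japanese;
}

print "$_\n" for @third;
```

Note that split throws the separators away, so if you need the mark that ended the piece you would have to capture it in the pattern or use the substitution instead.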
> print FILEOUT $_;
> }
Judging from your question and your attempted solutions you don't have, or
haven't read, decent basic learning material on Perl. You should consider
purchasing and/or reading _Beginning Perl_ by Simon Cozens, _Learning Perl_
by Randal Schwartz, and possibly _Mastering Regular Expressions_ by Jeffrey
Friedl.
At the very least, consult perldoc perlre, perldoc perlop, and perldoc -f
split for documentation on regexes, s///, and split, respectively.
Michael
--
Administrator www.shoebox.net
Programmer, System Administrator www.gallanttech.com