On Feb 3, 8:36 pm, [EMAIL PROTECTED] (Chas. Owens) wrote:
> On Feb 3, 2008 12:07 PM, <[EMAIL PROTECTED]> wrote:
>
> > On Feb 2, 11:10 pm, [EMAIL PROTECTED] (John W. Krahn) wrote:
> > > [EMAIL PROTECTED] wrote:
> > > > I have a program with a line like
>
> > > > while (<FILE>) {
> > > >     if (/stuff/i) {
> > > >         print;
> > > >     }
> > > > }
>
> > > > When I run the program, and I replace "stuff" with only one character,
> > > > like "d", it works exactly as I expect. But if instead of using "d", I
> > > > use "da" or "date" (which I know are in FILE, because it's a text file
> > > > I made) nothing prints on the screen. I've also tried to have it print
> > > > to another file, and that's turned out blank too.
>
> > > > What am I doing wrong?
>
> > > My guess would be that you are creating a UTF-16LE text file on Windows
> > > and trying to read it on your Mac?
>
> > > John
> > > --
> > > Perl isn't a toolbox, but a small machine shop where you
> > > can special-order certain sorts of tools at low cost and
> > > in short order. -- Larry Wall
>
> > John,
>
> > I am using a Mac. I don't know what kind of Text file it is, because
> > it's being created by Automator as a scrape of a website. If the
> > encoding is the problem, how do I work that?
>
> > Yes, I realize that the problem I'm describing is really weird and the
> > code I posted should work perfectly. So it's probably not my coding
> > but something to do with the system I'm on (OS X 10.4.11) or the file
> > I'm working with (a .txt).
>
> snip
>
> Extensions don't matter in UNIX (and OS X is a UNIX). Step one to
> determining what a file contains is running the file command against
> it like this:
>
> file foo.txt
>
> This will give you a better idea what you are working with. The next
> step is looking at the header of the HTML, it will probably tell you
> exactly what encoding is being used. It should look something like
> this:
>
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
>
> In that case the file is encoded with UTF-8.

Here's what I got from

file file.txt

file.txt: Big-endian UTF-16 Unicode English character data, with very long lines, with CRLF, CR, LF line terminators

Does this explain why my regexp search wasn't working?
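
For what it's worth, here is a minimal sketch of why a UTF-16 file would behave the way described: each ASCII letter is stored as two bytes, so a one-letter pattern can still match the raw bytes while "da" or "date" cannot. The sample string is made up (the real file's contents aren't shown in this thread), and it assumes the core Encode module and that the file starts with a byte order mark, as the file output suggests.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample line; stands in for a line of the real file.
my $text = "update\n";

# Encode as UTF-16 with a byte order mark. Encode writes big-endian here,
# which is what file(1) reported for file.txt.
my $bytes = encode('UTF-16', $text);

# Matching against the raw bytes: "d" is present as a single byte, but the
# letters of "date" are separated by NUL bytes, so the longer pattern fails.
my $raw_d    = $bytes =~ /d/i    ? 1 : 0;
my $raw_date = $bytes =~ /date/i ? 1 : 0;

# Decoding first turns the bytes back into characters, and the match works.
my $chars        = decode('UTF-16', $bytes);
my $decoded_date = $chars =~ /date/i ? 1 : 0;

print "raw /d/: $raw_d, raw /date/: $raw_date, decoded /date/: $decoded_date\n";
```

For the real file, the same decoding can be done while reading by opening it with an encoding layer, for example `open my $fh, '<:encoding(UTF-16)', 'file.txt' or die "Cannot open file.txt: $!";`, and then looping over `<$fh>` as in the original code. Since file also reports bare CR line terminators, some "lines" may come back longer than expected, because readline splits on "\n" by default.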