e-letter wrote:
> Bob Proulx wrote:
> > Sounds like a homework assignment.
>
> Well, my own assignment in my own home (managing files)!
Okay then.  You can imagine that a lot of students try to short
circuit things!

> > They can't be.  It doesn't make sense for comm.  Perhaps you should
> > be using 'grep' or 'sed'?
>
> Had a brief look at the manual for 'bash' which makes reference to
> grep, but so far awk and perl are supposed to be options.  Too many
> choices!

Perhaps if you describe your problem in more detail we on the mailing
list could help better?  So far you have said:

> File 1 contains data:
> /some/text/abcd.xyz
>
> File 2:
> abcd.xyz

The first contains fully qualified paths to files and the second one
contains only basenames.  Okay.

> The task is to be able to compare file1 to file2 using a regular
> expression as a criterion for comparison, such as:
>
> *cd.xyz

That looks like a file glob.  It is called globbing because the '*'
matches a glob of characters.  You can get more documentation on that
style of pattern with 'man 7 glob'.  Most importantly, globbing is not
a regular expression.  They have different syntaxes from each other.
The command line uses file globbing.  This would match the file in the
current directory from the command line:

  $ ls *cd.xyz

> Then create a new file 'file3' that contains only those lines that
> satisfy the regular expression, but must contain the same format style
> as in file1.

I suggested this:

  $ grep -F cd.xyz file1 > file3

In the above I suggested -F to turn off regular expressions and to use
the string literally.  Otherwise the '.' matches any character and
should be escaped.  Escaping the dot would be "cd\.xyz".  However that
may be difficult if you have a file of partial filenames.  You would
need to preprocess it first.  It is easier if you can take them as
literal strings.

Or perhaps:

  $ grep -F -f file2 file1 > file3

I wasn't sure what you were asking for.  The above uses file2 as a
list of strings to select lines from file1.  The -f option takes a
file argument.  The patterns will be full regular expressions unless
-F is given too.
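To make the two approaches concrete, here is a small sketch.  The file
names and sample contents are just the ones from the thread (plus one
extra made-up line so there is something to filter out); the sed step
is one possible way to do the preprocessing mentioned above, escaping
each '.' so the lines can be used as regular expressions:

```shell
# Build the sample files from the thread (second file1 line is an
# assumed extra entry so the filter has something to exclude).
printf '%s\n' '/some/text/abcd.xyz' '/other/qq.txt' > file1
printf '%s\n' 'abcd.xyz' > file2

# Fixed-string matching: -F treats each line of file2 literally.
grep -F -f file2 file1 > file3
cat file3

# If regex matching is wanted instead, escape the dots first:
sed 's/\./\\./g' file2 > file2.re
grep -f file2.re file1
```

Both invocations print only /some/text/abcd.xyz here, but the -F form
needs no preprocessing at all, which is why it was suggested first.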
It seems like that will give you exactly what you want.

> Had a brief look at the manual for 'bash' which makes reference to
> grep, but so far awk and perl are supposed to be options.  Too many
> choices!

Bash is a command line shell that also has features that make it good
for controlling other programs.  It is an extension of the /bin/sh
shell.  But it really only has simple strings for data structures.
For simple things it is great.  For more complex things, languages
like Perl (and similarly Ruby, Python and Awk) are better since they
have complex data structures and can do virtually anything you can
think of doing without language limitations.

Awk is the original little scripting language from way back and it is
therefore the most standard and portable.  If I am doing something
relatively simple then I use awk since it can be written to run
everywhere.  But it does have some quirks that make it difficult to
use for larger programs.

Perl appeared on the scene some time later.  It combined a lot of
syntax from the shell, sed, grep, awk, and other utilities.  Its
popularity has probably peaked; it was more popular a few years ago.
The syntax isn't clean and it has a lot of special cases.  This has
caused many people to prefer the newer languages Python and Ruby,
which have appeared on the scene more recently.

How to choose?  If 'grep -F -f file2 file1 > file3' does what you
want then I would definitely use the shell and stop there.  If not
then I would move on to Perl.  Others would suggest Python or Ruby
instead.

Bob
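For comparison, here is one possible awk version of the same task.
It is a sketch, not anything from the thread: it reads the basenames
from file2 into an array, then prints each line of file1 whose last
path component is in that array (so it matches whole basenames rather
than substrings, which is slightly stricter than grep -F):

```shell
# Recreate the sample files from the thread (the /other/qq.txt line
# is an assumed extra entry so the filter has something to exclude).
printf '%s\n' '/some/text/abcd.xyz' '/other/qq.txt' > file1
printf '%s\n' 'abcd.xyz' > file2

# While reading file2 (NR==FNR), remember each basename; then for
# each file1 line, split on '/' and print it if its basename matches.
awk 'NR==FNR { want[$0]; next }
     { n = split($0, parts, "/"); if (parts[n] in want) print }' \
    file2 file1 > file3
cat file3
```

This is more typing than the grep one-liner, but it is plain POSIX
awk and avoids the literal-versus-regex question entirely.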
