Hi,
I have enclosed a basic C programming problem. I need fully working
code in C (using only the facilities of ANSI C). Please reply as soon
as possible. Thanks so much.
PROBLEM:
FILE COMPARATOR
A UNIX file can be logically viewed as a set of ASCII character
strings separated by one or more newline '\n' characters. In this
assignment you are required to build a very simple file comparator
utility which finds out the differences that occur between two input
files, line by line.
You will be given two files as input, say file1 and file2. These two
files are called equivalent if and only if:
Each line in file1 is present in file2 and each line in file2 is
present in file1. Note that the position of the lines is immaterial:
the ith line in file1 may be the jth line in file2 and vice versa.
Your program must output the results described below for the given
files.
For example, the following two files are equivalent according to this
definition, and your program must be able to identify that:
FILE1                           FILE2
I'm here.                       It doesn't matter.
You go there.                   Its not simple to be modest.
Its not simple to be modest.    I'm here.
It doesn't matter.              You go there.
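The definition above can be sketched for small inputs as follows. This is only an illustration, assuming both files already fit in memory as arrays of strings (which the restriction on file size rules out for the real program); the names all_present and files_equivalent are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if every string in a[0..na-1] also occurs in b[0..nb-1]. */
static int all_present(const char *a[], size_t na,
                       const char *b[], size_t nb)
{
    size_t i, j;

    assert(a != NULL && b != NULL);
    for (i = 0; i < na; i++) {
        int found = 0;
        for (j = 0; j < nb && !found; j++) {
            if (strcmp(a[i], b[j]) == 0)
                found = 1;
        }
        if (!found)
            return 0;   /* a[i] has no partner in b */
    }
    return 1;
}

/* Equivalence as defined in the problem: each line of file1 is present
   in file2 and each line of file2 is present in file1. */
int files_equivalent(const char *f1[], size_t n1,
                     const char *f2[], size_t n2)
{
    return all_present(f1, n1, f2, n2) && all_present(f2, n2, f1, n1);
}
```

This quadratic comparison is only meant to pin down the definition; it does not scale to millions of lines.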
DESIRED OUTPUT:
Your program must take the two files as input (command line
arguments) and, based on them, output the following results:
1. Whether the given files are equivalent or not.
2. If the files are found to be equivalent as per our definition,
you must only produce the mapping of matching lines. It means you
must output something like
ith line of file1 is same as jth line of file2.
//i & j may take all allowable values,
i.e. they may take the values in the range [0, N-1], where N is the
total number of lines in a file.
Notice that the newline character has nothing to do with the line
number. In the above example file1 and file2 both have 4 lines. We
ignore the '\n' since it is a character, not a character string
(line). So it means that the string "It doesn't matter." is the 4th
line in file1.
3. If the files are not equivalent, which lines are in file1 but
not in file2, and which lines are in file2 but not in file1. You must
print these unpaired lines along with the line number and the name of
the file they come from. A small extract from the output will be
something like:
//ith line in file1 is not present in file2
//kth line from file2 is not present in file1
4. Percentage of unpaired lines from both the files. Unpaired
lines are those which don't have the same line in the other file.
It means if file1 has 4 lines, 3 of which have a corresponding pair
in file2, then this percentage for file1 is 25%.
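The percentage in item 4 can be computed as in the sketch below. The function name unpaired_percentage is made up for illustration, and returning 0 for an empty file is an assumption, since the problem does not cover that case.

```c
#include <assert.h>

/* Percentage of lines in one file that have no matching line in the
   other file. Returns 0.0 for an empty file (an assumption; the
   problem statement does not say what to do in that case). */
double unpaired_percentage(unsigned long unpaired, unsigned long total)
{
    assert(unpaired <= total);
    if (total == 0)
        return 0.0;
    return 100.0 * (double)unpaired / (double)total;
}
```

For the case in the text, 1 unpaired line out of 4 gives unpaired_percentage(1, 4), which is 25.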
IMPORTANT RESTRICTION:
You cannot make any assumption about the size of the input files.
Each file may be as large as, or even larger than, the total main
memory present in the system. So your approach, algorithm and
implementation must take care of the fact that files may contain N
lines, where N may go well beyond millions.
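One possible way to respect this restriction (an assumption, not something the problem prescribes) is to never hold a whole line in memory: read each file one character at a time and reduce every line to a small hash value, which can then be written out with its line number and sorted on disk. The sketch below shows only the streaming step, using the 32-bit FNV-1a hash; the function name hash_next_line is made up for illustration. Runs of newlines are skipped, since the problem treats them as a single separator.

```c
#include <assert.h>
#include <stdio.h>

/* Reads the next non-empty line from fp one character at a time and
   stores its 32-bit FNV-1a hash in *hash. Uses constant memory no
   matter how long the line is.
   Returns 1 if a line was read, 0 if the end of the file was reached. */
int hash_next_line(FILE *fp, unsigned long *hash)
{
    unsigned long h = 2166136261UL;     /* FNV-1a offset basis */
    int c;
    int seen = 0;                       /* any character read for this line? */

    assert(fp != NULL && hash != NULL);
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n') {
            if (seen)
                break;                  /* end of a non-empty line */
            continue;                   /* skip runs of newlines */
        }
        h ^= (unsigned long)(unsigned char)c;
        h = (h * 16777619UL) & 0xFFFFFFFFUL;   /* FNV prime, kept to 32 bits */
        seen = 1;
    }
    if (!seen)
        return 0;
    *hash = h;
    return 1;
}
```

Two different lines can produce the same hash, so equal hashes only mark candidate matches; a complete program would still have to confirm them by re-reading the actual lines, for example from file offsets saved with ftell.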