The data set I am working on has a directory structure that is
several levels deep. At the deepest level, there are over 10,000
directories. Only the lowest-level directories contain files.

Each of the lowest-level directories contains one file named 'log.gz'
and several sequentially numbered files named
'ep_call00000_utt00001.wav.gz', 'ep_call00000_utt00002.wav.gz', etc. There
are some other files in these lowest-level directories that we can ignore.

I want to extract all the log.gz files and all the wav.gz files in all
these thousands of lowest-level directories and put them all in a single
directory. However, all the log files have the same name, and the wav.gz
files will also have conflicting names. So I need to rename the files
before putting them all in one directory.

In addition, I need to keep the association between the log file in each
directory and the wav.gz files that were in that same directory.

To do this, I would like to rename each log.gz file with a sequentially
assigned integer (the examples below start at 100000). I would also like
that same integer to replace the 'ep_call00000' portion of the name of
each wav.gz file in the same directory as the log.gz file, while keeping
the sequential numbering in the last part of the wav.gz name. Each
successive lowest-level directory would have its files renamed with the
next higher integer.
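
To make the naming rule concrete, here is a minimal sketch in Python
(not J). The function name new_name and the regular expression are my own
illustration, and the sketch assumes every wav file matches
'ep_call<digits>_utt<digits>.wav.gz'; anything else is skipped.

```python
import re

# Assumed pattern for the per-utterance files; the group keeps "_uttNNNNN.wav.gz".
WAV_RE = re.compile(r"^ep_call\d+(_utt\d+\.wav\.gz)$")


def new_name(old_name, n):
    """Return the flattened name for a file from directory number n.

    Returns None for files that should be ignored.
    """
    if old_name == "log.gz":
        return f"{n}log.gz"
    match = WAV_RE.match(old_name)
    if match:
        # Replace the 'ep_callNNNNN' prefix with the directory's integer.
        return f"{n}{match.group(1)}"
    return None
```

Because the integer is the same for the log file and every wav file from
one directory, the shared prefix is what preserves the association.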

The files in the first lowest-level directory are named 'log.gz',
'ep_call00000_utt00001.wav.gz', and 'ep_call00000_utt00002.wav.gz'. We
would rename these files to '100000log.gz', '100000_utt00001.wav.gz', and
'100000_utt00002.wav.gz', and then put the renamed files in the new common
directory.

The files in the second lowest-level directory are named 'log.gz',
'ep_call00000_utt00001.wav.gz', 'ep_call00000_utt00002.wav.gz', and
'ep_call00000_utt00003.wav.gz'. We would rename these files to
'100001log.gz', '100001_utt00001.wav.gz', '100001_utt00002.wav.gz', and
'100001_utt00003.wav.gz', then put the renamed files in the new common
directory.

We would continue the rename/copy process until all of the log & wav files
had been renamed and copied to the new directory.
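
For comparison, the whole rename/copy pass could look roughly like the
following Python sketch. It is only an illustration under stated
assumptions: the name flatten, its parameters, and the starting value
100000 are hypothetical, the destination directory is assumed to lie
outside the source tree, and "lowest-level" is taken to mean a directory
with no subdirectories.

```python
import os
import shutil


def flatten(src_root, dest_dir, start=100000):
    """Copy log.gz and wav.gz files from each leaf directory into dest_dir.

    Every leaf directory gets the next integer; returns how many
    directories were processed.
    """
    os.makedirs(dest_dir, exist_ok=True)
    n = start
    for dirpath, dirnames, filenames in os.walk(src_root):
        dirnames.sort()  # visit subdirectories in a repeatable order
        if dirnames:     # has subdirectories, so not a lowest-level directory
            continue
        wanted = [
            f for f in filenames
            if f == "log.gz"
            or (f.startswith("ep_call") and "_utt" in f and f.endswith(".wav.gz"))
        ]
        if not wanted:
            continue     # nothing to copy, so do not use up an integer
        for f in sorted(wanted):
            if f == "log.gz":
                target = f"{n}log.gz"
            else:
                # Keep everything from "_utt" onward, drop the ep_call prefix.
                target = f"{n}{f[f.index('_utt'):]}"
            shutil.copy2(os.path.join(dirpath, f), os.path.join(dest_dir, target))
        n += 1
    return n - start
```

The originals are copied rather than moved, so the source tree is left
untouched if something goes wrong partway through.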

Is there a simple way to do this in J, or should I be looking at a
command-line batch file for this kind of job?

Skip
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
