Dear Sergio,

I'd traverse the dirs by means of a glob and build the following two hashes:

%content
with the dirs $dir as keys and, as values, a reference to an array or hash of
the numbers contained in any file of $dir. If you cannot guarantee that each
number occurs only once, I would opt for the hash, as this allows you to
(1) count the occurrences of each number as the hash value and (2) retrieve a
list of the distinct numbers by calling the keys builtin on the hash.

%index 
with the numbers $number as keys and a reference to an array of the dirs 
containing a file that contains $number.
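
To make the two hashes a bit more concrete, here is a minimal sketch of the idea, written in Python rather than Perl, so dicts play the role of the hashes. It assumes that the dirs sit directly below one root directory, that every data file ends in .txt, and that the numbers are separated by whitespace. The names build_indexes and format_lines are made up for this illustration.

```python
import glob
import os
from collections import Counter, defaultdict


def build_indexes(root):
    """Return (content, index) for all *.txt files one level below root.

    content: dir name -> Counter mapping each number to its occurrence count
    index:   number   -> list of dir names that contain this number
    """
    content = {}
    index = defaultdict(list)
    # Assumption: layout is root/<dir>/<file>.txt; sorted() keeps the order stable.
    for path in sorted(glob.glob(os.path.join(root, "*", "*.txt"))):
        dir_name = os.path.basename(os.path.dirname(path))
        with open(path) as handle:
            numbers = handle.read().split()  # numbers are kept as strings
        counts = content.setdefault(dir_name, Counter())
        for number in numbers:
            if number not in counts:
                # First time this number is seen in this dir: record the dir once.
                index[number].append(dir_name)
            counts[number] += 1
    return content, dict(index)


def format_lines(mapping):
    """Render 'key;value1,value2,...' lines; a Counter iterates over its distinct keys."""
    return [f"{key};{','.join(values)}" for key, values in mapping.items()]
```

Calling format_lines on content should yield lines shaped like "dir1;6576576,898798789,..." and calling it on index lines shaped like "6576576;dir1,dir2", which can then be written to the two index files. Because the values of content are counters, a number that is repeated within a dir is listed only once, while its count stays available if needed.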

Hope this helps
Mathias


-- 
_____________________________________________________________________
)_______
)_______     Dr. Mathias Kratzer
)_______     Head of Library Network Services
)_______     Bavarian Library Network (BVB) / Head Office
)_______     Bavarian State Library
)_______     Ludwigstraße 16 
)_______     D-80539 München
)_______     Phone# +49 89 28638-2797  |  Fax# +49 89 28638-2605
)____________________________________________________________________


>>> On Friday, 23 September 2016 at 08:48, Sergio Letuche
<code4libus...@gmail.com> wrote:
> hello community,
> 
> Say we have the following structure in our filesystem:
> 
> dir1
> dir2
> dir3
> dir4
> 
> dir stands for directory of course.
> 
> In dir1, there is a file1.txt that has in it numbers, like below
> 
> 6576576  898798789  5645436549  76567576576  876876876876
> 
> Same goes for dir2. In dir2, there is a file2.txt, that has in it numbers,
> like below
> 
> 6576576  89879878963  56454365492  765675765763  8768768768765
> 
> And so with all the rest of the folders.
> What we need to do, is have a new file (like an index) out of all
> directories and files values, like below:
> 
> dir1;6576576,898798789,5645436549,76567576576,876876876876
> dir2;6576576,89879878963,56454365492,765675765763,8768768768765
> 
> And secondly, another index file, which will have the reverse info
> 
> 6576576;dir1,dir2
> 
> Any ideas on how would you approach this?
> 
> Best
