Hi, I have a few tar files in a single folder in HDFS. Each tar file contains multiple files.

tar1:
  - f1.txt
  - f2.txt
tar2:
  - f1.txt
  - f2.txt

(Each tar file has exactly the same number of files, with the same names.)

I am trying to find a way (Spark or Pig) to extract them into their own folders, one folder per file name:

f1:
  - tar1_f1.txt
  - tar2_f1.txt
f2:
  - tar1_f2.txt
  - tar2_f2.txt

Any help?

--
Best Regards,
Ayan Guha
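
P.S. To make the target layout concrete, here is a rough sketch of the regrouping in plain Python using the standard `tarfile` module. It only works on local files and does not touch HDFS, Spark, or Pig. The function name `regroup` and its parameters are made up for illustration, and it assumes each tar holds flat, regular files such as `f1.txt`.

```python
import os
import tarfile


def regroup(tar_paths, out_dir):
    """Extract each tar member to out_dir/<member stem>/<tar stem>_<member name>.

    Hypothetical helper for illustration only; works on local paths.
    """
    written = []
    for tar_path in tar_paths:
        # "some/dir/tar1.tar" -> "tar1"
        tar_stem = os.path.splitext(os.path.basename(tar_path))[0]
        with tarfile.open(tar_path) as tar:
            for member in tar.getmembers():
                if not member.isfile():
                    continue  # skip directories, links, and other non-regular entries
                name = os.path.basename(member.name)   # e.g. "f1.txt"
                folder = os.path.splitext(name)[0]     # e.g. "f1"
                target_dir = os.path.join(out_dir, folder)
                os.makedirs(target_dir, exist_ok=True)
                target = os.path.join(target_dir, f"{tar_stem}_{name}")
                # Copy the member's bytes to the renamed target file.
                with tar.extractfile(member) as src, open(target, "wb") as dst:
                    dst.write(src.read())
                written.append(target)
    return written
```

Calling `regroup(["tar1.tar", "tar2.tar"], "out")` on two such archives would create `out/f1/` and `out/f2/`, each holding one file per tar. What I am after is the distributed equivalent of this, reading from and writing to HDFS.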