Hi,

I don't think filenames are directly available, but I do something like this to get them (I have not tried it with Pig 0.7+ yet):
Create a new loader inheriting from PigStorage and get the "location" path of the data. Then either:

- print it, if everything happens in the same task
- append it to each record

Hope this helps,

Romain

On Thu, Oct 21, 2010 at 9:57 AM, Guy Bayes <[email protected]> wrote:

> We have a job that processes several hundred files in a directory
>
> We generally glob the directory in a single load statement
>
> Sometimes the job chokes on a bad row in a single file
>
> I could have sworn that pig printed the file name of the chunks it is
> processing in the task log but cannot see it
>
> Does anyone know under what conditions file names are printed, or how to
> find the file that is causing the issues?
>
> Thanks
> Guy
