I'm pretty sure they are supposed to be on the Input-split lines of the tasktracker logs, aren't they?
For some reason all the Input-splits are null:

Input-split file: null
Input-split start-offset: -1
Input-split length: -1

thanks
Guy

On Mon, Oct 25, 2010 at 9:02 AM, Romain Rigaux <[email protected]> wrote:

> Hi,
>
> I don't think that filenames are directly available but I do something like
> this in order to get them (I did not try with Pig 0.7+ yet):
>
> Create a new loader inheriting from PigStorage and get the "location" path
> of the data. Then either:
>
> - print it if everything happens in the same task
> - append it to each record
>
> Hope this helps,
>
> Romain
>
> On Thu, Oct 21, 2010 at 9:57 AM, Guy Bayes <[email protected]> wrote:
>
> > We have a job that processes several hundred files in a directory
> >
> > We generally glob the directory in a single load statement
> >
> > Sometimes the job chokes on a bad row in a single file
> >
> > I could have sworn that pig printed the file name of the chunks it is
> > processing in the task log but cannot see it
> >
> > Does anyone know under what conditions file names are printed, or how to
> > find the file that is causing the issues?
> >
> > Thanks
> > Guy
> >

--
you may be acquainted with the night
but i have seen the darkness in the day
and you must know it is a terrifying sight...
