It generally depends on which storage function is used. With PigStorage(), field names are not encoded in the data, so Pig has no way to discover them on its own.
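
One way to check whether Pig knows a schema for a relation is the DESCRIBE operator. The following is only a minimal sketch: it reuses the path from your message and assumes the files are in PigStorage()'s default tab-delimited format, which I have not verified for your data.

```
-- Load without an AS clause, so no field names are declared.
Data = LOAD '/user/xx/20130523/*' USING PigStorage();

-- Print whatever schema Pig has for the relation.
DESCRIBE Data;

-- Without field names, fields can still be referenced by position;
-- $0 is the first field of each tuple.
x = FOREACH Data GENERATE $0;
```

For a relation loaded this way, DESCRIBE typically reports that the schema is unknown, which matches the "Invalid field projection" error: the name cookie_id is never defined anywhere.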

Assuming that the storage is PigStorage() and that cookie_id is the first field in the data, your LOAD statement should look as follows:

    Data = LOAD '/user/xx/20130523/*' using PigStorage() as (cookie_id: chararray, ...);
    x = FOREACH Data GENERATE cookie_id;

So you not only have to specify which storage function to use, you may also have to describe the schema when you load the data.

On Tue, Jul 16, 2013 at 2:04 PM, Mix Nin <[email protected]> wrote:
> Hi,
>
> I am trying query a data set on HDFS using PIG.
>
> Data = LOAD '/user/xx/20130523/*;
> x = FOREACH Data GENERATE cookie_id;
>
> I get below error.
>
> <line 2, column 26> Invalid field projection. Projected field [cookie_id]
> does not exist
>
> How do i find the column names in the bag "Data" . The developer who
> created the file says, it is coookie_id.
> Is there any way I could get schema/header for this?
>
>
> Thanks
>
