Hi all,

When PigStorage parses a CSV file, did I get the data flow right?

HDFS_Block -> TextInputFormat -> (Key: offset, Value: line) -> PigStorage -> Tuple -> Mapper

If so, what are the input/output (key, value) pairs of the mapper?

Also, how do formats like RC/ORC (which promise to read less input) work? Is it something like this?

HDFS_Block -> ORCInputFormat (concerned columns only) -> (Key, Value) -> ORCParser ? -> Tuple -> Mapper

Best regards,
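
P.S. To make the first pipeline concrete, here is a toy Python sketch of how I picture it. This is not real Hadoop or Pig code: `text_input_format`, `pig_storage` and `load_tuples` are hypothetical helper names made up for illustration, the sample data is invented, and the field split is naive (no quoting or escaping). It only mimics the (offset, line) -> Tuple steps, and says nothing about what the mapper actually receives.

```python
import io


def text_input_format(data: bytes):
    """Yield (byte_offset, line) pairs, mimicking the (Key: offset, Value: line) step."""
    offset = 0
    for raw in io.BytesIO(data):
        # The key is the byte offset where the line starts; the value is the line
        # without its trailing newline.
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw)


def pig_storage(line: str, delimiter: str = ","):
    """Split one line into a tuple of fields (naive split, an assumption of this sketch)."""
    return tuple(line.split(delimiter))


def load_tuples(data: bytes, delimiter: str = ","):
    """Chain both steps: the offset key is dropped and only the tuple is kept."""
    return [pig_storage(line, delimiter) for _, line in text_input_format(data)]


# Invented sample standing in for the contents of one block.
block = b"1,alice,30\n2,bob,25\n"
records = list(text_input_format(block))
tuples = load_tuples(block)
print(records)
print(tuples)
```

If this picture is right, the byte offset never reaches the tuple, which is part of why I am unsure what the mapper's input key would be.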
