Hi all, I'm new to Pig (and a bit rusty with Java!) and still just playing around with it, nothing serious yet. I might be misunderstanding something important here.
I'm trying to write a custom loader for a custom XML file format, i.e. deserialize the XML into Pig data types. However, all the documentation and example code I can find is based on taking a RecordReader and returning tuples from getNext(). Is there any way to write a custom loader that works on InputStreams or similar java.io-style classes? I'd like to use the commonly available XML parsers, which work on these. Since it's XML, line-by-line parsing doesn't really work. I will have just one input file to parse. Is there some reason why the loader API doesn't expose InputStreams?

I have also asked this question on StackOverflow: http://stackoverflow.com/questions/8843790/custom-apache-pig-loadfunc-where-can-i-get-the-inputstream-on-the-file

-- Rory
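
To make the question concrete, here is a minimal sketch of the kind of InputStream-based parsing meant above, using only the JDK's StAX parser. The class name XmlRecordSketch, the parse method, and the flat record/field layout are hypothetical and not part of Pig. The sketch does not touch the Pig API at all; the assumption is that the stream could be obtained inside a custom, non-splittable InputFormat/RecordReader (for example via Hadoop's FileSystem.open), and that each returned map would then be turned into one tuple in getNext().

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: reads a whole XML document from an InputStream and
// returns one map per record element. Assumes a flat layout in which every
// child of a record element holds plain text.
class XmlRecordSketch {

    static List<Map<String, String>> parse(InputStream in, String recordTag) {
        List<Map<String, String>> records = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(in);
            Map<String, String> current = null;   // record being built, if any
            String field = null;                  // field element being read, if any
            StringBuilder text = new StringBuilder();

            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals(recordTag)) {
                        current = new LinkedHashMap<>();
                    } else if (current != null) {
                        field = name;
                        text.setLength(0);
                    }
                } else if ((event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) && field != null) {
                    // Text may arrive in several chunks, so accumulate it.
                    text.append(reader.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = reader.getLocalName();
                    if (field != null && name.equals(field)) {
                        current.put(field, text.toString().trim());
                        field = null;
                    } else if (current != null && name.equals(recordTag)) {
                        records.add(current);
                        current = null;
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Malformed XML is reported as an unchecked exception in this sketch.
            throw new IllegalArgumentException("Could not parse XML input", e);
        }
        return records;
    }
}
```

With an input such as `<rows><row><id>1</id><name>a</name></row></rows>` and a record tag of `row` (both made up for illustration), parse would return a single map with the keys `id` and `name`. Because the whole stream is read in one pass, this only suits the single-file, non-split case described above.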
