Why not pipe the multi-line XML from the executable through another script that understands it and writes each fragment back out on a single line?
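
A minimal sketch of such a script, in Python. It assumes every fragment ends with a known closing tag that is the last thing on its line; `</record>` is a made-up name, so substitute whatever root element your executable actually emits. The `flatten` helper and the script itself are illustrative, not part of Pig.

```python
#!/usr/bin/env python
import sys

# Hypothetical closing tag of each fragment's root element (an assumption;
# replace it with the tag the real executable produces).
END_TAG = "</record>"


def flatten(lines, end_tag=END_TAG):
    """Yield each multi-line XML fragment joined into a single line."""
    buf = []
    for line in lines:
        stripped = line.strip()
        if stripped:
            buf.append(stripped)
        # Assumes the closing tag is the last thing on its line.
        if end_tag in line:
            yield " ".join(buf)
            buf = []
    # Emit whatever is left if the input ends mid-fragment.
    if buf:
        yield " ".join(buf)


if __name__ == "__main__":
    for record in flatten(sys.stdin):
        sys.stdout.write(record + "\n")
        sys.stdout.flush()
```

With something like this sitting between the executable and Pig (for example, inside a small wrapper script that you stream through instead of the executable), each record should arrive as one line, so the line-at-a-time StreamToPig limitation stops mattering. If the XML can contain tabs, you may also want to replace them, since as far as I know the default streaming deserializer splits fields on tabs.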

On Wed, Mar 28, 2012 at 8:24 AM, Ahmed Sobhi <[email protected]> wrote:
> I'm streaming data in a pig script through an executable that returns an
> xml fragment for each line of input I stream to it. That xml fragment
> happens to span multiple lines and I have no control whatsoever over the
> output of the executable I stream to
>
> In relation to Use Hadoop Pig to load data from text file w/ each record on
> multiple lines?<
> http://stackoverflow.com/questions/6726407/use-hadoop-pig-to-load-data-from-text-file-w-each-record-on-multiple-lines
> >,
> the answer was suggesting writing a custom record reader. The problem is,
> this works fine if you want to implement a LoadFunc that reads from a file,
> but to be able to use streaming, it has to implement StreamToPig.
> StreamToPig allows you to only read one line at a time as far as I
> understood
>
> Does anyone know how to handle such a situation?
>
>
> http://stackoverflow.com/questions/9910138/is-it-possible-to-use-pig-streaming-streamtopig-in-a-way-that-handles-multiple
>
> --
> Best Regards,
> Ahmed Sobhi
> http://about.me/humanzz/bio
>
