On 29 January 2011 13:43, Jacob Perkins <[email protected]> wrote:

> Write a map only wukong script that parses the json as you want it. See
> the example here:
>
> http://thedatachef.blogspot.com/2011/01/processing-json-records-with-hadoop-and.html

Hi Jacob,
Thanks very much for helping me out. I hadn't heard of Wukong before. I am a bit concerned, though, about adding Ruby to my tool stack on top of Pig; it seems like a step too far. Presumably I would have to distribute Ruby and Wukong to all my job nodes, just as I would if I were writing Perl or C++ streaming programs.

With streaming, the script is launched once per file, right, not once per record?

Alex
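
P.S. To make the last question concrete, here is a minimal sketch of the kind of streaming script I have in mind, written in Python purely for illustration. It assumes the script is started once and is then fed every record on stdin, one JSON object per line, which is exactly the assumption I would like confirmed. The function name json_to_tsv and the field names "id" and "text" are made up for the example and do not come from my real data.

```python
import json
import sys


def json_to_tsv(lines, fields=("id", "text")):
    """Turn JSON-object lines into tab-separated rows.

    `fields` is a hypothetical list of keys to extract; missing keys
    become empty strings. Blank or unparseable lines are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except ValueError:
            # Malformed JSON: skip the record rather than abort the job.
            continue
        if not isinstance(record, dict):
            continue
        yield "\t".join(str(record.get(f, "")) for f in fields)


if __name__ == "__main__":
    # One process, one loop over stdin: every record handed to this
    # script is handled here, with no interpreter start-up per record.
    for row in json_to_tsv(sys.stdin):
        print(row)
```

If the script really were launched once per record, the loop would only ever see a single line and the interpreter start-up cost would be paid for every record, which is the case I am worried about.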
