Hi all, Along the lines of s4PigWrapper, I was thinking about writing an S4 application master to host S4 piper inside Hadoop YARN. This could be useful not only for reading data stored in Hadoop (to build or train a model), but also for using the resource manager to deploy S4 instances on remote machines and monitor them. In short, we could make use of most of the resource management, scheduling, and other good stuff in YARN.
If this seems worth experimenting with, I can raise a JIRA to track it. Any thoughts? - ./Zahoor@iPad
