On 31 October 2011 20:22, Ted Dunning <[email protected]> wrote:
> On Mon, Oct 31, 2011 at 12:00 PM, Dan Brickley <[email protected]> wrote:
>
>> On 31 October 2011 17:27, Ted Dunning <[email protected]> wrote:
>> > I think this would be very interesting to see. Whether it should be part
>> > of Mahout or a separate project is an open question.
>> >
>> > PIG, is, unfortunately not a real language in the sense of turing
>> > completion or extensibility. It is good at what it does, but not at
>> > being extended to do more.
>>
>> ...although you can call out to functions defined in Java, Python etc.
>> This doesn't make the top level language into a programming language,
>> though. Was that your point, Ted?
>>
> Yes. That was the point. Calling out is different from being able to
> control the process from the outside in.

I've just found http://wiki.apache.org/pig/TuringCompletePig which has
copious notes on ways to address this. Excerpting a little:

"""Pig Latin is a data flow language. As such it does not offer users
control flow and modularity features that are present in general
purpose programming languages, including functions, modules, loops,
and branches. Given that it is a data flow language adding these
constructs is neither straightforward nor reasonable. However, users
do want to be able to integrate standard programming techniques of
separation and code sharing offered by functions and modules as well
as integration of control flow offered by functions, loops, and
branches. This document proposes a way to accomplish these goals while
preserving Pig Latin's data flow orientation."""

Spoiler alert (the wiki page has a lot more detail): the plan seems to
be a combination of macros (which are now in the language) and
embedding; the "second part of the proposal is to embed Pig Latin
scripts in the host scripting language via a JDBC like compile, bind,
run model." I'm not sure how far along that part is...

Dan

ps. the following 3 links have everything I attempted before with
Pig/Mahout integration; not a lot, but it left me intrigued and
frustrated in equal measure.

http://www.mail-archive.com/[email protected]/msg02848.html
https://gist.github.com/1192831
http://search-lucene.com/m/IOfRIc6wGq1&subj=+Unknown+program+chosen+Valid+program+names+are+truncated+list+from+Hadoop+program+driver
