That's great, I'll review it now!

Russell Jurney http://datasyndrome.com

On Apr 6, 2012, at 7:03 AM, Shasha Liu <grassons...@gmail.com> wrote:

> Hi Russell,
> 
> Based on the email discussions, I wrote my proposal for the Pig visualizer 
> project and submitted it to google-melange. Please take a look at it at your 
> convenience; any further feedback or comments would be much appreciated.
> 
> Thank you very much.
> Best,
> 
> On Sun, Mar 25, 2012 at 9:25 PM, Russell Jurney <russell.jur...@gmail.com> 
> wrote:
> I suggest you create a simple, minimal web application that visualizes a pig 
> script file each time a url with the script filename is loaded.  
> 
> For instance, the process to use the tool might go like this:
> 
> 1) Run pigvisualizer.(pl/py/rb) locally, at the start of your pig work session
> 2) Create a new pig script at /my/dir/filename.pig
> 3) Open http://localhost:4567/pigviz/my/dir/filename.pig in a web browser
> 4) See a javascript-based visualization of your pig script
> 5) Reload this web page each time you want to see a new visualization OR have 
> the page try to reload the file periodically
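
A minimal sketch of steps 1 to 3, using only the Python standard library rather than any of the frameworks named later in this message. The function and class names (`script_path_from_url`, `PigVizHandler`, `serve`) are made up for illustration. The handler only echoes the script text, and the real visualization would replace that part. It reads whatever local path the URL names, so it is meant for localhost use only.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse

PREFIX = "/pigviz"  # URL prefix from step 3


def script_path_from_url(url):
    """Map '/pigviz/my/dir/filename.pig' to '/my/dir/filename.pig'.

    Returns None when the URL does not name a .pig file under PREFIX.
    """
    path = unquote(urlparse(url).path)
    if not path.startswith(PREFIX + "/") or not path.endswith(".pig"):
        return None
    candidate = PurePosixPath(path[len(PREFIX):])
    if ".." in candidate.parts:  # refuse parent-directory tricks
        return None
    return str(candidate)


class PigVizHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: re-reads the script on every page load."""

    def do_GET(self):
        script = script_path_from_url(self.path)
        if script is None:
            self.send_error(404, "expected /pigviz/<path>.pig")
            return
        try:
            with open(script, encoding="utf-8") as f:
                source = f.read()
        except OSError:
            self.send_error(404, "cannot read script")
            return
        # Placeholder output: a real tool would render the plan graph here.
        body = ("<pre>" + html.escape(source) + "</pre>").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=4567):
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("localhost", port), PigVizHandler).serve_forever()
```

Calling `serve()` and opening http://localhost:4567/pigviz/my/dir/filename.pig would then show the current contents of /my/dir/filename.pig on each reload.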
> 
> There are several sources of data:
> 
> 1) Start a pig session, via grunt, PigServer or HCatalog, and use 
> ILLUSTRATE/EXPLAIN.  An old example of doing this is available at 
> https://github.com/rjurney/Cloud-Stenography
> 2) Use the explain or -dot commands from the pig command line. In looking at the 
> dot output, the graph is not as helpful as I had thought :(
> 3) Use the PigPen code to get ILLUSTRATE data for visualization
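
As a rough illustration of option 2, the sketch below turns simple DOT edge statements into a nodes/edges structure that a JavaScript graph library could load. It handles only plain `a -> b` lines. The DOT that Pig actually writes may use subgraphs, attributes and generated node names that this regex does not cover, so treat `dot_to_graph` and the JSON shape as assumptions, not as a parser for Pig's real output.

```python
import json
import re

# Matches simple edge statements such as:  a -> b;   or   "load A" -> "filter B";
EDGE = re.compile(r'^\s*("[^"]+"|\w+)\s*->\s*("[^"]+"|\w+)')


def dot_to_graph(dot_text):
    """Collect nodes and edges from the edge lines of a DOT document."""
    nodes, edges = [], []
    for line in dot_text.splitlines():
        match = EDGE.match(line)
        if not match:
            continue  # skip headers, braces, attribute lines
        src, dst = (name.strip('"') for name in match.groups())
        for name in (src, dst):
            if name not in nodes:
                nodes.append(name)
        edges.append({"source": src, "target": dst})
    return {"nodes": [{"id": n} for n in nodes], "edges": edges}


# Invented sample input, not real Pig output.
SAMPLE_DOT = """digraph plan {
  load_a -> filter_b;
  filter_b -> "store c" [style=dashed];
}"""

graph = dot_to_graph(SAMPLE_DOT)
graph_json = json.dumps(graph, indent=2)
```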
> 
> The ideal situation is that you get the data plan via EXPLAIN, and sample 
> data via ILLUSTRATE, and combine them to produce an even better version of 
> figure 2 in the paper 
> http://infolab.stanford.edu/~olston/publications/sigmod09.pdf 
> 
> <image.png>
> 
> As to the presentation of the data in an interface, I suggest you AVOID 
> eclipse and the UI code to PigPen, as there is little utility in having this 
> visualization there.  Not all Pig users use Eclipse, and there is little 
> utility in editing scripts in the diagrams.  There is great utility in 
> visualizing, understanding and debugging this way, but not so much in editing.
> 
> On the other hand, anyone can edit Pig in their favorite tool and view their 
> pig graph in a simple web application on their localhost by directing a web 
> browser at it.  This is why a simple, small web application seems best. You 
> can use ruby/sinatra or python/bottle/flask or perl/catalyst to make a simple 
> web app.  Check out sigma.js for graph visualization: 
> http://sigmajs.org/examples.html or http://neyric.github.com/wireit/ for 
> something more fully featured.
> 
> Perhaps the best plan is to fix ILLUSTRATE (see 
> http://wiki.apache.org/pig/ExampleGenerator and talk to the guys at 
> mortardata.com who have a patch for this), and edit the PigPen code to remove 
> the Eclipse dependencies and have it output simple JSON for a web application 
> to consume.  It could write to a file, or you could create a simple web 
> service that publishes JSON for the current pig session.
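
One possible shape for that JSON, sketched under assumptions: each node is a Pig alias carrying its ILLUSTRATE sample rows, and the edges come from the plan. The field names (`nodes`, `edges`, `sample`) and the helpers `build_payload` and `write_payload` are hypothetical. Nothing here reflects what PigPen or ILLUSTRATE actually emits.

```python
import json


def build_payload(edges, samples):
    """Combine plan edges with per-alias sample rows.

    edges:   list of (source_alias, target_alias) pairs from the plan
    samples: dict mapping alias -> list of example rows
    """
    aliases = []
    for src, dst in edges:
        for alias in (src, dst):
            if alias not in aliases:
                aliases.append(alias)
    return {
        "nodes": [{"id": a, "sample": samples.get(a, [])} for a in aliases],
        "edges": [{"source": s, "target": t} for s, t in edges],
    }


def write_payload(payload, path):
    """Write the payload to a file for the web application to read."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, indent=2)


# Invented example data for two aliases, A feeding B.
payload = build_payload(
    edges=[("A", "B")],
    samples={"A": [["alice", 3], ["bob", 7]], "B": [["bob", 7]]},
)
```

Writing to a file keeps the web application decoupled from the Pig session. The same dictionary could instead be returned by a small web service.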
> 
> Once we have JSON of ILLUSTRATE... getting a web visualization is easy.  I 
> can help, I've done it before in Cloud Stenography by parsing data in Grunt.  
> Which you could do, btw.  Old Perl code is available on github (see above 
> link).
> 
> Interested in thoughts of others.
> 
> On Fri, Mar 23, 2012 at 11:21 PM, Shasha Liu <grassons...@gmail.com> wrote:
> Hi Daniel,
> 
> Thanks a lot for the reply.
> I installed the latest Pig and read through the book "Programming Pig".
> I managed to use "-dot -out filename" to produce three graphs in dot file 
> format.
> 
> Based on the existing dot file, my next question is: what are the requirements 
> for a better visualizer?
> Are we going to generate a picture (e.g., .png) for different plans (logical 
> plan, physical plan, map reduce plan), or provide a web interface to 
> visualize these graphs of plans?
> 
> Best regards,
> -- 
> Shasha(Amy) Liu 
> 
> 
> On Sun, Mar 18, 2012 at 3:30 AM, Daniel Dai <da...@hortonworks.com> wrote:
> See comments inline.
> 
> On Sat, Mar 17, 2012 at 6:52 AM, grassonsand <grassons...@gmail.com> wrote:
> > Dear all,
> >
> > I am a Ph.D. student in Computer Science and have 4-year Java programming
> > experience focusing on Java Web development.
> > In the candidate projects in PIG, I am interested in PIG-2586 (A better
> > plan/data flow visualizer) and PIG-2599 (Mavenize Pig).
> >
> > In my on-going research project, I am in charge of (1). web user interface
> > development and (2). build system. Now I am working on adding hadoop
> > capability to the project. The main reason I am interested in the PIG
> > project is that I can contribute to the PIG community based on my
> > previous experience, learn from the other participants in GSoC this year,
> > and benefit my on-going research project at the same time.
> >
> > (1). User interface development
> > I have used several graphic libraries to visualize semantic data and our own
> > data set, e.g., Jung, graphviz, BIRT, and several plot plugins in jquery.
> > Therefore, I am interested in working on a new tool for PIG visualizer.
> > After reading through the issue, I have several questions:
> >    (i) As both swing and javascript are mentioned, is this project a web or
> > standalone application?
> >    (ii) As ruby-graphviz is included, is ruby required for this project?
> 
> I envision two visualizer components in Pig. One is a lightweight
> visualizer invoked by Grunt, which should be fast and concise, and
> integrated into the explain command. The other is a standalone composer
> similar to PigPen, which should be much more powerful. PIG-2586 is intended
> to track the first, but Russell's comment is talking about the second.
> Both are acceptable as a GSoC project. I leave it to Russell.
> 
> >
> > (2). Build system
> > The code base of my research project is 40K loc and the build script was
> > written in Ant. Part of my duty is to convert the ant build script to maven
> > and maintain the build script. Therefore, Mavenize Pig is of interest to me
> > too. The build.xml in the PIG project is more complicated than the one I worked
> > on before. It includes ant, maven and ivy. Do we need to use maven to do all
> > the tasks and get rid of all the dependencies on ant and ivy?
> 
> Yes
> 
> >
> >  Best regards
> >  Shasha(Amy) Liu
> 
> 
> 
> 
> 
> -- 
> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com
> 
> 
> 
> -- 
> Shasha(Amy) Liu
