> On Sep 10, 2015, at 10:35 AM, Preston Carman <[email protected]> wrote:
>
> So this post may be a little rambling, but I hope it starts the discussion.
>
> Apache VXQuery by default returns the result to the CLI and prints it to
> the screen. In practice, I usually pipe the output to a file for review. Do
> you think we should add an option to save the result to a file (local or
> hdfs)? I think this will become an issue/speed concern as we start running
> VXQuery in a Yarn Cluster [1]. Currently the CLI must be running for the
> whole query to receive the result. It would be nice to decouple these
> processes. Although this creates two issues: how do you know when the query
> is complete and how will we save the result.
>
> Things to discuss:
>
> alpha: Should we write the result to a file (local or hdfs)? Currently the
> result is read and returned to the user through the CLI. The CLI could save
> the result to a file instead. (sounds easy)
Indeed, we could pass a command-line parameter that specifies where to put
the result.

> bravo: Can writing the result to a file be pushed into the Hyracks job? The
> goal would be to allow the CLI to create and send the job while a separate
> process reads the result once finished. The client would be able to disconnect
> from the server while the job was running and connect back later to get the
> result (no more need for the CLI to be in a screen session).

I think that we have 2 different points here:
1) having the Hyracks job write the result (e.g. by putting a new operator
   on top of the plan that creates a tmp file on one of the NCs, writes the
   result to the file, and returns a reference to the file)
2) disconnecting from the CC before the job is done (in that case we’d need
   a way to communicate the state of the job)

> charlie: What is the workflow we would like to see for running a query on a
> Yarn VXQuery cluster? See diagram [1].

A question on the diagram: Do we need to run the CLI on the YARN cluster?
Generally it seems to me that we could just do option 1 for bravo above and
write to HDFS instead of writing to a local file.

> [1]
> https://docs.google.com/drawings/d/13_kP4Yt1ze_pgqQcbVLmlBOxE6aX0Pmjg3FT2q4XX2k/edit?usp=sharing
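
To make the "alpha" option a bit more concrete, here is a rough sketch of what the CLI side could look like. Everything in it is an assumption for illustration: the flag name `-result-file`, the `ResultSink` class, and its methods are hypothetical and do not exist in VXQuery today, and the result is shown as a plain string rather than whatever the CLI really reads back from Hyracks. It only covers a local file; writing to HDFS would need the Hadoop file system API instead.

```java
import java.io.IOException;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ResultSink {
    // Hypothetical flag name; not an existing VXQuery CLI option.
    static final String RESULT_FILE_FLAG = "-result-file";

    /** Returns the value following the flag, or null if the flag is absent. */
    static String resultFileArg(String[] args) {
        for (int i = 0; i + 1 < args.length; i++) {
            if (RESULT_FILE_FLAG.equals(args[i])) {
                return args[i + 1];
            }
        }
        return null;
    }

    /** Writes the result to the given local file, or to the fallback stream if no file was requested. */
    static void writeResult(String result, String resultFile, PrintStream fallback) throws IOException {
        if (resultFile == null) {
            // Current behaviour: print the result to the screen.
            fallback.print(result);
            return;
        }
        // Creates the file, or overwrites it if it already exists.
        Files.write(Paths.get(resultFile), result.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the result the CLI would receive from the job.
        String result = "<result>42</result>\n";
        writeResult(result, resultFileArg(args), System.out);
    }
}
```

Running it without the flag keeps today's print-to-screen behaviour, so the option would be purely additive. Note that this still requires the CLI to stay connected for the whole query, so it does not address the "bravo" points.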
