Hi,

First, I want to say that MRQL really appeals to me; I can see its benefits over Hive, Impala, etc.
Recently I have been working on an open source project called Zeppelin (http://zeppelin-project.org). Over the last few days I tried to write an MRQL driver for Zeppelin. The result is at https://github.com/nflabs/zeppelin-driver-mrql. It's still experimental, but it's working well. I attached a screenshot so you can see how it works.

[image: Inline image 1]

While implementing the MRQL driver for Zeppelin, I found a few tasks that could improve MRQL.

*1. Source code structure*

a. Currently all of the source code lives in the 'src' directory under the project root. I think it's more common to keep source code under each submodule, e.g.

/core/src/main/java
/gen/src/main/java
/spark/src/main/java
....

b. It's also quite common to put source code under a directory named after its package. E.g. if the file is mrql.java and its package is org.apache.mrql, then place it at /core/src/main/java/org/apache/mrql/mrql.java

*2. Unit testing*

MRQL does not provide any unit tests. I think that will slow down development and make things hard to change and verify. Since MRQL itself looks pretty unit-test friendly (for example, it supports a 'memory' mode to evaluate queries), it should not be difficult to add some unit tests.

*3. Static everywhere*

I haven't understood the source code deeply yet, but I can see a lot of static variables. They're everywhere, and they make things difficult:

a) The source code is difficult to understand.
b) It cannot run in parallel (not thread-safe).

Currently MRQL is run from the command line, and in that case thread safety is not a big problem. But for people who want to embed MRQL, it will be trouble.

It would be great to get some feedback on the tasks I listed. After the discussion, I hope I can spend some time improving MRQL.

Thanks.

--------
Best,
moon
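
P.S. To make point 3 a bit more concrete, here is a minimal sketch of the difference I mean. The class names (StaticEvaluator, InstanceEvaluator) and the "mode" field are made up for illustration; they are not MRQL's actual classes, and the real code may look quite different.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical classes for illustration only; not MRQL's actual API.
public class StaticStateSketch {

    // Style 1: configuration in a static field, shared by every caller in the JVM.
    static class StaticEvaluator {
        static String mode = "memory";

        static String evaluate(String query) {
            return mode + ":" + query;
        }
    }

    // Style 2: configuration held per instance, so each embedder owns its own state.
    static class InstanceEvaluator {
        private final String mode;

        InstanceEvaluator(String mode) {
            this.mode = mode;
        }

        String evaluate(String query) {
            return mode + ":" + query;
        }
    }

    public static void main(String[] args) throws Exception {
        // Static state: a second session changing the mode also changes it
        // for the first session, because there is only one copy of the field.
        StaticEvaluator.mode = "memory"; // session A wants memory mode
        StaticEvaluator.mode = "spark";  // session B overwrites it
        System.out.println(StaticEvaluator.evaluate("q1")); // prints "spark:q1"

        // Instance state: two sessions run in parallel without interfering.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Callable<String> sessionA = () -> new InstanceEvaluator("memory").evaluate("q1");
        Callable<String> sessionB = () -> new InstanceEvaluator("spark").evaluate("q2");
        Future<String> a = pool.submit(sessionA);
        Future<String> b = pool.submit(sessionB);
        System.out.println(a.get()); // prints "memory:q1"
        System.out.println(b.get()); // prints "spark:q2"
        pool.shutdown();
    }
}
```

With the second style, something like Zeppelin could create one evaluator per session instead of sharing a single JVM-wide configuration.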
