Hey all - I am trying to demonstrate a neat use case. Using audit logs in MapR, I'd like to be able to point Drill at the directory and just go: no loading of data, just query.
The problem I am having is how to describe the path. First, here is how the logs are stored, from the base of MapRFS:

    /var/mapr/local/node1/audit/*.json
    /var/mapr/local/node2/audit/*.json
    /var/mapr/local/node3/audit/*.json
    /var/mapr/local/node4/audit/*.json
    /var/mapr/local/node5/audit/*.json

So as you can see, the logs can be in multiple directories. I'd like to be able to query all of them at once without moving the files. (Not sure if this is possible.)

Anyway, here is what I've tried:

    use dfs.`default`;
    select * from `var/mapr/local/*/audit/*.json` limit 10;

and

    select * from `var/mapr/local/*/audit/*` limit 10;

This gave me an odd "Range must not be empty, but was [0‥0)" message. (I do have empty JSON files... is this an issue?)

Then I tried

    select * from `var/mapr/local` where dir1 = 'audit' limit 10;

and that gave me a validation error: "Relative path in absolute URI: clustermetrics.2015-10.07.01:00:00". This is interesting, as that file is in /var/mapr/local/nodeX/metrics/ and, based on the dir1 clause, shouldn't even be checked. (I wonder if this is related to https://issues.apache.org/jira/browse/DRILL-3759?)

Note that I've tried all the queries with and without a leading / (I can't tell whether it is needed or not).

Any other thoughts on how I can query these files?

MapR folks, this would be an OUTSTANDING use case for showing off auditing and Drill. I see there is a blog post at https://www.mapr.com/blog/changing-game-when-it-comes-auditing-big-data-part-1 which teases the power of Drill + auditing, but I see no part 2, and I am chomping at the bit to show this off as part of the PoC :)
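
P.S. Since I suspect the empty JSON files, here is a minimal sketch of how they could be listed so they can be ruled in or out. It assumes the cluster filesystem is reachable as a local path (for example over an NFS mount); the mount point `/mapr/my.cluster.com` and the helper name `find_empty_json` are hypothetical and only for illustration. It does not touch Drill at all; it only walks the same `*/audit/*.json` layout described above.

```python
import glob
import os


def find_empty_json(base_dir):
    """Return sorted paths of zero-byte *.json files under <base_dir>/*/audit/."""
    # Mirrors the layout from the post: <base>/<node>/audit/<file>.json
    pattern = os.path.join(base_dir, "*", "audit", "*.json")
    return sorted(p for p in glob.glob(pattern) if os.path.getsize(p) == 0)


if __name__ == "__main__":
    # Hypothetical mount point (an assumption); adjust to your environment.
    for path in find_empty_json("/mapr/my.cluster.com/var/mapr/local"):
        print(path)
```

If the wildcard queries succeed once the listed files are excluded, that would at least point to the empty files as the trigger for the "Range must not be empty" message.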
