Hi guys, I know there was recently a patch for Mongo slowness related to a bug in the reader; however, querying is still fairly slow compared to Mongo's aggregation framework itself (in our tests, 5-10 times slower).
My guess is that this is because we serialize BSON to JSON and then parse the JSON into Drill's vectors. I haven't confirmed this hunch, but the extra conversion seems almost certain to cost some performance. Ideally, I think the BSON should be parsed directly into Drill's vectors rather than going through the JSON reader. Do you guys think this is a valid theory? Can you think of anything else that might be slowing Mongo queries down (apart from the obvious network communication/transfer, etc.)? And could you suggest a way to validate which part is slow?
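
As a rough way to get a feel for the cost of the extra hop, here is a minimal sketch that times "document to JSON text to parsed values to columns" against "document to columns directly". It is only an analogy: plain Python dicts and the standard json module stand in for BSON documents and Drill's JSON reader, the column lists stand in for Drill's vectors, and every name in it (make_docs, via_json_text, direct, timed) is made up for illustration. It says nothing about the real numbers inside Drill; it just shows the shape of a per-stage measurement.

```python
import json
import time


def make_docs(n):
    # Hypothetical sample documents; real Mongo documents would be BSON.
    return [
        {"_id": i, "name": f"user{i}", "score": i * 0.5, "active": i % 2 == 0}
        for i in range(n)
    ]


def via_json_text(docs):
    # Stand-in for the suspected path: serialize each document to JSON text,
    # parse that text again, then append the values to per-field columns.
    columns = {}
    for doc in docs:
        text = json.dumps(doc)
        parsed = json.loads(text)
        for key, value in parsed.items():
            columns.setdefault(key, []).append(value)
    return columns


def direct(docs):
    # Stand-in for the proposed path: walk each document once and append
    # the values straight to per-field columns, with no text in between.
    columns = {}
    for doc in docs:
        for key, value in doc.items():
            columns.setdefault(key, []).append(value)
    return columns


def timed(fn, docs):
    # Return the function's result together with the elapsed wall time.
    start = time.perf_counter()
    result = fn(docs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    docs = make_docs(100_000)
    json_columns, json_seconds = timed(via_json_text, docs)
    direct_columns, direct_seconds = timed(direct, docs)
    # Both paths should build identical columns; only the cost differs.
    assert json_columns == direct_columns
    print(f"via JSON text: {json_seconds:.3f}s")
    print(f"direct:        {direct_seconds:.3f}s")
```

The same idea applied to Drill itself would mean timing the network fetch, the BSON to JSON step, and the JSON to vector step separately, so that the slow stage shows up on its own rather than being guessed at.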
