We haven't quite hit it with that much data in Mongo. We're compressing Mongo data to Parquet in smaller chunks so we never query Mongo with that much.
We've found it a bit too slow querying Mongo with Drill, mostly I think from the fact that it goes from BSON -> JSON and then the JSON is parsed, rather than parsing the BSON directly. We'd certainly like to do more investigation on that though and see where the bottleneck is.

On Tue, Jun 9, 2015 at 10:30 AM, Jacques Nadeau <[email protected]> wrote:

> I've been working on an updated version of the 3.0 patch from the original
> plugin guys. I'll try to get it uploaded/merged soon.
>
> I'm still seeing connection issues on larger workloads so I'm waiting to
> post until I work through that. Adam, have you had any problems when doing
> very large scale queries? (I'm running against a cluster of Mongo nodes
> holding > 5tb of data using a large number of nodes and threads per node).
>
> thx,
> Jacques
>
> On Mon, Jun 8, 2015 at 5:06 PM, Adam Gilmore <[email protected]> wrote:
>
> > Just my input here guys. We experienced the exact same issue due to the
> > fact that Drill is still using the 2.x Mongo Java driver. Mongo 3.0's
> > server does not play nicely with this driver (you cannot see any
> > collections).
> >
> > If it does turn out that you're using Mongo 3.0, then you need to be using
> > the latest driver.
> >
> > I've already made the patch in my fork which I didn't submit a patch for
> > because the 3.x driver was still in beta at the time, but I believe now
> > it's out of beta so we should be upgrading to the latest driver.
> >
> > Let me know and I can post the patch and hopefully that resolves the issue.
> >
> > On Tue, Jun 9, 2015 at 8:59 AM, Rangaswamy, Manoharan <[email protected]> wrote:
> >
> > > Hi Jacques,
> > >
> > > We did create a role similar to below in our non-prod instance and it
> > > worked connecting to drill. I will deploy this to prod in a couple of days
> > > and let you know if I run across any issue.
> > >
> > > Thank you for all your help,
> > > Mano
> > > -----Original Message-----
> > > From: Jacques Nadeau [mailto:[email protected]]
> > > Sent: Friday, June 05, 2015 4:30 PM
> > > To: [email protected]
> > > Subject: Re: MapR Drill - mongodb collections does not show up
> > >
> > > I think the problem here is you're accessing the system without the
> > > listDatabases privilege. There isn't a role that only confers that
> > > permission so I suggest you create a new role that has only that
> > > privilege. From there, you can add that role to the appropriate users.
> > > For example:
> > >
> > > use admin
> > > db.createRole(
> > >   {
> > >     role: "schemaAccessor",
> > >     privileges: [
> > >       { resource: { cluster: true }, actions: [ "listDatabases" ] }
> > >     ],
> > >     roles: []
> > >   }
> > > )
> > >
> > > use mydbname
> > > db.grantRolesToUser( "myuid", [ { role: "schemaAccessor", db: "admin" } ])
> > >
> > > You may also need to confer the listCollections privilege for each
> > > database that you want to list collections for.
> > >
> > > Let me know if this helps resolve your issue.
> > >
> > > Thanks,
> > > Jacques
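
On the listCollections point at the end of the quoted message: a rough sketch of what a combined role document might look like, assuming you want one role that carries both listDatabases and a per-database listCollections privilege. The helper name buildSchemaAccessorRole is hypothetical (it is not part of MongoDB or Drill), and mydbname / myuid are the same placeholders used in the quoted example. This only builds the document as plain JavaScript; it has not been tested against a live cluster.

```javascript
// Hypothetical helper (not part of MongoDB or Drill): builds the role document
// that would be passed to db.createRole() in the mongo shell.
function buildSchemaAccessorRole(dbNames) {
  const privileges = [
    // Cluster-wide privilege, as in the quoted example.
    { resource: { cluster: true }, actions: ["listDatabases"] },
  ];
  for (const name of dbNames) {
    // An empty collection name targets all non-system collections in that database.
    privileges.push({
      resource: { db: name, collection: "" },
      actions: ["listCollections"],
    });
  }
  return { role: "schemaAccessor", privileges: privileges, roles: [] };
}

// "mydbname" is a placeholder database name.
const roleDoc = buildSchemaAccessorRole(["mydbname"]);
console.log(JSON.stringify(roleDoc, null, 2));

// In the mongo shell, after `use admin`, the document could then be applied with:
//   db.createRole(roleDoc)
//   db.getSiblingDB("mydbname").grantRolesToUser(
//     "myuid", [ { role: "schemaAccessor", db: "admin" } ])
```

Pass one entry per database that Drill should be able to list collections for. Whether this is enough for a given Drill/Mongo version is something to verify on a non-prod instance first.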
