Hi Adam,
We found an issue in MongoRecordReader (DRILL-1971
<https://issues.apache.org/jira/browse/DRILL-1971>) that causes the slowness.
The patch for it has been updated.

On Wed, Jan 7, 2015 at 2:15 PM, Adam Gilmore <[email protected]> wrote:

> Yes - sorry - I added the group by to make it do something a bit more than
> just a count - a count returns very quickly.
>
> It does look like it's trying to stream all events into Drill, but either
> way, I only have 1M documents - it should be a bit faster than minutes,
> right?  Whereas I can export all documents using mongoexport, for example,
> in a matter of seconds.
>
> On Wed, Jan 7, 2015 at 10:20 AM, Andries Engelbrecht <
> [email protected]> wrote:
>
> > The query seems a bit strange as the count(*) will only return the number
> > of records, yet there is a group by clause which may confuse the optimizer.
> >
> > Look at the query plan and see if the Mongo storage plugin is potentially
> > sending all the records to Drill.
> >
> > Mongo will likely do a simple record count that will return quickly.
> >
> > Try the query without the group by clause in Drill; it should return very
> > quickly. It will also be interesting to compare query plans in Drill.
> >
> > —Andries
> >
> >
> > On Jan 6, 2015, at 4:08 PM, Adam Gilmore <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > I'm trying to test out Mongo with Drill but seem to be running into very
> > > slow performance.
> > >
> > > I have about 1M documents loaded into Mongo, and I'm doing something as
> > > simple as:
> > >
> > > select count(*) from mongo.`connect`.events group by collection;
> > >
> > > where "collection" is a string field in the document.
> > >
> > > This takes minutes to complete, which to me seems very strange.
> > >
> > > Any ideas why this would be that slow?  I can run an identical query
> > > directly on Mongo and it returns in sub-second time.
> >
> >
>



-- 
Kamesh.
