Sorry Merlijn! Forgot about the CTAS to Parquet bit :) At least you guys got an almost instant response from those who know.

On 17 May 2016 14:13, "Tom Barber" <[email protected]> wrote:
> Yeah Druid is on my todo as well. Samuel introduced me to his Druid contact
> about charming it up and then he went quiet. Would be good to get into the
> platform so Saiku can leverage it.
>
> --------------
>
> Director Meteorite.bi - Saiku Analytics Founder
> Tel: +44(0)5603641316
>
> (Thanks to the Saiku community we reached our Kickstart
> <http://kickstarter.com/projects/2117053714/saiku-reporting-interactive-report-designer/>
> goal, but you can always help by sponsoring the project
> <http://www.meteorite.bi/products/saiku/sponsorship>)
>
> On 17 May 2016 at 14:11, Konstantinos Tsakalozos <
> [email protected]> wrote:
>
>> Hi Merlijn,
>>
>> Knowing that you are into data streaming with Storm, have you looked at
>> Druid (http://druid.io/druid.html)? It might be a good fit for your use
>> cases.
>>
>> Cheers,
>> Konstantinos
>>
>> On Tue, May 17, 2016 at 2:45 PM, Merlijn Sebrechts <
>> [email protected]> wrote:
>>
>>> Thanks Tom! We'll contact them.
>>>
>>> Kind regards
>>> Merlijn Sebrechts
>>>
>>> 2016-05-17 11:44 GMT+02:00 Tom Barber <[email protected]>:
>>>
>>>> Hey Merlijn
>>>>
>>>> I've not scaled up to 200GB but we did do a 20-30GB HDFS test with
>>>> adequate performance and load being spread over Drillbits. The guys on the
>>>> Drill mailing list are pretty good at resolving performance issues though,
>>>> so you should certainly chat to them, and with backing from the new Drill
>>>> startup, MapR tech, Dell and a bunch of other firms, there is a decent
>>>> amount of development resource on the platform for getting stuff fixed.
>>>>
>>>> That said, I'm sure there are other solutions that run faster, Impala
>>>> etc. Also, I come from an OLAP background, which is why I hooked up with
>>>> the Kylin guys, as that would give you an alternative entry point.
>>>>
>>>> Another reason for Drill is the data federation and non-Hadoop support.
>>>> For example, I could spin up HDFS, Mongo, and MySQL and have Drill hook up
>>>> to all 3 of them at the same time and do:
>>>>
>>>> select * from HDFS.mytable a, MONGODB.mytable b, MySQL.mytable c where
>>>> a.c1 = b.c1 and b.c2 = c.c1
>>>>
>>>> and have it return a nice federated result, which is pretty powerful.
>>>>
>>>> Of course with all this tech YMMV, but personally I've had decent
>>>> results with it.
>>>>
>>>> Tom
>>>>
>>>> On 17 May 2016 at 10:37, Merlijn Sebrechts <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Tom
>>>>>
>>>>> Slightly off-topic but have you ever worked with Drill? We did some
>>>>> tests with a 200GB and a 100MB dataset in an HDFS cluster and the
>>>>> performance we're seeing is so bad Drill is unusable for us.
>>>>>
>>>>> Some initial debugging revealed that Drill isn't able to distribute
>>>>> the workload over the cluster. The entire query runs on one server... Have
>>>>> you been able to get better performance out of it?
>>>>>
>>>>> Kind regards
>>>>> Merlijn
>>>>>
>>>>> On Tuesday 17 May 2016, Tom Barber <[email protected]> wrote:
>>>>> > Okay so I've been asking around as you all know and we're
>>>>> > considering this Apache-specific Juju Charms page, so I figured it would
>>>>> > be useful to round up which communities I have spoken to who have shown
>>>>> > definite interest in collaboration.
>>>>> > We have:
>>>>> > Apache Bigtop (we all know about)
>>>>> > Apache Zeppelin (we all know about)
>>>>> > Apache Karaf
>>>>> > Apache Nutch
>>>>> > Apache OODT
>>>>> > Apache Joshua (Incubating)
>>>>> > Apache Kylin
>>>>> > I'm sure there will be more, and probably some I've just forgotten
>>>>> > about or other people spoke to, but I think that's a pretty good start.
>>>>> > As Kevin and I also discussed, Drill is also a pretty important one
>>>>> > from a personal perspective as it offers the best (IMHO) route to getting
>>>>> > SQL over a bunch of your NoSQL charms with minimal effort, which then
>>>>> > helps Saiku and any other BI tooling you guys get into the platform. It's
>>>>> > great having all the big data stuff, but we need ways for end users to
>>>>> > get this stuff back out!
>>>>> >
>>>>> > Tom
>>>>>
>>>
>>> --
>>> Juju mailing list
>>> [email protected]
>>> Modify settings or unsubscribe at:
>>> https://lists.ubuntu.com/mailman/listinfo/juju
>>
>> --
>> Konstantinos Tsakalozos
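
For anyone who wants to try the federated join Tom describes, here is a rough sketch of how it could be submitted to Drill over its REST interface. Treat it as an illustration under assumptions, not a tested setup: the storage plugin names (HDFS, MONGODB, MySQL) and the table and column names are just the placeholders from Tom's message, the localhost address is hypothetical, and the two join conditions are combined with `and`. The request is only built here, not sent.

```python
import json
import urllib.request

# Hypothetical Drillbit address; 8047 is Drill's default web/REST port.
DRILL_URL = "http://localhost:8047/query.json"


def federated_join_sql(hdfs_table, mongo_table, mysql_table):
    """Build the three-way join from the thread; table names are placeholders."""
    return (
        f"select * from {hdfs_table} a, {mongo_table} b, {mysql_table} c "
        "where a.c1 = b.c1 and b.c2 = c.c1"
    )


def drill_request(sql, url=DRILL_URL):
    """Wrap a SQL string in the JSON body used by Drill's REST query endpoint."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


sql = federated_join_sql("HDFS.mytable", "MONGODB.mytable", "MySQL.mytable")
req = drill_request(sql)
print(sql)

# To run it against a live Drillbit (assumes the three storage plugins
# are already configured and enabled under those names):
# with urllib.request.urlopen(req) as resp:
#     rows = json.load(resp)["rows"]
```

Keeping the query builder separate from the request makes it easy to check the SQL before pointing it at a real cluster.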
