Sorry Merlijn! Forgot about the CTAS to parquet bit :)

At least you guys got an almost instant response from those who know.
On 17 May 2016 14:13, "Tom Barber" <[email protected]> wrote:

> Yeah, Druid is on my todo list as well. Samuel introduced me to his Druid
> contact about charming it up and then he went quiet. Would be good to get it
> into the platform so Saiku can leverage it.
>
> --------------
>
> Director Meteorite.bi - Saiku Analytics Founder
> Tel: +44(0)5603641316
>
> (Thanks to the Saiku community we reached our Kickstart
> <http://kickstarter.com/projects/2117053714/saiku-reporting-interactive-report-designer/>
> goal, but you can always help by sponsoring the project
> <http://www.meteorite.bi/products/saiku/sponsorship>)
>
> On 17 May 2016 at 14:11, Konstantinos Tsakalozos <
> [email protected]> wrote:
>
>> Hi Merlijn,
>>
>> Knowing that you are into data streaming with Storm, have you looked at
>> Druid (http://druid.io/druid.html)? It might be a good fit for your use
>> cases.
>>
>> Cheers,
>> Konstantinos
>>
>> On Tue, May 17, 2016 at 2:45 PM, Merlijn Sebrechts <
>> [email protected]> wrote:
>>
>>> Thanks Tom! We'll contact them.
>>>
>>>
>>>
>>> Kind regards
>>> Merlijn Sebrechts
>>>
>>> 2016-05-17 11:44 GMT+02:00 Tom Barber <[email protected]>:
>>>
>>>> Hey Merlijn
>>>>
>>>> I've not scaled up to 200GB, but we did do a 20-30GB HDFS test with
>>>> adequate performance and the load being spread over the Drillbits. The
>>>> guys on the Drill mailing list are pretty good at resolving performance
>>>> issues though, so you should certainly chat to them, and with backing from
>>>> the new Drill startup, MapR tech, Dell and a bunch of other firms, there is
>>>> a decent amount of development resource on the platform for getting stuff
>>>> fixed.
>>>>
>>>> That said, I'm sure there are other solutions that run faster (Impala
>>>> etc.). I also come from an OLAP background, which is why I hooked up with
>>>> the Kylin guys, as that would give you an alternative entry point.
>>>>
>>>> Another reason for Drill is the data federation and non-Hadoop support.
>>>> For example, I could spin up HDFS, Mongo, and MySQL and have Drill hook up
>>>> to all 3 of them at the same time and do:
>>>>
>>>> select * from HDFS.mytable a, MONGODB.mytable b, MySQL.mytable c
>>>> where a.c1 = b.c1 and b.c2 = c.c1
>>>>
>>>> and have it return a nice federated query, which is pretty powerful.
>>>>
>>>> Of course with all this tech YMMV, but personally I've had decent
>>>> results with it.
>>>>
>>>> Tom
>>>>
>>>> --------------
>>>>
>>>> Director Meteorite.bi - Saiku Analytics Founder
>>>> Tel: +44(0)5603641316
>>>>
>>>> (Thanks to the Saiku community we reached our Kickstart
>>>> <http://kickstarter.com/projects/2117053714/saiku-reporting-interactive-report-designer/>
>>>> goal, but you can always help by sponsoring the project
>>>> <http://www.meteorite.bi/products/saiku/sponsorship>)
>>>>
>>>> On 17 May 2016 at 10:37, Merlijn Sebrechts <[email protected]
>>>> > wrote:
>>>>
>>>>> Hi Tom
>>>>>
>>>>>
>>>>> Slightly off-topic, but have you ever worked with Drill? We did some
>>>>> tests with a 200GB and a 100MB dataset in an HDFS cluster and the
>>>>> performance we're seeing is so bad that Drill is unusable for us.
>>>>>
>>>>> Some initial debugging revealed that Drill isn't able to distribute
>>>>> the workload over the cluster. The entire query runs on one server. Have
>>>>> you been able to get better performance out of it?
>>>>>
>>>>>
>>>>>
>>>>> Kind regards
>>>>> Merlijn
>>>>>
>>>>>
>>>>> On Tuesday 17 May 2016, Tom Barber <[email protected]> wrote:
>>>>> > Okay, so I've been asking around as you all know, and we're
>>>>> > considering this Apache-specific Juju Charms page, so I figured it would
>>>>> > be useful to round up which communities I have spoken to who have shown
>>>>> > definite interest in collaboration.
>>>>> > We have:
>>>>> > Apache Bigtop (we all know about)
>>>>> > Apache Zeppelin (we all know about)
>>>>> > Apache Karaf
>>>>> > Apache Nutch
>>>>> > Apache OODT
>>>>> > Apache Joshua (Incubating)
>>>>> > Apache Kylin
>>>>> > I'm sure there will be more, and probably some I've just forgotten
>>>>> > about or other people spoke to, but I think that's a pretty good start.
>>>>> > As Kevin and I also discussed, Drill is a pretty important one
>>>>> > from a personal perspective as it offers the best (IMHO) route to getting
>>>>> > SQL over a bunch of your NoSQL charms with minimal effort, which then
>>>>> > helps Saiku and any other BI tooling you guys get into the platform. It's
>>>>> > great having all the big data stuff, but we need ways for end users to
>>>>> > get this stuff back out!
>>>>> >
>>>>> > Tom
>>>>> > --------------
>>>>> > Director Meteorite.bi - Saiku Analytics Founder
>>>>> > Tel: +44(0)5603641316
>>>>> > (Thanks to the Saiku community we reached our Kickstart goal, but
>>>>> > you can always help by sponsoring the project)
>>>>>
>>>>
>>>>
>>>
>>> --
>>> Juju mailing list
>>> [email protected]
>>> Modify settings or unsubscribe at:
>>> https://lists.ubuntu.com/mailman/listinfo/juju
>>>
>>>
>>
>>
>> --
>> Konstantinos Tsakalozos
>>
>
>
