Boo, Christopher beat me to it. Leandro, I didn't mention to Merlijn that I was
using Parquet files :)
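
For reference, a rough sketch of what moving from JSON to Parquet could look like in Drill. It assumes a writable `dfs.tmp` workspace and uses a hypothetical table name, `reddit_parquet`; neither comes from your setup. It keeps only the `selftext` column, since that is the only field the query touches, and counts that field rather than `*`, along the lines of what Jim suggested.

```sql
-- Assumption: dfs.tmp is a writable workspace on this cluster.
-- Write CTAS output as Parquet for this session.
ALTER SESSION SET `store.format` = 'parquet';

-- Hypothetical table name; only the column the query needs is kept.
CREATE TABLE dfs.tmp.`reddit_parquet` AS
SELECT selftext
FROM dfs.reddit.`RS_full_corpus.json`;

-- Same filter as the original query, run against the Parquet copy
-- and counting a single field instead of *.
SELECT COUNT(selftext)
FROM dfs.tmp.`reddit_parquet`
WHERE selftext <> '' AND selftext <> '[deleted]';
```

The conversion still has to read the full JSON once, so any benefit would only show up on the queries run afterwards.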

On Tue, May 17, 2016 at 2:54 PM, Leandro Ordonez <
[email protected]> wrote:

> Thank you Jim,
>
> The attachment was this image: https://i.imgsafe.org/7e98f92.png
>
> Then, is it expected for the query I've mentioned before to take that long?
>
>
> On 05/17/2016 03:41 PM, Jim Scott wrote:
>
>> The mailing lists do not support attachments. You can provide a link to a
>> git repo or something like that though.
>>
>> You might want to alter your query to be something like select
>> count(FIELDX) from....
>>
>> On Tue, May 17, 2016 at 8:36 AM, Leandro Ordonez <
>> [email protected]> wrote:
>>
>>> Hello,
>>>
>>> I've deployed an HDFS cluster and installed Apache Drill on top of it, but
>>> I've found that Drill takes quite a long time to run some queries on large
>>> JSON files, such as the full Reddit submission corpus (260 GB). For
>>> instance, this query:
>>>
>>> SELECT COUNT(*) FROM dfs.reddit.`RS_full_corpus.json`
>>> WHERE selftext <> '' AND selftext <> '[deleted]';
>>>
>>> took about one hour to run. The other thing I've noticed is that none of my
>>> queries get processed in a "fragmented" way: query execution is always
>>> handled entirely by the drillbit acting as the foreman.
>>>
>>> In the attachment you can find the topology that I'm using. Any feedback
>>> on this would be greatly appreciated.
>>>
>>> Thank you very much for your kind attention.
>>>
>>> Best regards,
>>>
>>> --
>>> Leandro Ordonez-Ante
>>> Department of Information Technology
>>> Internet Based Communication Networks and Services (IBCN)
>>> Ghent University - iMinds
>>> Technologiepark Zwijnaarde 15, B-9052 Gent, Belgium
>>> E: [email protected], [email protected]
>>> W: www.ibcn.intec.UGent.be
>>>
>>>
>>>
>>
