How many columns do you have?
Do you understand how columnar data stores work, and how selecting only a
single column means that much less data needs to be read? If your data
consists, say, of integers, then Drill only needs to read about 160 MB to
satisfy your query, which is quite a reasonable amount to read.
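As an illustration (table and column names here are hypothetical, not from the original thread), a query that touches only one column of a Parquet-backed table lets Drill skip the other columns entirely:

```sql
-- Hypothetical example: `trades` is a wide Parquet table.
-- Only the `price` column is read from disk; the other columns are
-- never touched, so a multi-GB file may cost only ~160 MB of I/O.
SELECT price
FROM dfs.`/data/trades`;

-- By contrast, SELECT * forces Drill to read every column.
```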
We are currently running (testing) with Veritas CFS (attached to EMC SAN
storage), which is visible across 6 servers. We also have a single test MapR
node, but that's a small sandbox. The production implementation will be a
10-node HDFS cluster.
The data files are 20 GB to 40 GB in size.
I see that Drill 1.1.0 declares support for Hive 1.0, which is not yet
provided by Amazon EMR. Any chance Hive 0.13 will still work? Can you
characterize when 0.13 would or would not work?
In general I think users will want to upgrade Drill much more frequently
than they are able to upgrade Hive.
The feature was added late in the release cycle, and it wasn't tested as
thoroughly as the default option. I think it should be perfectly ok to use;
just be aware that it may lead to decreased performance when running CTAS
operations.
On the other hand, this could drastically reduce the number of
No. A very simple model like that breaks down on many levels. The most
important place where reality intrudes is that your I/O probably can't
really be threaded that widely.
What kind of storage are you using? How big is your data?
You might also want to check out the new partitioned Parquet creation that
was launched with 1.1.0: https://drill.apache.org/docs/partition-by-clause/
This would increase your read speed if your queries tend to use predicates.
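A minimal sketch of what that PARTITION BY clause looks like in a CTAS statement, per the linked docs (the table and column names below are hypothetical):

```sql
-- Hypothetical example: partition the CTAS output by year so that
-- queries with a predicate on `trade_year` can prune whole files.
-- Note: the partition column must appear in the SELECT list.
CREATE TABLE dfs.tmp.`trades_by_year`
PARTITION BY (trade_year)
AS SELECT trade_year, symbol, price
   FROM dfs.`/data/trades`;

-- A later query such as
--   SELECT * FROM dfs.tmp.`trades_by_year` WHERE trade_year = 2014;
-- should only read the files belonging to the 2014 partition.
```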
Chris Matta