I created a JIRA for discussion. This could be a huge performance win if it
were possible.
https://issues.apache.org/jira/browse/DRILL-4758
On Fri, Jul 1, 2016 at 12:33 PM, Parth Chandra wrote:
This has come up in the past in some other context. At the moment though,
there is no JIRA for this.
On Fri, Jul 1, 2016 at 6:10 AM, John Omernik wrote:
Hey all, some colleagues are looking at this on Impala (IMPALA-2017) and
asked if Drill could do this (late/lazy materialization of columns).
While the performance gain on tables with fewer columns may not be huge,
when you are looking at really wide tables with disparate data types, this
can be significant.
Not quite.
With a fix for DRILL-1950, no rows would necessarily be materialized at all
for the filter columns. Rows would only be materialized for the projection
columns when the filter matches.
In some cases, the pushdown might be implemented by fully materializing the
values referenced by the filter.
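The late-materialization idea above can be sketched in plain Python over in-memory column lists (this is an illustrative model, not Drill's actual Parquet reader; all names are made up):

```python
# Columnar "table": each column is a list; row i is the i-th entry of every column.
table = {
    "id":    [10, 23, 42, 23],
    "name":  ["a", "b", "c", "d"],
    "score": [1.0, 2.0, 3.0, 4.0],
}

def late_materialize(table, filter_col, predicate, projection):
    # 1. Decode ONLY the filter column and collect matching row positions.
    matches = [i for i, v in enumerate(table[filter_col]) if predicate(v)]
    # 2. Materialize the projected columns ONLY at those positions;
    #    non-matching rows of the projection columns are never touched.
    return [{c: table[c][i] for c in projection} for i in matches]

rows = late_materialize(table, "id", lambda v: v == 23, ["id", "name"])
# rows == [{"id": 23, "name": "b"}, {"id": 23, "name": "d"}]
```

The payoff grows with the number and width of projected columns: only step 1 scales with the full row count, while step 2 scales with the match count.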
Ok, thanks for the information!
Am I right that, in case DRILL-1950 were fixed, Drill would automatically
materialize only those rows/columns which match the filter?
If not so, would the late materialization you described for the filter case be
possible to implement with the current Ho
There was a major conflict between the patch and the metadata caching
feature that came in right at the same time (right before it). I believe
there was a discussion about this on the list. It would be great if a
developer could pick this up.
--
Jacques Nadeau
CTO and Co-Founder, Dremio
On Mon, Apr 11, 2016 at 10:36 AM, Aman Sinha wrote:
Actually, it looks like there was a patch from the community nearly a year
ago.
Hard to understand
There is a JIRA related to one aspect of this: DRILL-1950 (filter pushdown
into the Parquet scan). This is still work in progress, I believe. Once that
is implemented, the scan will produce only the filtered rows.
Regarding column projections, currently in Drill, the columns referenced
anywhere in th
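Filter pushdown into the scan, as described for DRILL-1950, can be illustrated with Parquet-style row-group min/max statistics (a simplified sketch in plain Python, not Drill's implementation; the stats layout here is hypothetical):

```python
# Each "row group" carries min/max statistics for the filter column, the way a
# Parquet file footer does. Groups whose [min, max] range cannot contain the
# predicate value are skipped without their rows ever being read.
row_groups = [
    {"min": 0,   "max": 9,   "rows": [(i, i * 10) for i in range(0, 10)]},
    {"min": 10,  "max": 99,  "rows": [(i, i * 10) for i in range(10, 100)]},
    {"min": 100, "max": 199, "rows": [(i, i * 10) for i in range(100, 200)]},
]

def scan_with_pushdown(row_groups, value):
    groups_read = 0
    out = []
    for rg in row_groups:
        if value < rg["min"] or value > rg["max"]:
            continue  # pruned by statistics: this group is never scanned
        groups_read += 1
        out.extend(r for r in rg["rows"] if r[0] == value)
    return out, groups_read

result, groups_read = scan_with_pushdown(row_groups, 23)
# result == [(23, 230)], and only 1 of the 3 row groups was read
```

With the pushdown, downstream operators receive only the filtered rows instead of the whole table.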
I just replicated these results. Full table scans with aggregation take
pretty much exactly the same amount of time with or without filtering.
On Mon, Apr 11, 2016 at 8:09 AM, Johannes Zillmann wrote:
Hey Ted,
Sorry, I mixed up row and column!
The queries are like this:
(1) "SELECT * FROM dfs.`myParquetFile` WHERE `id` = 23"
(2) "SELECT id FROM dfs.`myParquetFile` WHERE `id` = 23"
(1) takes 14 sec and (2) takes 1.5 sec.
Using drill-1.6.
So it looks like Drill is extracting the columns before filtering.
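The roughly 10x gap between (1) and (2) is consistent with the scan eagerly decoding every projected column for every row before the filter runs. A toy cost model (plain Python, illustrative numbers only):

```python
def decode_cost(n_rows, projected_cols):
    # Eager materialization: every projected column is decoded for all rows,
    # regardless of how many rows the filter will ultimately keep.
    return n_rows * projected_cols

n_rows = 10_000_000                     # 10 million rows, as in the test file
select_star = decode_cost(n_rows, 9)    # SELECT *  ... WHERE id = 23
select_id   = decode_cost(n_rows, 1)    # SELECT id ... WHERE id = 23
ratio = select_star / select_id         # 9x, close to the observed 14s vs 1.5s
```

Late materialization would shrink the `SELECT *` cost toward the `SELECT id` cost, since the eight extra columns would be decoded only for matching rows.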
Did you mean that you are doing a select to find a single column? What you
typed was row, but that seems out of line with the rest of what you wrote.
If you are truly asking about filtering down to a single row, whether it
costs more to return all of the columns rather than just one from a single
Hey there,
I'm currently doing some performance measurements on Drill.
In my case it's a single Parquet file with a single local Drillbit.
Now in one case I have unexpected results, and I'm curious if somebody has a
good explanation for it!
So I have a file with 10 million rows and 9 columns.
Now I…