John,

We can get anywhere from 1 to 100 million rows per partition, and these daily partitions help enormously when analyzing the busier sensors by constraining the queries to just a few days when that's feasible. Almost all my regularly-scheduled processes are built to work incrementally, one day at a time. Besides speeding up the queries enormously, it also results in huge cost savings by allowing us to use a smaller cluster.
But the number of sensors is increasing, so I'll be changing the partitioning strategy from individual sensors to sensor-groups to reduce the total number of partitions. I haven't been able to benchmark the differences yet; I'm just heeding the general warnings to stay below 30,000 total partitions per table. I'm looking at one of two date_id formats: either a four-digit monotonically-increasing number, or the ISO 8601 basic format of yyyymmdd, as you point out above. That's a choice between a 16-bit value that requires a join against a tiny & useful table vs a 32-bit value that could be used either without a join or with that same tiny & useful table. I'll probably benchmark the differences - glad to accept input on testing & share the results if you're interested.

Ken

On Fri, Apr 8, 2016 at 3:38 PM, John Russell <[email protected]> wrote:
>
> On Apr 8, 2016, at 12:06 PM, Ken Farmer <[email protected]> wrote:
>
> Ken, what is the extra detail you were hoping to see in the docs? More examples of SQL idioms that do/don't result in runtime filters?
>
> Yep, you got it: because my partitioning depends on multiple columns, typically used together, I really need to use a join rather than a subquery.
>
> And at this point, assuming that the dynamic runtime filters work well & consistently, I'll probably shift our standard partitioning columns away from year, month, day to date_id - a foreign key into dim_date. Unless anyone can recommend a reason not to.
>
> A couple of thoughts on that last point, about changing the partition scheme and joining on a single string column. (From the book-learning perspective: since I don't run daily reports myself, I tend not to partition so granularly.)
>
> Sounds overall like a reasonable idea, and I'm happy that the notion of runtime filtering opens this up as a possibility for you.
>
> I guess you won't consolidate any more data within each partition, since there will still be the same number of 'days' in each case.
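The two candidate date_id encodings under discussion can be sketched as follows. This is a minimal Python sketch of the arithmetic only; the function names and the 2015-01-01 epoch are hypothetical, not from the thread:

```python
from datetime import date

def to_yyyymmdd(d: date) -> int:
    # 32-bit option: ISO 8601 basic format packed into an int, e.g. 20160408.
    # Human-readable, and usable in predicates without a join to dim_date.
    return d.year * 10000 + d.month * 100 + d.day

def from_yyyymmdd(n: int) -> date:
    # Decompose the packed int back into a date.
    return date(n // 10000, (n // 100) % 100, n % 100)

def to_surrogate(d: date, epoch: date = date(2015, 1, 1)) -> int:
    # 16-bit option: monotonically increasing day number from an arbitrary
    # epoch (hypothetical here). Compact, but opaque without a join against
    # the tiny dim_date dimension table.
    return (d - epoch).days

print(to_yyyymmdd(date(2016, 4, 8)))   # 20160408
print(to_surrogate(date(2016, 4, 8)))  # 463
```

A 16-bit day number covers 65,536 days (about 179 years) from its epoch, so either format comfortably spans a table's lifetime; the trade-off is readability and join-free filtering vs. a narrower column.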
> I'm always looking for the opportunity to take partition-by-day tables and switch to partition-by-month, to try to get decent-sized files in each partition. Sounds like you have 100M rows/day, but also divided by sensor. How many rows in a typical partition?
>
> I've heard many times about the better efficiency of integer vs. string for joins, group by, etc. I wonder if it would be faster and more memory-efficient overall to encode the day ID as INT, e.g. 20160408, then decompose/reformat only at the end for reporting purposes, or keep nicely formatted string representations in the dim_date dimension table. That is an experiment I haven't tried with enough volume to see for sure.
>
> John
>
> On Fri, Apr 8, 2016 at 12:53 PM, John Russell <[email protected]> wrote:
>
>> Ken, what is the extra detail you were hoping to see in the docs? More examples of SQL idioms that do/don't result in runtime filters? (Like in your example, I could imagine being curious about INNER JOIN as opposed to other kinds of joins, an ON clause with several ANDed predicates, and BETWEEN in the WHERE clause.)
>>
>> We're planning to add more such examples in upcoming doc refreshes. Just want to make sure we cover the kinds of areas you're curious about. Once 5.7 is installed somewhere to try out, it's straightforward to verify for particular combinations of tables and queries using EXPLAIN and looking for the extra lines representing filters.
>>
>> John
>>
>> On Apr 8, 2016, at 11:39 AM, Mostafa Mokhtar <[email protected]> wrote:
>>
>> Hi Ken,
>>
>> Yes, the query provided should work. I validated that the query below, which is fairly similar to the one listed above, works as expected.
>>
>> select
>>     count(*)
>> from
>>     store_sales_mp,
>>     date_dim
>> where
>>     ss_sold_date_sk = d_date_sk
>>     and ss_store_sk = d_dow
>>     and d_year = 1999
>>     and d_moy = 1
>>
>> store_sales_mp is partitioned on two columns:
>>
>> | # col_name      | data_type |
>> | ss_sold_date_sk | bigint    |
>> | ss_store_sk     | bigint    |
>>
>> The query execution summary shows that only a small number of rows from store_sales_mp survive the scan:
>>
>> +-----------------+--------+----------+----------+--------+------------+----------+---------------+--------------------------------------------+
>> | Operator        | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak Mem | Est. Peak Mem | Detail                                     |
>> +-----------------+--------+----------+----------+--------+------------+----------+---------------+--------------------------------------------+
>> | 06:AGGREGATE    | 1      | 160.66ms | 160.66ms | 1      | 1          | 88.00 KB | -1 B          | FINALIZE                                   |
>> | 05:EXCHANGE     | 1      | 137.98us | 137.98us | 19     | 1          | 0 B      | -1 B          | UNPARTITIONED                              |
>> | 03:AGGREGATE    | 19     | 151.01ms | 183.65ms | 19     | 1          | 8.04 MB  | 10.00 MB      |                                            |
>> | 02:HASH JOIN    | 19     | 101.27ms | 185.51ms | 4.82M  | 43.20B     | 2.02 MB  | 3.89 KB       | INNER JOIN, BROADCAST                      |
>> | |--04:EXCHANGE  | 19     | 14.54us  | 25.36us  | 589    | 181        | 0 B      | 0 B           | BROADCAST                                  |
>> | |  01:SCAN HDFS | 1      | 20.39ms  | 20.39ms  | 31     | 181        | 1.82 MB  | 64.00 MB      | tpcds_15000_decimal_parquet.date_dim       |
>> | 00:SCAN HDFS    | 19     | 37.54ms  | 81.18ms  | 45.00M | 43.20B     | 9.84 MB  | 0 B           | tpcds_15000_decimal_parquet.store_sales_mp |
>> +-----------------+--------+----------+----------+--------+------------+----------+---------------+--------------------------------------------+
>>
>> Finally, from the query profile:
>>
>> HDFS_SCAN_NODE (id=0): (Total: 37.539ms, non-child: 37.539ms, % non-child: 100.00%)
>>   - AverageHdfsReadThreadConcurrency: 0.00
>>   - AverageScannerThreadConcurrency: 0.93
>>   - BytesRead: 489.47 KB (501221)
>>   - BytesReadDataNodeCache: 0
>>   - BytesReadLocal: 468.42 KB (479663)
>>   - BytesReadRemoteUnexpected: 21.05 KB (21557)
>>   - BytesReadShortCircuit: 468.42 KB (479663)
>>   - DecompressionTime: 0.000ns
>>   - MaxCompressedTextFileLength: 0
>>   - NumColumns: 0 (0)
>>   - NumDisksAccessed: 4 (4)
>>   - NumRowGroups: 0 (0)
>>   - NumScannerThreadsStarted: 5 (5)
>>   - PeakMemoryUsage: 9.35 MB (9799572)
>>   - PerReadThreadRawHdfsThroughput: 901.98 MB/sec
>>   - RemoteScanRanges: 0 (0)
>>   - RowsRead: 2.37M (2368521)
>>   - RowsReturned: 2.37M (2368521)
>>   - RowsReturnedRate: 62.23 M/sec
>>   - ScanRangesComplete: 3.21K (3207)
>>   - ScannerThreadsInvoluntaryContextSwitches: 26 (26)
>>   - ScannerThreadsTotalWallClockTime: 420.999ms
>>     - MaterializeTupleTime(*): 471.000ns
>>     - ScannerThreadsSysTime: 82.142ms
>>     - ScannerThreadsUserTime: 65.777ms
>>   - ScannerThreadsVoluntaryContextSwitches: 5.41K (5413)
>>   - TotalRawHdfsReadTime(*): 1.940ms
>>   - TotalReadThroughput: 136.84 KB/sec
>>   Filter 0:
>>     - Files processed: 3.18K (3184)
>>     - Files rejected: 3.13K (3130)
>>     - Files total: 3.18K (3184)
>>     - RowGroups processed: 0 (0)
>>     - RowGroups rejected: 0 (0)
>>     - RowGroups total: 0 (0)
>>     - Rows processed: 0 (0)
>>     - Rows rejected: 0 (0)
>>     - Rows total: 0 (0)
>>     - Splits processed: 4 (4)
>>     - Splits rejected: 0 (0)
>>     - Splits total: 4 (4)
>>   Filter 1:
>>     - Files processed: 54 (54)
>>     - Files rejected: 49 (49)
>>     - Files total: 54 (54)
>>     - RowGroups processed: 0 (0)
>>     - RowGroups rejected: 0 (0)
>>     - RowGroups total: 0 (0)
>>     - Rows processed: 0 (0)
>>     - Rows rejected: 0 (0)
>>     - Rows total: 0 (0)
>>     - Splits processed: 4 (4)
>>     - Splits rejected: 0 (0)
>>     - Splits total: 4 (4)
>>
>> Thanks,
>> Mostafa
>>
>> On Fri, Apr 8, 2016 at 10:04 AM, Ken Farmer <[email protected]> wrote:
>>
>>> First off, congrats on this new release. It looks like a lot of hard work went into it.
>>>
>>> I'm really excited about dynamic runtime filters, but I have a question about whether they would apply to the following scenario, which seems oddly missing from the documentation.
>>>
>>> Assume 2 tables:
>>> 1. fact_ids table with 50 billion rows, partitioned by year, month, day, sensor_id
>>> 2. dim_date dimension table with 500 rows, 1 row/day, with date_id, date_name, year, month, and day columns
>>>
>>> Query that works fine today:
>>>
>>> SELECT f.col1,
>>>        SUM(somecount)
>>> FROM   fact_ids f
>>> WHERE  f.year=2016 AND f.month=4 AND f.day=2
>>> GROUP BY 1 ORDER BY 1
>>>
>>> Query that I hope works with 5.7:
>>>
>>> SELECT f.col1,
>>>        SUM(somecount)
>>> FROM   fact_ids f
>>>        INNER JOIN dim_date d
>>>          ON  f.year  = d.year
>>>          AND f.month = d.month
>>>          AND f.day   = d.day
>>> WHERE  d.date_name BETWEEN '2016-03-02' AND '2016-04-01'
>>> GROUP BY 1 ORDER BY 1
>>>
>>> Can anyone confirm that 5.7 will perform dynamic runtime filtering on this second example above?
>>>
>>> Anything else I should be aware of for cases like this? Or limitations of dynamic runtime filtering?
>>>
>>> Thanks!
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups "Impala User" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
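The pruning behavior discussed above can be sketched in miniature. This is only the idea behind a partition-level runtime filter - scan the small dimension side first, collect the surviving join-key values, and skip fact partitions that cannot match - not Impala's actual implementation; all table and variable names are illustrative:

```python
from datetime import date, timedelta

# Hypothetical dim_date rows: (year, month, day, date_name), two months' worth.
dim_date = []
d = date(2016, 3, 1)
while d <= date(2016, 4, 30):
    dim_date.append((d.year, d.month, d.day, d.isoformat()))
    d += timedelta(days=1)

# Hypothetical fact table partitions keyed by (year, month, day).
fact_partitions = {(y, m, dd): f"rows for {name}" for (y, m, dd, name) in dim_date}

# Step 1: evaluate the WHERE clause against the small dimension side first.
qualifying = [r for r in dim_date if "2016-03-02" <= r[3] <= "2016-04-01"]

# Step 2: the runtime filter is the set of join-key values that survived.
runtime_filter = {(y, m, dd) for (y, m, dd, _) in qualifying}

# Step 3: prune fact partitions against the filter before scanning them.
scanned = {k: v for k, v in fact_partitions.items() if k in runtime_filter}

print(len(scanned))  # 31 of the 61 partitions are scanned
```

In the real system the filter is built at runtime from the dimension scan and shipped to the fact-table scan nodes, which is why it can prune on join columns (year, month, day here) that never appear as literals in the query text.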
