John,

We can get anywhere from 1 to 100 million rows per partition, and these daily partitions help enormously when analyzing the busier sensors by constraining the queries to just a few days when that's feasible. Almost all my regularly-scheduled processes are built to work incrementally, one day at a time. Besides speeding up the queries enormously, it also results in huge cost savings by allowing us to use a smaller cluster.
But the number of sensors is increasing, so I'll be changing the partitioning strategy from individual sensors to sensor-groups to reduce the total number of partitions. I haven't been able to benchmark the differences yet; I'm just heeding the general warnings to stay below 30,000 total partitions per table. I'm looking at one of two date_id formats: either a four-digit monotonically-increasing number, or the ISO 8601 basic format of yyyymmdd, as you point out above. That's a choice between a 16-bit value that requires a join against a tiny & useful table vs a 32-bit value that could be used either without a join or with that same tiny & useful table. I'll probably benchmark the differences - glad to accept input on testing & share the results if you're interested.

Ken

On Fri, Apr 8, 2016 at 3:38 PM, John Russell <[email protected]> wrote:
>
> On Apr 8, 2016, at 12:06 PM, Ken Farmer <[email protected]> wrote:
>
> Ken, what is the extra detail you were hoping to see in the docs? More examples of SQL idioms that do/don't result in runtime filters?
>
> Yep, you got it: because my partitioning depends on multiple columns, typically used together, I really need to use a join rather than a subquery.
>
> And at this point, assuming that the dynamic runtime filters work well & consistently, I'll probably shift our standard partitioning columns away from year, month, day to date_id - a foreign key into dim_date. Unless anyone can recommend a reason not to.
>
> A couple of thoughts on that last point, about changing the partition scheme and joining on a single string column. (From the book-learning perspective: since I don't run daily reports myself, I tend not to partition so granularly.)
>
> Sounds overall like a reasonable idea, and I'm happy that the notion of runtime filtering opens this up as a possibility for you.
>
> I guess you won't consolidate any more data within each partition, since there will still be the same number of 'days' in each case.
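The two candidate date_id encodings under discussion can be sketched as follows. This is a minimal Python sketch of the arithmetic only; the function names and the 2015-01-01 epoch are hypothetical, not from the thread:

```python
from datetime import date

def to_yyyymmdd(d: date) -> int:
    # 32-bit option: ISO 8601 basic format packed into an int, e.g. 20160408.
    # Human-readable, and usable in predicates without a join to dim_date.
    return d.year * 10000 + d.month * 100 + d.day

def from_yyyymmdd(n: int) -> date:
    # Decompose the packed int back into a date.
    return date(n // 10000, (n // 100) % 100, n % 100)

def to_surrogate(d: date, epoch: date = date(2015, 1, 1)) -> int:
    # 16-bit option: monotonically increasing day number from an arbitrary
    # epoch (hypothetical here). Compact, but opaque without a join against
    # the tiny dim_date dimension table.
    return (d - epoch).days

print(to_yyyymmdd(date(2016, 4, 8)))   # 20160408
print(to_surrogate(date(2016, 4, 8)))  # 463
```

A 16-bit day number covers 65,536 days (about 179 years) from its epoch, so either format comfortably spans a table's lifetime; the trade-off is readability and join-free filtering vs. a narrower column.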
> I'm always looking for the opportunity to take partition-by-day tables and switch to partition-by-month, to try to get decent-sized files in each partition. Sounds like you have 100M rows/day, but also divided by sensor. How many rows in a typical partition?
>
> I've heard many times about the better efficiency of integer vs. string for joins, group by, etc. I wonder if it would be faster and more memory-efficient overall to encode the day ID as INT, e.g. 20160408, then decompose/reformat only at the end for reporting purposes, or keep nicely formatted string representations in the dim_date dimension table. That is an experiment I haven't tried with enough volume to see for sure.
>
> John
>
> On Fri, Apr 8, 2016 at 12:53 PM, John Russell <[email protected]> wrote:
>
>> Ken, what is the extra detail you were hoping to see in the docs? More examples of SQL idioms that do/don't result in runtime filters? (Like in your example, I could imagine being curious about INNER JOIN as opposed to other kinds of joins, an ON clause with several ANDed predicates, and BETWEEN in the WHERE clause.)
>>
>> We're planning to add more such examples in upcoming doc refreshes. Just want to make sure we cover the kinds of areas you're curious about. Once 5.7 is installed somewhere to try out, it's straightforward to verify for particular combinations of tables and queries using EXPLAIN and looking for the extra lines representing filters.
>>
>> John
>>
>> On Apr 8, 2016, at 11:39 AM, Mostafa Mokhtar <[email protected]> wrote:
>>
>> Hi Ken,
>>
>> Yes, the query provided should work. I validated that the query below, which is fairly similar to the one listed above, works as expected.
>>
>> select
>>     count(*)
>> from
>>     store_sales_mp,
>>     date_dim
>> where
>>     ss_sold_date_sk = d_date_sk
>>     and ss_store_sk = d_dow
>>     and d_year = 1999
>>     and d_moy = 1
>>
>> store_sales_mp is partitioned on two columns:
>>
>> | # col_name      | data_type |
>> | ss_sold_date_sk | bigint    |
>> | ss_store_sk     | bigint    |
>>
>> The query execution summary shows that only a small number of rows from store_sales_mp survive the scan:
>>
>> +-----------------+--------+----------+----------+--------+------------+----------+---------------+--------------------------------------------+
>> | Operator        | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak Mem | Est. Peak Mem | Detail                                     |
>> +-----------------+--------+----------+----------+--------+------------+----------+---------------+--------------------------------------------+
>> | 06:AGGREGATE    | 1      | 160.66ms | 160.66ms | 1      | 1          | 88.00 KB | -1 B          | FINALIZE                                   |
>> | 05:EXCHANGE     | 1      | 137.98us | 137.98us | 19     | 1          | 0 B      | -1 B          | UNPARTITIONED                              |
>> | 03:AGGREGATE    | 19     | 151.01ms | 183.65ms | 19     | 1          | 8.04 MB  | 10.00 MB      |                                            |
>> | 02:HASH JOIN    | 19     | 101.27ms | 185.51ms | 4.82M  | 43.20B     | 2.02 MB  | 3.89 KB       | INNER JOIN, BROADCAST                      |
>> | |--04:EXCHANGE  | 19     | 14.54us  | 25.36us  | 589    | 181        | 0 B      | 0 B           | BROADCAST                                  |
>> | |  01:SCAN HDFS | 1      | 20.39ms  | 20.39ms  | 31     | 181        | 1.82 MB  | 64.00 MB      | tpcds_15000_decimal_parquet.date_dim       |
>> | 00:SCAN HDFS    | 19     | 37.54ms  | 81.18ms  | 45.00M | 43.20B     | 9.84 MB  | 0 B           | tpcds_15000_decimal_parquet.store_sales_mp |
>> +-----------------+--------+----------+----------+--------+------------+----------+---------------+--------------------------------------------+
>>
>> Finally, from the query profile:
>>
>> HDFS_SCAN_NODE (id=0): (Total: 37.539ms, non-child: 37.539ms, % non-child: 100.00%)
>>   - AverageHdfsReadThreadConcurrency: 0.00
>>   - AverageScannerThreadConcurrency: 0.93
>>   - BytesRead: 489.47 KB (501221)
>>   - BytesReadDataNodeCache: 0
>>   - BytesReadLocal: 468.42 KB (479663)
>>   - BytesReadRemoteUnexpected: 21.05 KB (21557)
>>   - BytesReadShortCircuit: 468.42 KB (479663)
>>   - DecompressionTime: 0.000ns
>>   - MaxCompressedTextFileLength: 0
>>   - NumColumns: 0 (0)
>>   - NumDisksAccessed: 4 (4)
>>   - NumRowGroups: 0 (0)
>>   - NumScannerThreadsStarted: 5 (5)
>>   - PeakMemoryUsage: 9.35 MB (9799572)
>>   - PerReadThreadRawHdfsThroughput: 901.98 MB/sec
>>   - RemoteScanRanges: 0 (0)
>>   - RowsRead: 2.37M (2368521)
>>   - RowsReturned: 2.37M (2368521)
>>   - RowsReturnedRate: 62.23 M/sec
>>   - ScanRangesComplete: 3.21K (3207)
>>   - ScannerThreadsInvoluntaryContextSwitches: 26 (26)
>>   - ScannerThreadsTotalWallClockTime: 420.999ms
>>     - MaterializeTupleTime(*): 471.000ns
>>     - ScannerThreadsSysTime: 82.142ms
>>     - ScannerThreadsUserTime: 65.777ms
>>   - ScannerThreadsVoluntaryContextSwitches: 5.41K (5413)
>>   - TotalRawHdfsReadTime(*): 1.940ms
>>   - TotalReadThroughput: 136.84 KB/sec
>>   Filter 0:
>>     - Files processed: 3.18K (3184)
>>     - Files rejected: 3.13K (3130)
>>     - Files total: 3.18K (3184)
>>     - RowGroups processed: 0 (0)
>>     - RowGroups rejected: 0 (0)
>>     - RowGroups total: 0 (0)
>>     - Rows processed: 0 (0)
>>     - Rows rejected: 0 (0)
>>     - Rows total: 0 (0)
>>     - Splits processed: 4 (4)
>>     - Splits rejected: 0 (0)
>>     - Splits total: 4 (4)
>>   Filter 1:
>>     - Files processed: 54 (54)
>>     - Files rejected: 49 (49)
>>     - Files total: 54 (54)
>>     - RowGroups processed: 0 (0)
>>     - RowGroups rejected: 0 (0)
>>     - RowGroups total: 0 (0)
>>     - Rows processed: 0 (0)
>>     - Rows rejected: 0 (0)
>>     - Rows total: 0 (0)
>>     - Splits processed: 4 (4)
>>     - Splits rejected: 0 (0)
>>     - Splits total: 4 (4)
>>
>> Thanks,
>> Mostafa
>>
>> On Fri, Apr 8, 2016 at 10:04 AM, Ken Farmer <[email protected]> wrote:
>>
>>> First off, congrats on this new release. It looks like a lot of hard work went into it.
>>>
>>> I'm really excited about dynamic runtime filters, but I have a question about whether they would apply to the following scenario, which seems oddly missing from the documentation.
>>>
>>> Assume 2 tables:
>>> 1. fact_ids table with 50 billion rows, partitioned by year, month, day, sensor_id
>>> 2. dim_date dimension table with 500 rows, 1 row/day, with date_id, date_name, year, month, and day columns
>>>
>>> Query that works fine today:
>>>
>>> SELECT f.col1,
>>>        SUM(somecount)
>>> FROM   fact_ids f
>>> WHERE  f.year=2016 AND f.month=4 AND f.day=2
>>> GROUP BY 1 ORDER BY 1
>>>
>>> Query that I hope works with 5.7:
>>>
>>> SELECT f.col1,
>>>        SUM(somecount)
>>> FROM   fact_ids f
>>>        INNER JOIN dim_date d
>>>          ON  f.year  = d.year
>>>          AND f.month = d.month
>>>          AND f.day   = d.day
>>> WHERE  d.date_name BETWEEN '2016-03-02' AND '2016-04-01'
>>> GROUP BY 1 ORDER BY 1
>>>
>>> Can anyone confirm that 5.7 will perform dynamic runtime filtering on this second example above?
>>>
>>> Anything else I should be aware of for cases like this? Or limitations of dynamic runtime filtering?
>>>
>>> Thanks!
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups "Impala User" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
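The pruning behavior discussed above can be sketched in miniature. This is only the idea behind a partition-level runtime filter - scan the small dimension side first, collect the surviving join-key values, and skip fact partitions that cannot match - not Impala's actual implementation; all table and variable names are illustrative:

```python
from datetime import date, timedelta

# Hypothetical dim_date rows: (year, month, day, date_name), two months' worth.
dim_date = []
d = date(2016, 3, 1)
while d <= date(2016, 4, 30):
    dim_date.append((d.year, d.month, d.day, d.isoformat()))
    d += timedelta(days=1)

# Hypothetical fact table partitions keyed by (year, month, day).
fact_partitions = {(y, m, dd): f"rows for {name}" for (y, m, dd, name) in dim_date}

# Step 1: evaluate the WHERE clause against the small dimension side first.
qualifying = [r for r in dim_date if "2016-03-02" <= r[3] <= "2016-04-01"]

# Step 2: the runtime filter is the set of join-key values that survived.
runtime_filter = {(y, m, dd) for (y, m, dd, _) in qualifying}

# Step 3: prune fact partitions against the filter before scanning them.
scanned = {k: v for k, v in fact_partitions.items() if k in runtime_filter}

print(len(scanned))  # 31 of the 61 partitions are scanned
```

In the real system the filter is built at runtime from the dimension scan and shipped to the fact-table scan nodes, which is why it can prune on join columns (year, month, day here) that never appear as literals in the query text.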
