*I tested your skin, and I observed that even with tables disabled, the 
"statistics.html" page takes 2 minutes to generate on a Raspberry Pi 3B+, 
with a MySQL database containing data from 2005 up to today.*
Thank you for taking the time and sharing this info!

*As far as I know, the Seasons skin is limiting the history to the current 
year, with an aggregation period of 1 day for all parameters.*
Yes, it seems so; I was not aware of that.

*[...] Why not first choosing a given aggregation time, doing one query for 
each parameter, and using it for both charts and table?*
That is a really good idea, which I also had in the meantime: these many 
"duplicate" queries could be merged into a single query. It conflicts with 
another idea of mine, making the aggregation user-configurable, individually 
for the graphs and the tables. I will see which direction I take with that - 
thanks for the good suggestions!
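To sketch what that merge could look like: a minimal, weewx-independent example, assuming each observation's series has already been fetched once as a list of (epoch, value) pairs. The function name and the input shape are mine for illustration, not weewx API; the point is to key the rows by timestamp in a dict instead of scanning the row list for every value.

```python
from datetime import datetime

def merge_series_to_rows(series_by_obs):
    """Merge per-observation series into table rows keyed by timestamp.

    series_by_obs maps an observation name (e.g. 'outTemp') to a list of
    (epoch, value) pairs -- one fetched series per observation.
    """
    rows = {}  # epoch -> row dict; O(1) lookup instead of a linear filter()
    for observation, points in series_by_obs.items():
        for epoch, value in points:
            row = rows.setdefault(epoch, {
                "time": datetime.fromtimestamp(epoch).isoformat(),
                "id": epoch,
            })
            row[observation] = value if value is not None else "-"
    # One sort at the end instead of searching the list per data point
    return [rows[epoch] for epoch in sorted(rows)]
```

This avoids the quadratic `filter()`-per-value pattern while producing the same row shape the JS side expects.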

*[...] but if they involve all time, then you really want to make sure 
[...]*
Well, I didn't know that! My all-time queries only meet the first 
condition. I will update this and report back! Many thanks!
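For the record, a minimal sketch of what I understand the fix to be: snapping an all-time span to midnight boundaries so the daily summaries can be used. This uses plain `datetime` only; weewx's own helpers may already cover this, and the function name is mine.

```python
import datetime

def midnight_bounds(first_ts, last_ts):
    """Snap a [first_ts, last_ts] span to local-midnight boundaries.

    Returns (start, stop): start is the midnight at or before first_ts,
    stop is the midnight at or after last_ts, so a query with an
    aggregation period that is a multiple of a day can be answered
    from the daily summaries instead of the main archive table.
    """
    first = datetime.datetime.fromtimestamp(first_ts)
    start = first.replace(hour=0, minute=0, second=0, microsecond=0)

    last = datetime.datetime.fromtimestamp(last_ts)
    stop = last.replace(hour=0, minute=0, second=0, microsecond=0)
    if stop < last:
        # Round the end up to the next midnight
        stop += datetime.timedelta(days=1)
    return int(start.timestamp()), int(stop.timestamp())
```

Per the conditions quoted below, using the first/last timestamps actually in the database would also qualify; rounding to midnight is just the simplest variant.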
Tom Keffer wrote on Wednesday, June 15, 2022 at 01:09:34 UTC+2:

> I'm not exactly sure what queries you are doing, but if they involve all 
> time, then you really want to make sure that you don't have to search the 
> whole database and, instead, use the daily summaries.  To do this:
>
>    1. The aggregation time must be a multiple of a day; and
>    2. The start time must be either on a midnight boundary or the first 
>    timestamp in the database; and
>    3. The end time must either be on a midnight boundary, or the last 
>    timestamp in the database.
>
> If any one of these is not true, then the main archive table must be used, 
> and the query time will be much longer. 
>
> Also, in general, SQLite is much faster than MySQL for these kinds of 
> queries. See the Wiki article *SQLite vs MySQL 
> <https://github.com/weewx/weewx/wiki/SQLite-vs-MySQL>*.
>
> On Thu, Jun 9, 2022 at 9:00 AM David Bätge <[email protected]> wrote:
>
>> Hi guys,
>>  
>> The group description says: "…It is a forum for exchanging architectural 
>> ideas, better ways of doing things...", so here I am to ask how to make 
>> things better :)
>>
>> I recently took my first steps in Python and the weewx ecosystem and 
>> built a skin for weewx, see here 
>> https://groups.google.com/g/weewx-user/c/sDK4kzO4vBw or here: 
>> https://github.com/Daveiano/weewx-wdc.
>>
>> What's bothering me is the long time it takes to generate the report. On 
>> my Raspberry Pi 4B with 2 GB, it takes about 50 s with data back to March 
>> 2021. Another user reports about 180 s on a (presumably older) Raspberry 
>> Pi, with data from mid-2020.
>>
>> My skin includes data tables to show observations. If I hide these 
>> tables, the generation time goes down to about 10 s on my Pi (a reduction 
>> of more than 75%), so the main problem seems to be these tables. I 
>> implemented them using the series tag; the output is then processed and 
>> rendered via JS.
>>
>> In the Cheetah template (I substituted some variables with values for 
>> better understanding; the original code is here: 
>> https://github.com/Daveiano/weewx-wdc/blob/1.x/skins/weewx-wdc/includes/data-table-tile.inc
>> ):
>>
>> var tableRows = $get_table_rows(['outTemp', 'barometer', 'rain', 
>> 'humidity', ...], $span($hour_delta=some_hour_delta, 
>> $week_delta=some_week_delta), 'year');
>>
>> Search List extension (
>> https://github.com/Daveiano/weewx-wdc/blob/1.x/bin/user/table_util.py#L72
>> ):
>>
>>     def get_table_rows(self, obs, period, precision):
>>         """
>>         Returns table values for use in carbon data table.
>>
>>         Args:
>>             obs (list): $DisplayOptions.get("table_tile_..")
>>             period (obj): Period to use, eg. $year, month, $span
>>             precision (string): Day, week, month, year, alltime
>>
>>         Returns:
>>             list: Carbon data table rows.
>>         """
>>         carbon_values = []
>>
>>         for observation in obs:
>>             if getattr(period, observation).has_data:
>>                 series = getattr(period, observation).series(
>>                     aggregate_type=some_aggregate_type,
>>                     aggregate_interval=some_aggregate_interval,
>>                     time_series='start',
>>                     time_unit='unix_epoch'
>>                 ).round(some_rounding)
>>
>>                 for start, data in zip(series.start, series.data):
>>                     cs_time = datetime.fromtimestamp(start.raw)
>>                     # The current series item by time.
>>                     cs_item = list(filter(
>>                         lambda x: (x['time'] == cs_time.isoformat()),
>>                         carbon_values
>>                     ))
>>
>>                     if len(cs_item) == 0:
>>                         carbon_values.append({
>>                             "time": cs_time.isoformat(),
>>                             observation: data.raw,
>>                             'id': start.raw
>>                         })
>>                     else:
>>                         cs_item = cs_item[0]
>>                         cs_item_index = carbon_values.index(cs_item)
>>                         cs_item[observation] = (
>>                             data.raw if data.raw is not None else "-"
>>                         )
>>                         carbon_values[cs_item_index] = cs_item
>>
>>         # Sort per time
>>         carbon_values.sort(
>>             key=lambda item: datetime.fromisoformat(item['time'])
>>         )
>>
>>         return carbon_values
>>
>> Aggregate intervals are calculated via:
>>
>>     if precision == 'day':
>>         return 900 * 8  # 2 hours
>>
>>     if precision == 'week':
>>         return 900 * 24  # 6 hours
>>
>>     if precision == 'month':
>>         return 900 * 48  # 12 hours
>>
>>     if precision == 'year' or precision == 'alltime':
>>         return 3600 * 24  # 1 day
>>
>> You see what I did there: I call the series method on every observation 
>> and then add the values to an array for use in JS. Something like this is 
>> returned:
>>
>> [
>> {time: '2022-06-08T02:40:00', outTemp: 13.6, id: 1654648800, outHumidity: 
>> 92.8, barometer: 1010.5, …},
>> {time: '2022-06-08T04:40:00', outTemp: 13.4, id: 1654656000, outHumidity: 
>> 88.6, barometer: 1010.5, …},
>> {...},
>> ...
>> ]
>>
>> I am guessing these multiple series calls result in the long generation 
>> time because the DB gets hit multiple times? I am using the SQLite DB; 
>> would using MySQL make any difference? I'm not a database expert, but I 
>> think it might?
>>
>> I would very much appreciate any suggestions to make this more 
>> performant, perhaps I missed some smart weewx utility helper?
>>
>> Just for reference: https://github.com/Daveiano/weewx-wdc/issues/14
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "weewx-development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/weewx-development/8296590d-ba9c-453d-a22d-71a8d7baad59n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/weewx-development/8296590d-ba9c-453d-a22d-71a8d7baad59n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
