Hi guys,
 
The group description says: "…It is a forum for exchanging architectural 
ideas, better ways of doing things...", so here I am to ask how to make 
things better :)

I recently made my first steps in Python and the weewx ecosystem and built a 
skin for weewx, see here: 
https://groups.google.com/g/weewx-user/c/sDK4kzO4vBw or here: 
https://github.com/Daveiano/weewx-wdc.

What's bothering me is the long time it takes to generate the report. On my 
Raspberry Pi 4B with 2GB RAM, it takes ~50s with data going back to March 2021. 
Another user reports ~180s on a (presumably older) Raspberry Pi with data going 
back to mid 2020.

My skin includes Data Tables to show observations. If I hide these tables, the 
generation time goes down to ~10s on my Pi (a reduction of more than 75%), so 
the main problem seems to be these tables. I implemented them using the series 
tag; the output is then processed and rendered via JS.

In the Cheetah template (I substituted some variables with concrete values for 
readability; the original code is here: 
https://github.com/Daveiano/weewx-wdc/blob/1.x/skins/weewx-wdc/includes/data-table-tile.inc):

    var tableRows = $get_table_rows(
        ['outTemp', 'barometer', 'rain', 'humidity', ...],
        $span($hour_delta=some_hour_delta, $week_delta=some_week_delta),
        'year'
    );

Search List extension 
(https://github.com/Daveiano/weewx-wdc/blob/1.x/bin/user/table_util.py#L72):

    def get_table_rows(self, obs, period, precision):
        """
        Returns table values for use in carbon data table.

        Args:
            obs (list): $DisplayOptions.get("table_tile_..")
            period (obj): Period to use, e.g. $year, $month, $span
            precision (string): Day, week, month, year, alltime

        Returns:
            list: Carbon data table rows.
        """
        carbon_values = []

        for observation in obs:
            if getattr(period, observation).has_data:
                series = getattr(period, observation).series(
                    aggregate_type=some_aggregate_type,
                    aggregate_interval=some_aggregate_interval,
                    time_series='start',
                    time_unit='unix_epoch'
                ).round(some_rounding)

                for start, data in zip(series.start, series.data):
                    cs_time = datetime.fromtimestamp(start.raw)
                    # The current series item by time.
                    cs_item = list(filter(
                        lambda x: (x['time'] == cs_time.isoformat()),
                        carbon_values
                    ))

                    if len(cs_item) == 0:
                        carbon_values.append({
                            "time": cs_time.isoformat(),
                            observation: data.raw,
                            'id': start.raw
                        })
                    else:
                        cs_item = cs_item[0]
                        cs_item_index = carbon_values.index(cs_item)
                        cs_item[observation] = data.raw if data.raw is not None else "-"
                        carbon_values[cs_item_index] = cs_item

        # Sort per time
        carbon_values.sort(
            key=lambda item: datetime.fromisoformat(item['time'])
        )

        return carbon_values

Aggregate intervals are calculated via:

    if precision == 'day':
        return 900 * 8  # 2 hours

    if precision == 'week':
        return 900 * 24  # 6 hours

    if precision == 'month':
        return 900 * 48  # 12 hours

    if precision == 'year' or precision == 'alltime':
        return 3600 * 24  # 1 day

As you can see, I call the series method for every observation and then merge 
the values into a single list for use in JS. Something like this is returned:

    [
        {time: '2022-06-08T02:40:00', outTemp: 13.6, id: 1654648800, outHumidity: 92.8, barometer: 1010.5, …},
        {time: '2022-06-08T04:40:00', outTemp: 13.4, id: 1654656000, outHumidity: 88.6, barometer: 1010.5, …},
        {...},
        ...
    ]
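
As a side note: inside the loop I look up the matching row with 
list(filter(...)) for every single value, which scans the whole list each time. 
The merge could instead key on the raw epoch start. Below is a rough, untested 
sketch of that idea (merge_series and series_by_obs are just made-up names 
here; the .start/.data/.raw attributes are the same as in my code above). This 
only changes the Python-side merge, not the number of series()/DB calls:

    from datetime import datetime

    def merge_series(series_by_obs):
        """series_by_obs: {observation_name: series}, each series with .start
        and .data as returned by the series tag (values expose .raw)."""
        rows = {}  # raw epoch start -> row dict

        for observation, series in series_by_obs.items():
            for start, data in zip(series.start, series.data):
                # Create the row once per timestamp, then just add columns.
                row = rows.setdefault(start.raw, {
                    "time": datetime.fromtimestamp(start.raw).isoformat(),
                    "id": start.raw,
                })
                row[observation] = data.raw if data.raw is not None else "-"

        # Sorting by the epoch key gives the same chronological order as the
        # datetime.fromisoformat() sort above.
        return [rows[key] for key in sorted(rows)]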

I am guessing these multiple series calls are what makes the generation so 
slow, because the DB gets hit multiple times? I am using the SQLite DB; would 
switching to MySQL make any difference? I'm not a database expert, but I 
suspect it might.
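
To check whether the time really goes into the repeated series() calls rather 
than the Python merge, I could temporarily time both parts separately. A rough 
sketch (timed() is just a made-up helper; time.perf_counter() and logging are 
standard library):

    import logging
    import time

    log = logging.getLogger(__name__)

    def timed(label, fn, *args, **kwargs):
        """Hypothetical helper: call fn and log how long the call took."""
        t0 = time.perf_counter()
        result = fn(*args, **kwargs)
        log.info("%s took %.2f s", label, time.perf_counter() - t0)
        return result

    # In get_table_rows the series call could then be wrapped like this
    # (same placeholder variables as above):
    #
    #     series = timed(
    #         "series(%s)" % observation,
    #         getattr(period, observation).series,
    #         aggregate_type=some_aggregate_type,
    #         aggregate_interval=some_aggregate_interval,
    #         time_series='start',
    #         time_unit='unix_epoch',
    #     ).round(some_rounding)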

I would very much appreciate any suggestions to make this more performant; 
perhaps I missed some smart weewx utility helper?

Just for reference: https://github.com/Daveiano/weewx-wdc/issues/14
