innovark37 opened a new issue, #39965:
URL: https://github.com/apache/superset/issues/39965
## [SIP] Proposal for Browser Print PDF Dashboard Reports
## Motivation
Superset dashboard PDF reports are generated from browser screenshots. This is
a reasonable default for visual dashboard snapshots, but it is not ideal for
document-style reports.
The screenshot-based approach has several limitations:
- scrollable widgets can be clipped;
- table widgets are not expanded for PDF output, so reports may include only
the visible rows instead of all table rows;
- large dashboards can be difficult or impossible to export reliably;
- image-based PDFs are often significantly larger than HTML/browser-print
PDFs;
- text inside screenshot-based PDFs cannot be selected or copied as text;
- links inside dashboard content do not behave as clickable PDF links.
Modern headless browsers can generate PDFs directly from rendered HTML using
their print engine. This SIP proposes adding an experimental, feature-flagged
HTML-to-PDF path for dashboard reports. The goal is to let Superset prepare
dashboards for browser printing, so operators can choose between the current
screenshot-to-PDF flow and a new browser-print-to-PDF flow.
## Proposed Change
Add a feature flag that enables an experimental browser-print PDF path for
dashboard reports. A possible flag name:
```python
DASHBOARD_REPORTS_BROWSER_PRINT_PDF = False
```
When the flag is disabled, Superset keeps the existing screenshot-based PDF
behavior.
When the flag is enabled, dashboard PDF reports can use a print-ready dashboard
rendering mode and the browser PDF API, for example Playwright/Chromium
`page.pdf()`.
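As a rough illustration of the browser PDF call mentioned above, the sketch below assumes Playwright's synchronous Python API. `goto`, `emulate_media`, and `pdf` are existing Playwright `Page` methods; the function name `print_dashboard_pdf` and the A4 default are assumptions made for this example, not part of the proposal.

```python
from typing import Any


def print_dashboard_pdf(page: Any, url: str, paper_format: str = "A4") -> bytes:
    """Render `url` through the browser print engine and return PDF bytes.

    `page` is expected to be a Playwright sync `Page`. It is typed as Any so
    this sketch does not need to import Playwright itself.
    """
    page.goto(url)
    # page.pdf() already renders with print media; emulating it earlier lets
    # the page lay out under print CSS before any readiness wait happens.
    page.emulate_media(media="print")
    # Returns the PDF as bytes. PDF generation is a Chromium-only feature.
    return page.pdf(format=paper_format, print_background=True)
```

A real worker would wait for the dashboard to finish rendering between `goto` and `pdf`; that step is left out here on purpose.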
### Print-Ready Dashboard Rendering Mode
Superset should introduce a dashboard rendering mode intended for browser
printing. One possible implementation is to expose this through a new
dashboard standalone mode.
In this mode, Superset should:
- render dashboard content in a print-friendly layout;
- remove interactive dashboard controls that are not report content;
- apply print CSS;
- make long widget content available to the browser print engine;
- preserve visible chart error states;
- use predictable report styling independent of interactive editing controls.
The initial print layout can be conservative:
- render dashboard blocks vertically;
- use the available print width for each block;
- preserve dashboard layout order;
- allow widgets with internal scroll containers or hidden overflow content
to expand;
- allow visualizations without a natural content height, such as canvas or
SVG charts, to keep a calculated or dashboard-defined height.
This mode should be isolated from the normal interactive dashboard view.
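If the print mode ends up being exposed through a standalone URL parameter, the worker-side URL construction could look roughly like this. The `standalone` parameter name follows existing standalone dashboard URLs, but the value `print` and the helper name are hypothetical; Superset does not define such a mode today.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical value: no print standalone mode exists in Superset yet.
PRINT_STANDALONE_MODE = "print"


def build_print_url(dashboard_url: str) -> str:
    """Return `dashboard_url` with the standalone parameter set to print mode."""
    parts = urlsplit(dashboard_url)
    # Keep existing parameters (for example filter state) but drop any
    # previous standalone value so it is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "standalone"
    ]
    query.append(("standalone", PRINT_STANDALONE_MODE))
    return urlunsplit(parts._replace(query=urlencode(query)))
```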
### Reporting Readiness Lifecycle
The reporting worker needs a reliable signal that the print-ready dashboard is
ready to be printed. Superset should define an internal frontend/backend
readiness lifecycle for browser-based reports.
The implementation may use a DOM marker, browser evaluation callback, Redux
state, or another internal mechanism.
The readiness lifecycle should cover:
- dashboard metadata loaded;
- dashboard layout rendered;
- report-scope charts loaded or failed with visible error state;
- lazy-loaded charts that are part of the report scope have been triggered;
- the page is ready for `page.pdf()` or equivalent browser print call.
If a chart fails to render, the failure should remain visible in the print DOM
and should be considered a terminal state for that chart.
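The readiness rule described above can be modelled as a small predicate. This is only a Python model of the rule, not a proposed implementation: the real check would most likely live in the frontend, and the state names and function name are assumptions.

```python
from enum import Enum


class ChartState(Enum):
    PENDING = "pending"    # not triggered yet, for example lazy-loaded
    LOADING = "loading"
    RENDERED = "rendered"
    FAILED = "failed"      # error state is visible in the print DOM


# A failed chart counts as finished, as long as its error is visible.
TERMINAL_STATES = {ChartState.RENDERED, ChartState.FAILED}


def is_ready_to_print(
    metadata_loaded: bool,
    layout_rendered: bool,
    chart_states: dict[int, ChartState],
) -> bool:
    """True when every report-scope chart has reached a terminal state."""
    if not (metadata_loaded and layout_rendered):
        return False
    return all(state in TERMINAL_STATES for state in chart_states.values())
```

Treating `PENDING` as non-terminal is what forces lazy-loaded charts in the report scope to be triggered before the page is considered printable.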
### Optional Incremental Printing Capability
Some dashboards are too large to render and print as one DOM tree. Superset
should support an optional incremental printing capability for future-proofing
large dashboards and large table visualizations.
At a high level, the reporting worker should be able to:
1. ask the frontend to prepare the next printable portion;
2. wait until that portion is ready;
3. print the current DOM to a temporary PDF;
4. repeat until no portions remain;
5. merge temporary PDFs into the final report.
The exact frontend/backend API for this capability should be designed as an
internal Superset reporting contract. It should not be treated as a public
end-user API.
Simple dashboards do not need to use incremental printing. They can be printed
in a single browser PDF call.
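The five-step loop above could be sketched on the worker side as follows. `PrintSession` and its methods are hypothetical names for whatever internal contract is eventually designed, and `merge` stands in for a PDF merge function that is not specified here. For brevity the sketch keeps the partial PDFs in memory rather than in temporary files.

```python
from typing import Callable, Protocol


class PrintSession(Protocol):
    """Hypothetical worker-side handle on the print-ready dashboard page."""

    def prepare_next_portion(self) -> bool:
        """Ask the frontend for the next portion; False when none remain."""
        ...

    def wait_until_ready(self) -> None: ...

    def print_pdf(self) -> bytes: ...


def print_incrementally(
    session: PrintSession,
    merge: Callable[[list[bytes]], bytes],
    max_portions: int = 100,
) -> bytes:
    """Print portion by portion, then merge the partial PDFs in order."""
    parts: list[bytes] = []
    while session.prepare_next_portion():
        # Safety limit so a misbehaving frontend cannot loop forever.
        if len(parts) >= max_portions:
            raise RuntimeError("portion limit exceeded")
        session.wait_until_ready()
        parts.append(session.print_pdf())
    return merge(parts)
```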
### Table Visualization Behavior
Browser-print PDF generation can improve table output, but table visualizations
still need explicit support.
Table-like visualizations should be able to provide print-specific behavior:
- remove internal vertical scroll containers;
- avoid fixed interactive heights;
- avoid truncating cell text for print when possible;
- render as regular HTML tables when possible;
- use print CSS for repeated table headers and page breaks;
- support row chunking for large datasets.
For table visualizations that support server-side pagination, Superset should
allow the reporting path to fetch table rows in chunks while preserving the
dashboard filter state, security constraints, sorting, and search state.
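Chunked row fetching for the reporting path might be sketched like this. `fetch_page` is a hypothetical callable that wraps the existing chart data request; it is assumed to carry the filter state, security constraints, sorting, and search state itself, so the loop only handles offsets and limits.

```python
from typing import Callable, Iterator

# (offset, limit) -> rows; hypothetical wrapper around the chart data request.
FetchPage = Callable[[int, int], list[dict]]


def iter_table_rows(
    fetch_page: FetchPage, chunk_size: int, max_rows: int
) -> Iterator[list[dict]]:
    """Yield row chunks until the source is exhausted or max_rows is reached."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = 0
    while offset < max_rows:
        # Never request more than the remaining row budget.
        limit = min(chunk_size, max_rows - offset)
        rows = fetch_page(offset, limit)
        if not rows:
            break
        yield rows
        if len(rows) < limit:
            break  # short page: the server has no more rows
        offset += len(rows)
```

Capping the total with `max_rows` keeps a very large table from producing an unbounded report.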
The first implementation does not need to solve every visualization type. It
can start with the standard table visualization and expand support over time.
### Backend Reporting Worker Flow
The browser-print PDF worker flow can be:
```text
1. Build dashboard print URL.
2. Open the URL in the authenticated browser session.
3. Wait for the Superset reporting readiness lifecycle.
4. If the dashboard can be printed in one pass, call page.pdf().
5. If the dashboard uses incremental printing:
   a. prepare next printable portion;
   b. wait for portion readiness;
   c. call page.pdf() and save a temporary PDF;
   d. repeat until complete;
   e. merge temporary PDFs.
6. Store the final PDF using the existing report execution artifact flow.
```
### Fallback Behavior
The existing screenshot-based PDF path should remain available:
- as the default while the feature flag is disabled;
- as a rollback path if the browser-print PDF path fails.
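A minimal sketch of this selection and rollback logic, assuming both PDF paths are available as callables. The function and parameter names are illustrative only.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)

PdfRenderer = Callable[[], bytes]


def generate_dashboard_pdf(
    flag_enabled: bool,
    browser_print: PdfRenderer,
    screenshot: PdfRenderer,
) -> bytes:
    """Use browser-print when enabled, falling back to screenshots on failure."""
    if not flag_enabled:
        return screenshot()
    try:
        return browser_print()
    except Exception:
        # Record the failure so operators can see that the fallback was used.
        logger.exception("Browser-print PDF failed; falling back to screenshot PDF")
        return screenshot()
```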
## New or Changed Public Interfaces
### Feature Flag
This SIP proposes a new feature flag that enables the browser-print PDF path
for dashboard reports. A possible name is:
```python
DASHBOARD_REPORTS_BROWSER_PRINT_PDF = False
```
The exact name can be aligned with Superset feature flag conventions during
implementation.
### Dashboard Print Rendering Mode
Superset may expose a dashboard rendering mode intended for browser printing.
One possible implementation is a new dashboard standalone mode. Other
approaches are possible, including reusing an existing standalone mode if it
can be safely extended, or enabling print-ready rendering through
reporting-specific state rather than a URL parameter.
### Reporting Readiness Lifecycle
The frontend and reporting worker need an internal readiness contract for
browser-print reports. The contract should define how the worker determines
that the dashboard, or the current printable portion of the dashboard, is ready
for PDF generation.
The exact implementation can be a DOM marker, browser-evaluated JavaScript,
frontend state, or another mechanism.
### Optional Incremental Printing Contract
Large dashboards may require an internal incremental printing contract between
the frontend and reporting worker. This contract should define:
- how the worker asks the frontend to prepare the next printable portion;
- how the frontend reports that the portion is ready;
- how progress is reported;
- how errors are surfaced;
- how the worker determines that printing is complete.
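One way to picture the status side of such a contract is a small payload that the frontend reports and the worker validates. Every field name below (`ready`, `done`, `portionsPrinted`, `error`) is hypothetical and only meant to show how readiness, progress, errors, and completion could be carried together.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class PortionStatus:
    ready: bool                   # current portion can be printed
    done: bool                    # no further portions remain
    portions_printed: int         # progress counter
    error: Optional[str] = None   # frontend-reported failure, if any


def parse_portion_status(payload: dict[str, Any]) -> PortionStatus:
    """Validate a status payload reported by the frontend."""
    status = PortionStatus(
        ready=bool(payload["ready"]),
        done=bool(payload["done"]),
        portions_printed=int(payload.get("portionsPrinted", 0)),
        error=payload.get("error"),
    )
    if status.error is not None:
        # Surface frontend errors to the worker instead of printing anyway.
        raise RuntimeError(f"frontend reported a print error: {status.error}")
    return status
```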
This should not be treated as a public end-user API.
### Reporting Configuration
The browser-print PDF path may introduce new optional configuration values:
- default PDF page format;
- default orientation;
- default margins;
- maximum PDF pages or printable portions;
- maximum execution time;
- maximum output file size;
- table row chunk size;
- maximum table rows for report generation.
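Expressed in the same style as the feature flag, these options might look like the fragment below. All names and values are hypothetical placeholders chosen for illustration; none of them are existing Superset settings or proposed defaults.

```python
# Hypothetical names and placeholder values, not existing Superset settings.
BROWSER_PRINT_PDF_FORMAT = "A4"
BROWSER_PRINT_PDF_LANDSCAPE = False
BROWSER_PRINT_PDF_MARGINS = {
    "top": "10mm",
    "right": "10mm",
    "bottom": "10mm",
    "left": "10mm",
}
BROWSER_PRINT_PDF_MAX_PORTIONS = 50           # pages or printable portions
BROWSER_PRINT_PDF_TIMEOUT_SECONDS = 300       # maximum execution time
BROWSER_PRINT_PDF_MAX_BYTES = 50 * 1024 * 1024
BROWSER_PRINT_TABLE_CHUNK_ROWS = 1000
BROWSER_PRINT_TABLE_MAX_ROWS = 100_000
```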
## New dependencies
It is not yet clear whether new dependencies are required.
The basic browser-print path may be able to reuse Superset's existing browser
automation stack. If incremental printing produces multiple temporary PDF
files, the implementation will need PDF merge support. This may be possible
with an existing dependency, or it may require adding a new maintained PDF
library.
## Migration Plan and Compatibility
This change should be backward compatible.
- Existing screenshot-based PDF reports continue to work.
- The browser-print PDF path is disabled by default.
- Existing report schedules do not need to change.
- No database migration is required for the first implementation.
## Rejected Alternatives
### Keep Screenshot-to-PDF Only
This avoids new rendering complexity but leaves existing limitations around
scrollable content, long dashboards, table output, screenshot tiling, and image
quality.
### Generate PDFs Directly on the Backend Without Browser Rendering
Backend-only PDF generation would require duplicating frontend visualization
rendering behavior. Superset charts are rendered by frontend plugins and can
depend on theme, layout, browser APIs, and plugin-specific rendering logic.
### Build a Separate Report Builder
A dedicated report builder could provide richer document authoring, but it is a
larger product area. This SIP focuses on improving existing dashboard reports.
### Replace Screenshot Reporting Immediately
The browser-print path should not immediately replace screenshot reporting. It
should be introduced behind a feature flag and expanded as visualization
support matures.
## Implementation Plan
1. Add the feature flag and keep screenshot-to-PDF as the default.
2. Add print-ready dashboard rendering mode.
3. Add print CSS and hide interactive dashboard controls in print mode.
4. Define and implement the internal readiness lifecycle.
5. Add reporting worker support for browser PDF generation.
6. Add optional incremental printing support for large dashboards.
7. Add print support for the standard table visualization.
8. Add safety limits and fallback behavior.
9. Document the feature flag and operational constraints.
## Testing Plan
Testing should cover:
- feature flag disabled: existing screenshot PDF behavior remains unchanged;
- simple dashboard browser-print PDF generation;
- dashboards with several chart types;
- dashboards with chart rendering errors;
- dashboards with standard table visualizations;
- large tables with server-side pagination;
- long dashboards requiring multiple PDF pages;
- fallback to screenshot PDF behavior;
- basic visual acceptance criteria for browser-print PDF output.