innovark37 opened a new issue, #39965:
URL: https://github.com/apache/superset/issues/39965

   ## [SIP] Proposal for Browser Print PDF Dashboard Reports
   
   ## Motivation
   
   Superset dashboard PDF reports are generated from browser screenshots. This
   is a reasonable default for visual dashboard snapshots, but it is not ideal
   for document-style reports.
   
   The screenshot-based approach has several limitations:
   
   - scrollable widgets can be clipped;
   - table widgets are not expanded for PDF output, so reports may include only
     the visible rows instead of all table rows;
   - large dashboards can be difficult or impossible to export reliably;
   - image-based PDFs are often significantly larger than HTML/browser-print
     PDFs;
   - text inside screenshot-based PDFs cannot be selected or copied as text;
   - links inside dashboard content do not behave as clickable PDF links.
   
   Modern headless browsers can generate PDFs directly from rendered HTML using
   their print engine. This SIP proposes adding an experimental, feature-flagged
   HTML-to-PDF path for dashboard reports. The goal is to let Superset prepare 
   dashboards for browser printing, so operators can choose between the current 
   screenshot-to-PDF flow and a new browser-print-to-PDF flow.
   
   ## Proposed Change
   
   Add a feature flag that enables an experimental browser-print PDF path for
   dashboard reports. A possible flag name:
   
   ```python
   DASHBOARD_REPORTS_BROWSER_PRINT_PDF = False
   ```
   
   When the flag is disabled, Superset keeps the existing screenshot-based PDF
   behavior.
   
   When the flag is enabled, dashboard PDF reports can use a print-ready
   dashboard rendering mode and the browser PDF API, for example
   Playwright/Chromium `page.pdf()`.
   
   ### Print-Ready Dashboard Rendering Mode
   
   Superset should introduce a dashboard rendering mode intended for browser
   printing. One possible implementation is to expose this through a new
   dashboard standalone mode.
   
   In this mode, Superset should:
   
   - render dashboard content in a print-friendly layout;
   - remove interactive dashboard controls that are not report content;
   - apply print CSS;
   - make long widget content available to the browser print engine;
   - preserve visible chart error states;
   - use predictable report styling independent of interactive editing controls.
   
   The initial print layout can be conservative:
   
   - render dashboard blocks vertically;
   - use the available print width for each block;
   - preserve dashboard layout order;
   - allow widgets with internal scroll containers or hidden overflow content
     to expand;
   - allow visualizations without a natural content height, such as canvas or
     SVG charts, to keep a calculated or dashboard-defined height.
   
   This mode should be isolated from the normal interactive dashboard view.
   
   ### Reporting Readiness Lifecycle
   
   The reporting worker needs a reliable signal that the print-ready dashboard
   is ready to be printed. Superset should define an internal frontend/backend
   readiness lifecycle for browser-based reports.
   
   The implementation may use a DOM marker, browser evaluation callback, Redux
   state, or another internal mechanism.
   
   The readiness lifecycle should cover:
   
   - dashboard metadata loaded;
   - dashboard layout rendered;
   - report-scope charts loaded or failed with visible error state;
   - lazy-loaded charts that are part of the report scope have been triggered;
   - the page is ready for `page.pdf()` or equivalent browser print call.
   
   If a chart fails to render, the failure should remain visible in the print
   DOM and should be considered a terminal state for that chart.
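
   The readiness rule above can be sketched as a small state check. This is
   only an illustration, not a proposed contract: `ChartState`,
   `PrintReadiness`, and their fields are hypothetical names, and how the
   frontend would actually expose this state (DOM marker, evaluated
   JavaScript, Redux state) is left open by this SIP.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class ChartState(Enum):
    PENDING = "pending"  # not yet triggered, or still loading
    LOADED = "loaded"    # rendered successfully
    FAILED = "failed"    # rendered a visible error state


# A failed chart is terminal, exactly like a loaded one: the worker should
# not keep waiting for it.
TERMINAL_STATES = {ChartState.LOADED, ChartState.FAILED}


@dataclass
class PrintReadiness:
    """Hypothetical snapshot of what the frontend reports to the worker."""

    metadata_loaded: bool = False
    layout_rendered: bool = False
    # chart id -> state, for charts in the report scope only
    charts: dict[int, ChartState] = field(default_factory=dict)

    def is_ready(self) -> bool:
        """True once every report-scope chart reached a terminal state."""
        return (
            self.metadata_loaded
            and self.layout_rendered
            and all(state in TERMINAL_STATES for state in self.charts.values())
        )
```

   Treating `FAILED` as terminal keeps a single broken chart from blocking
   the whole report until the execution timeout.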
   
   ### Optional Incremental Printing Capability
   
   Some dashboards are too large to render and print as one DOM tree. Superset
   should support an optional incremental printing capability so that large
   dashboards and large table visualizations can still be exported.
   
   At a high level, the reporting worker should be able to:
   
   1. ask the frontend to prepare the next printable portion;
   2. wait until that portion is ready;
   3. print the current DOM to a temporary PDF;
   4. repeat until no portions remain;
   5. merge temporary PDFs into the final report.
   
   The exact frontend/backend API for this capability should be designed as an
   internal Superset reporting contract. It should not be treated as a public
   end-user API.
   
   Simple dashboards do not need to use incremental printing. They can be
   printed in a single browser PDF call.
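
   A minimal sketch of the loop described in the numbered steps above. The
   `PrintablePage` adapter and its method names are assumptions made for this
   example, and the merge step is passed in as a callable because this SIP
   does not yet pick a PDF library. For brevity the sketch keeps each portion
   in memory as bytes rather than writing temporary files.

```python
from typing import Callable, List, Protocol


class PrintablePage(Protocol):
    """Hypothetical adapter around the browser page used by the worker."""

    def prepare_next_portion(self) -> bool:
        """Ask the frontend for the next portion; False when none remain."""
        ...

    def wait_until_ready(self) -> None:
        """Block until the current portion is ready to print."""
        ...

    def print_pdf(self) -> bytes:
        """Print the current DOM and return the PDF bytes."""
        ...


def print_incrementally(
    page: PrintablePage,
    merge: Callable[[List[bytes]], bytes],
    max_portions: int = 100,  # placeholder safety limit, not a proposed default
) -> bytes:
    parts: List[bytes] = []
    while page.prepare_next_portion():
        if len(parts) >= max_portions:
            raise RuntimeError("too many printable portions")
        page.wait_until_ready()
        parts.append(page.print_pdf())
    # Merge the per-portion PDFs, in print order, into the final report.
    return merge(parts)
```

   Keeping the portion limit inside the loop means a frontend that never
   signals completion cannot make the worker print forever.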
   
   ### Table Visualization Behavior
   
   Browser-print PDF generation can improve table output, but table
   visualizations still need explicit support.
   
   Table-like visualizations should be able to provide print-specific behavior:
   
   - remove internal vertical scroll containers;
   - avoid fixed interactive heights;
   - avoid truncating cell text for print when possible;
   - render as regular HTML tables when possible;
   - use print CSS for repeated table headers and page breaks;
   - support row chunking for large datasets.
   
   For table visualizations that support server-side pagination, Superset should
   allow the reporting path to fetch table rows in chunks while preserving the
   dashboard filter state, security constraints, sorting, and search state.
   
   The first implementation does not need to solve every visualization type. It
   can start with the standard table visualization and expand support over time.
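
   Row chunking for server-paginated tables could look roughly like the
   generator below. `fetch_page` is a hypothetical callable standing in for
   whatever query path Superset would use; it is assumed to already carry the
   dashboard filter state, security constraints, sorting, and search state,
   so only the offset and limit change between calls.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]
# Hypothetical fetcher: (offset, limit) -> rows for that window.
FetchPage = Callable[[int, int], List[Row]]


def iter_row_chunks(
    fetch_page: FetchPage,
    chunk_size: int,
    max_rows: int,
) -> Iterator[List[Row]]:
    """Yield table rows in chunks, never exceeding max_rows in total."""
    offset = 0
    while offset < max_rows:
        limit = min(chunk_size, max_rows - offset)
        rows = fetch_page(offset, limit)
        if not rows:
            return  # no more data
        yield rows
        offset += len(rows)
        if len(rows) < limit:
            return  # short page: this was the last chunk
```

   Each yielded chunk can then be rendered as one printable portion, which
   ties table chunking to the incremental printing capability.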
   
   ### Backend Reporting Worker Flow
   
   The browser-print PDF worker flow can be:
   
   ```text
   1. Build dashboard print URL.
   2. Open the URL in the authenticated browser session.
   3. Wait for the Superset reporting readiness lifecycle.
   4. If the dashboard can be printed in one pass, call page.pdf().
   5. If the dashboard uses incremental printing:
      a. prepare next printable portion;
      b. wait for portion readiness;
      c. call page.pdf() and save a temporary PDF;
      d. repeat until complete;
      e. merge temporary PDFs.
   6. Store the final PDF using the existing report execution artifact flow.
   ```
   
   ### Fallback Behavior
   
   The existing screenshot-based PDF path should remain available:
   
   - as the default while the feature flag is disabled;
   - as a rollback path if the browser-print PDF path fails.
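
   The two fallback cases above can be sketched as a small dispatcher. The
   function and parameter names are illustrative only; `print_pdf` and
   `screenshot_pdf` stand in for the new and the existing report paths, and
   catching every exception is a simplification for the example.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def generate_dashboard_pdf(
    browser_print_enabled: bool,
    print_pdf: Callable[[], bytes],
    screenshot_pdf: Callable[[], bytes],
) -> bytes:
    # Flag disabled: keep the existing screenshot-based behavior.
    if not browser_print_enabled:
        return screenshot_pdf()
    try:
        return print_pdf()
    except Exception:
        # Broad catch for illustration; a real implementation would likely
        # narrow this to timeout and rendering errors.
        logger.exception("Browser-print PDF failed; using screenshot PDF")
        return screenshot_pdf()
```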
   
   ## New or Changed Public Interfaces
   
   ### Feature Flag
   
   This SIP proposes a new feature flag that enables the browser-print PDF path
   for dashboard reports. A possible name is:
   
   ```python
   DASHBOARD_REPORTS_BROWSER_PRINT_PDF = False
   ```
   
   The exact name can be aligned with Superset feature flag conventions during
   implementation.
   
   ### Dashboard Print Rendering Mode
   
   Superset may expose a dashboard rendering mode intended for browser printing.
   One possible implementation is a new dashboard standalone mode. Other
   approaches are possible, including reusing an existing standalone mode if it
   can be safely extended, or enabling print-ready rendering through
   reporting-specific state rather than a URL parameter.
   
   ### Reporting Readiness Lifecycle
   
   The frontend and reporting worker need an internal readiness contract for
   browser-print reports. The contract should define how the worker determines
   that the dashboard, or the current printable portion of the dashboard, is
   ready for PDF generation.
   
   The exact implementation can be a DOM marker, browser-evaluated JavaScript,
   frontend state, or another mechanism.
   
   ### Optional Incremental Printing Contract
   
   Large dashboards may require an internal incremental printing contract
   between the frontend and reporting worker. This contract should define:
   
   - how the worker asks the frontend to prepare the next printable portion;
   - how the frontend reports that the portion is ready;
   - how progress is reported;
   - how errors are surfaced;
   - how the worker determines that printing is complete.
   
   This should not be treated as a public end-user API.
   
   ### Reporting Configuration
   
   The browser-print PDF path may introduce new optional configuration values:
   
   - default PDF page format;
   - default orientation;
   - default margins;
   - maximum PDF pages or printable portions;
   - maximum execution time;
   - maximum output file size;
   - table row chunk size;
   - maximum table rows for report generation.
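
   As a rough illustration of how the page-level settings might reach the
   browser, the sketch below maps a hypothetical settings object onto keyword
   arguments accepted by Playwright's `page.pdf()` (`format`, `landscape`,
   `margin`, `print_background`). The class name, field names, and default
   values are placeholders, not proposed configuration keys or defaults.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class PdfPageSettings:
    """Hypothetical container for the page-level options listed above."""

    page_format: str = "A4"   # placeholder default
    landscape: bool = False   # placeholder default
    margin: str = "1cm"       # placeholder default, applied to all sides


def to_pdf_kwargs(settings: PdfPageSettings) -> Dict[str, Any]:
    """Translate settings into keyword arguments for page.pdf()."""
    sides = ("top", "right", "bottom", "left")
    return {
        "format": settings.page_format,
        "landscape": settings.landscape,
        "margin": {side: settings.margin for side in sides},
        # Keep chart and table background colors in the printed output.
        "print_background": True,
    }
```

   A worker could then call `page.pdf(**to_pdf_kwargs(settings))`. The
   execution-time, page-count, file-size, and row limits would be enforced by
   the worker itself rather than passed to the browser.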
   
   ## New dependencies
   
   It is not yet clear whether new dependencies are required.
   
   The basic browser-print path may be able to reuse Superset's existing browser
   automation stack. If incremental printing produces multiple temporary PDF
   files, the implementation will need PDF merge support. This may be possible
   with an existing dependency, or it may require adding a new maintained PDF
   library.
   
   ## Migration Plan and Compatibility
   
   This change should be backward compatible.
   
   - Existing screenshot-based PDF reports continue to work.
   - The browser-print PDF path is disabled by default.
   - Existing report schedules do not need to change.
   - No database migration is required for the first implementation.
   
   ## Rejected Alternatives
   
   ### Keep Screenshot-to-PDF Only
   
   This avoids new rendering complexity but leaves existing limitations around
   scrollable content, long dashboards, table output, screenshot tiling, and
   image quality.
   
   ### Generate PDFs Directly on the Backend Without Browser Rendering
   
   Backend-only PDF generation would require duplicating frontend visualization
   rendering behavior. Superset charts are rendered by frontend plugins and can
   depend on theme, layout, browser APIs, and plugin-specific rendering logic.
   
   ### Build a Separate Report Builder
   
   A dedicated report builder could provide richer document authoring, but it
   is a larger product area. This SIP focuses on improving existing dashboard
   reports.
   
   ### Replace Screenshot Reporting Immediately
   
   The browser-print path should not immediately replace screenshot reporting.
   It should be introduced behind a feature flag and expanded as visualization
   support matures.
   
   ## Implementation Plan
   
   1. Add the feature flag and keep screenshot-to-PDF as the default.
   2. Add print-ready dashboard rendering mode.
   3. Add print CSS and hide interactive dashboard controls in print mode.
   4. Define and implement the internal readiness lifecycle.
   5. Add reporting worker support for browser PDF generation.
   6. Add optional incremental printing support for large dashboards.
   7. Add print support for the standard table visualization.
   8. Add safety limits and fallback behavior.
   9. Document the feature flag and operational constraints.
   
   ## Testing Plan
   
   Testing should cover:
   
   - feature flag disabled: existing screenshot PDF behavior remains unchanged;
   - simple dashboard browser-print PDF generation;
   - dashboards with several chart types;
   - dashboards with chart rendering errors;
   - dashboards with standard table visualizations;
   - large tables with server-side pagination;
   - long dashboards requiring multiple PDF pages;
   - fallback to screenshot PDF behavior;
   - basic visual acceptance criteria for browser-print PDF output.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
