Hello DRM maintainers and community,

I'm reporting a hardware compatibility issue where DisplayID checksum
validation failures prevent access to a panel's full capabilities. I'm
seeking guidance on the appropriate approach to handle this type of issue.

Problem Summary
===============

The 120Hz mode of a CSO T3 eDP display panel (Model: MNE007ZA1-5,
2880x1800) is unavailable because the kernel's DisplayID checksum
validation fails. The validation code treats any checksum mismatch as a
fatal error and discards the DisplayID data entirely.

Steps to Reproduce
==================

1. Boot system with CSO T3 eDP panel (MNE007ZA1-5)
2. Observe kernel logs during amdgpu initialization
3. Check available display modes via standard tools
4. Attempt to access 120Hz mode

Expected vs Actual Behavior
===========================

Expected: Panel should provide access to its full capabilities including
120Hz mode
Actual: Only basic display modes available, 120Hz blocked by validation
failures

System Information
==================

Hardware Configuration:
- Panel: CSO T3 (MNE007ZA1-5), 2880x1800 eDP
- Graphics: AMD Ryzen 7 7840HS with Radeon 780M [1002:15bf]
- System: Lenovo laptop with integrated display

Software Configuration:
- Kernel: Linux 6.17.0-rc5
- Distribution: Fedora Linux 42 (Workstation Edition)
- Display Server: Wayland with GNOME Shell 48.4
- Graphics Driver: amdgpu (built-in)

Problem Evidence - Kernel Logs
===============================

During system boot, repeated DisplayID checksum validation failures occur:

    [    4.741506] [drm] DisplayID checksum invalid, remainder is 248
    [    4.741512] [drm] DisplayID checksum invalid, remainder is 248
    [    4.741514] [drm] DisplayID checksum invalid, remainder is 248
    [    4.741515] [drm] DisplayID checksum invalid, remainder is 248
    [    4.741517] [drm] DisplayID checksum invalid, remainder is 248
    [... pattern repeats 40+ times during amdgpu initialization ...]

Failure Characteristics:
- Frequency: 40+ occurrences per boot
- Timing: During amdgpu driver initialization (~4.74s timeframe)
- Consistency: Always "remainder is 248"
- Impact: Each failure blocks DisplayID processing
- Result: Higher refresh rates unavailable

Hardware Analysis
=================

Complete EDID Data:

00000000  00 ff ff ff ff ff ff 00  36 74 5a 09 00 00 00 00  |........6tZ.....|
00000010  00 21 01 04 b5 22 16 78  03 ee 95 a3 54 4c 99 26  |.!...".x....TL.&|
00000020  0f 50 54 00 00 00 01 01  01 01 01 01 01 01 01 01  |.PT.............|
00000030  01 01 01 01 01 01 40 e7  00 6a a0 a0 67 50 08 98  |......@..j..gP..|
00000040  08 00 58 d7 10 00 00 18  00 00 00 fd 00 30 78 87  |..X..........0x.|
00000050  87 3c 01 0a 20 20 20 20  20 20 00 00 00 fe 00 43  |.<..      .....C|
00000060  53 4f 54 33 0a 20 20 20  20 20 20 20 00 00 00 fe  |SOT3.       ....|
00000070  00 4d 4e 45 30 30 37 5a  41 31 2d 35 0a 20 01 98  |.MNE007ZA1-5. ..|
00000080  70 13 79 00 00 03 01 14  9a 0f 01 05 3f 0b 9f 00  |p.y.........?...|
00000090  2f 00 1f 00 07 07 69 00  02 00 05 00 00 00 00 00  |/.....i.........|
00000100  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000150  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000170  00 00 00 00 00 00 00 00  00 00 00 00 00 00 f0 98  |................|

EDID Structure Analysis:

Base EDID Block (0x00-0x7F): Standard monitor information
  - Manufacturer: CSO (China Star Optoelectronics) - ID: 0x3674
  - Product Code: 0x095A (MNE007ZA1-5)
  - Panel Size: 34cm x 22cm (about 16" diagonal, 2880x1800)
  - Refresh Rate: range limits descriptor (00 00 00 fd ...) advertises
    48-120Hz vertical refresh (bytes 0x30 and 0x78)

DisplayID Extension Block (0x80-0xFF): Extended capabilities
  - Extension Tag: 0x70 (DisplayID)
  - DisplayID Version: 1.3 (byte 0x81 = 0x13)
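  - Checksum Issue: the DisplayID section checksum byte is 0xF0 (the
    second-to-last byte of the block; the final 0x98 is the EDID block
    checksum) and causes the validation failure
  - Expected Checksum: 0xF8 (the other section bytes sum to 0x08 modulo
    256, so 0xF8 would bring the total to 0)
  - Actual Remainder: 248 (0xF8), which makes kernel validation fail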
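
The checksum arithmetic can be reproduced in user space. The following is
a minimal stand-alone sketch, not the kernel code. It assumes the DisplayID
section starts at the version byte 0x13 right after the 0x70 tag, and that
its total size is 4 header bytes plus the 0x79 payload bytes given by the
length byte plus 1 checksum byte. The non-zero bytes are copied from the
dump above; the names build_section, displayid_section_sum and
displayid_fixup_checksum are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 4 header bytes + 0x79 payload bytes + 1 checksum byte (assumed layout) */
#define SECTION_LEN (4 + 0x79 + 1)

/* Non-zero prefix of the section, copied from the dump; the rest is zero. */
static const uint8_t section_prefix[] = {
	0x13, 0x79, 0x00, 0x00, 0x03, 0x01, 0x14, 0x9a,
	0x0f, 0x01, 0x05, 0x3f, 0x0b, 0x9f, 0x00, 0x2f,
	0x00, 0x1f, 0x00, 0x07, 0x07, 0x69, 0x00, 0x02,
	0x00, 0x05,
};

/* Build the section with the given checksum byte in the last position. */
static void build_section(uint8_t out[SECTION_LEN], uint8_t checksum)
{
	memset(out, 0, SECTION_LEN);
	memcpy(out, section_prefix, sizeof(section_prefix));
	out[SECTION_LEN - 1] = checksum;
}

/* Sum of all bytes modulo 256; zero means the checksum is valid. */
static uint8_t displayid_section_sum(const uint8_t *buf, size_t len)
{
	uint8_t sum = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/* Checksum byte that would make the whole section sum to zero. */
static uint8_t displayid_fixup_checksum(const uint8_t *buf, size_t len)
{
	/* Sum everything except the stored checksum, then negate mod 256. */
	uint8_t partial = displayid_section_sum(buf, len - 1);

	return (uint8_t)(0x100 - partial);
}
```

Calling build_section() with the stored checksum 0xF0 and summing the
result gives 248, matching the kernel log. displayid_fixup_checksum()
returns 0xF8 for the same data, and a section built with 0xF8 sums to 0.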

Key Hardware Identifiers:
- Manufacturer: CSO (China Star Optoelectronics)
- Model: T3 / MNE007ZA1-5
- Panel Technology: eDP (Embedded DisplayPort)
- Native Resolution: 2880x1800
- Supported Refresh Rates: 60Hz (accessible), 120Hz (blocked by validation)
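
The 120Hz claim can be cross-checked against the base EDID block alone,
independent of the DisplayID extension. The sketch below decodes the
18-byte display range limits descriptor (tag 0xFD) found at offset 0x48 in
the dump above. The byte layout and the handling of the offset flags in
byte 4 follow my reading of EDID 1.4 and should be treated as an
assumption. The struct and function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range_limits {
	unsigned int min_vfreq_hz;
	unsigned int max_vfreq_hz;
	unsigned int min_hfreq_khz;
	unsigned int max_hfreq_khz;
	unsigned int max_pixel_clock_mhz;
};

/* The 18-byte descriptor at offset 0x48, copied from the dump. */
static const uint8_t range_desc[18] = {
	0x00, 0x00, 0x00, 0xfd, 0x00, 0x30, 0x78, 0x87, 0x87,
	0x3c, 0x01, 0x0a, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20,
};

/* Returns 0 on success, -1 if desc is not a range limits descriptor. */
static int decode_range_limits(const uint8_t desc[18],
			       struct range_limits *out)
{
	assert(desc != NULL && out != NULL);

	/* Display descriptors start with 00 00 00, then the tag byte. */
	if (desc[0] || desc[1] || desc[2] || desc[3] != 0xfd)
		return -1;

	out->min_vfreq_hz = desc[5];
	out->max_vfreq_hz = desc[6];
	out->min_hfreq_khz = desc[7];
	out->max_hfreq_khz = desc[8];

	/* Assumed EDID 1.4 offset flags: add 255 to the flagged limits. */
	if (desc[4] & 0x01)
		out->min_vfreq_hz += 255;
	if (desc[4] & 0x02)
		out->max_vfreq_hz += 255;
	if (desc[4] & 0x04)
		out->min_hfreq_khz += 255;
	if (desc[4] & 0x08)
		out->max_hfreq_khz += 255;

	/* Byte 9 stores the maximum pixel clock in units of 10 MHz. */
	out->max_pixel_clock_mhz = desc[9] * 10u;
	return 0;
}
```

For the descriptor above this yields a vertical range of 48-120Hz and a
maximum pixel clock of 600 MHz, so the base block already advertises the
120Hz ceiling even though no 120Hz mode is exposed.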

Current Display Capabilities (Limited)
======================================

Available modes (stock kernel):
    2880x1800@60Hz, 1920x1200@60Hz, 1920x1080@60Hz, 1680x1050@60Hz,
    1600x1200@60Hz, 1440x900@60Hz, 1280x1024@60Hz, 1280x800@60Hz,
    1280x720@60Hz, 1024x768@60Hz, 800x600@60Hz, 640x480@60Hz

Missing: 120Hz modes that hardware supports

Impact Assessment
=================

User Impact:
- Cannot access advertised hardware capability (120Hz)
- Forced to use lower refresh rates despite hardware support
- Reduced user experience on high-end display hardware
- Must modify kernel source to access full functionality

Affected Code Path:
File: drivers/gpu/drm/drm_displayid.c
Function: validate_displayid() lines 27-45

The validation function calculates checksums and returns -EINVAL on
mismatch, completely blocking DisplayID processing.

Workaround Discovery
====================

As an experiment to understand the failure, I commented out the checksum
validation code. Results with validation bypassed:

- All DisplayID errors disappear from logs
- 120Hz mode becomes available: 2880x1800@120.000+vrr
- Variable refresh rate functionality works
- System stability maintained over extended usage
- No display artifacts or performance issues observed

Working configuration:
    Monitor [ eDP-1 ] ON
      2880x1800@120 [id: '2880x1800@120.000+vrr'] CURRENT
      2880x1800@60.001 [id: '2880x1800@60.001+vrr'] PREFERRED
      Variable Refresh Rate: Active
      All display modes: Functional

This demonstrates the hardware is fully capable when validation doesn't
block it.

Additional Context
==================

Manufacturing Variation: The panel appears to have a minor checksum error
in its DisplayID extension, but otherwise functions perfectly. This suggests
a manufacturing variation rather than a fundamental hardware defect.

Current Validation Behavior: The kernel code treats any DisplayID checksum
mismatch as fatal:
- Logs error message with remainder value
- Returns error code (-EINVAL)
- Blocks all DisplayID functionality
- No tolerance for minor variations
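
To make the behavior above concrete, here is a user-space model of the
strict check next to a relaxed variant. It is a sketch for discussion
only, not the kernel implementation and not a proposed patch. The function
name and the ignore_checksum parameter are hypothetical, and the section
layout (4 header bytes, payload length in byte 1, 1 checksum byte) is
assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * section points at the DisplayID version byte; avail is the number of
 * bytes available from there. ignore_checksum is a hypothetical knob that
 * exists only in this illustration.
 */
static int check_displayid_section(const uint8_t *section, size_t avail,
				   bool ignore_checksum)
{
	size_t len;
	uint8_t sum = 0;

	assert(section != NULL);

	/* Need at least the 4 header bytes and the checksum byte. */
	if (avail < 5)
		return -EINVAL;

	/* 4 header bytes + payload length from byte 1 + 1 checksum byte */
	len = 4 + (size_t)section[1] + 1;
	if (len > avail)
		return -EINVAL;

	for (size_t i = 0; i < len; i++)
		sum += section[i];

	/* Strict mode: any non-zero remainder rejects the whole section. */
	if (sum != 0 && !ignore_checksum)
		return -EINVAL;

	return 0;
}
```

In this model a truncated section is always rejected, while a section that
is structurally complete but has a wrong checksum is rejected only in
strict mode. Whether and where such a relaxation belongs is exactly the
question for the maintainers.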

Hardware Vendor: China Star Optoelectronics (CSO) - major display panel
manufacturer

Questions for Community
=======================

This issue raises several questions about DisplayID validation approach:

1. Is this strict validation intentional for all hardware? What are the
   security or stability reasons for treating checksum errors as fatal?

2. Are minor checksum variations expected in real-world panels? Is this
   type of manufacturing variation common?

3. How should the kernel handle hardware with minor EDID/DisplayID issues?
   Are there existing mechanisms for such compatibility cases?

4. What would be the preferred approach for handling this type of
   compatibility issue? Are there existing precedents or guidelines?

5. Are other users experiencing similar DisplayID validation failures?
   Is this an isolated case or part of a broader pattern?

Request for Guidance
====================

I'm seeking community input on the appropriate handling of this
compatibility issue. The panel demonstrably supports 120Hz: the mode works
without problems once the checksum validation is bypassed. The current
code path nevertheless blocks it because of a checksum mismatch in the
DisplayID section, which looks like a case of strict validation preventing
access to a legitimate hardware capability.

I'm happy to:
- Provide more detailed technical information if helpful
- Test any proposed solutions or patches
- Help gather data about similar issues if others are seeing them
- Share my complete system configuration and EDID data

What do you think? Is this something worth addressing, and if so, what
would be the best approach?

Thanks for your time and for maintaining this subsystem!

Best regards,
Tiago Araújo

---
System: Fedora Linux 42, Linux 6.17.0-rc5
Hardware: AMD Ryzen 7 7840HS with CSO T3 eDP panel
Current workaround: DisplayID validation bypassed, 120Hz working perfectly
