reckart opened a new issue, #444: URL: https://github.com/apache/uima-uimaj/issues/444
**Is your feature request related to a problem? Please describe.**

When inspecting or diffing CAS contents for tests we frequently rely on a simple CSV stringification that:

- does not preserve rich, human-friendly output (HTML) for easier visual inspection,
- lacks configurable columns (anchor, covered text, indexed status),
- produces unstable ordering for multi-valued/annotation features and ambiguous anchors,
- offers no convenient way to exclude noisy features/types or treat empty strings specially,
- and forces long covered text into the output, making diffs noisy.

I'm often frustrated when test failures produce long, hard‑to‑scan CAS dumps or when small, irrelevant differences (e.g., non-deterministic anchor numbering or list order) make comparisons brittle.

**Describe the solution you'd like**

Add an enhanced CAS -> comparable text utility with the following capabilities:

- Output formats: Keep CSV but add an HTML renderer for nicer human-readable tables.
- Configurable columns: enable/disable an anchor column, an indexed column, and a covered‑text column (with configurable max length and middle-abbreviation).
- Anchor formatting: anchors include type short name, optional annotation offsets, optional sofa id, optional indexing marker, and stable disambiguation suffixes for duplicate anchors; support optional anchor feature hash suffix.
- Stable ordering: when multi‑valued features hold annotations, optionally sort them by begin (asc), end (desc), type name to provide deterministic set‑like ordering.
- Index awareness: mark FSs as indexed and optionally add a dedicated `<INDEXED>` column; use indexed status as a tie-breaker when ordering.
- Exclusions: allow regex patterns to exclude specific features or types from rendering (cache regex compilation for performance).
- Null/empty handling: configurable `nullValue`, and an option to treat empty strings as null so empty values don’t clutter diffs.
- Multi‑valued rendering: robust handling of array/list FSs and primitive arrays, rendering them as bracketed lists; handle nested multi-valued structures recursively.
- Rendering options: omit the XML declaration in HTML output and use minimal inline styling so the HTML is self-contained.
- Public API knobs: setters/getters for all of the above flags so callers can tune output for different use cases (compact machine diffs vs. human inspection).

This produces a single stable, configurable comparable representation useful for both automated assertions and human debugging.
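
To make two of the requested behaviours concrete, here is a rough sketch of the stable ordering (begin ascending, end descending, then type name) and of middle-abbreviation for covered text. It is illustration only and does not use the UIMA API: the `Anno` class, the `sorted` and `abbreviateMiddle` helpers, and the `...` marker are all hypothetical names and choices made up for this example, not part of any existing or proposed implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ComparableOrderingSketch {

    /** Hypothetical stand-in for an annotation; holds only what the ordering needs. */
    public static final class Anno {
        final String typeName;
        final int begin;
        final int end;

        public Anno(String typeName, int begin, int end) {
            this.typeName = typeName;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return typeName + "[" + begin + "-" + end + "]";
        }
    }

    /** Orders by begin ascending, then end descending, then type name. */
    public static final Comparator<Anno> STABLE_ORDER = (a, b) -> {
        int c = Integer.compare(a.begin, b.begin);
        if (c != 0) {
            return c;
        }
        c = Integer.compare(b.end, a.end); // arguments swapped: longer spans first
        if (c != 0) {
            return c;
        }
        return a.typeName.compareTo(b.typeName);
    };

    /** Returns a sorted copy so the input (e.g. a feature value) is left untouched. */
    public static List<Anno> sorted(List<Anno> input) {
        List<Anno> copy = new ArrayList<>(input);
        copy.sort(STABLE_ORDER);
        return copy;
    }

    /**
     * Keeps the head and tail of the text and replaces the middle with "..."
     * when the text is longer than maxLength. The marker is an assumption.
     */
    public static String abbreviateMiddle(String text, int maxLength) {
        String marker = "...";
        if (text == null || text.length() <= maxLength || maxLength <= marker.length()) {
            return text;
        }
        int keep = maxLength - marker.length();
        int head = (keep + 1) / 2; // head gets the extra character when keep is odd
        int tail = keep - head;
        return text.substring(0, head) + marker + text.substring(text.length() - tail);
    }

    public static void main(String[] args) {
        List<Anno> annos = Arrays.asList(
                new Anno("Token", 0, 5),
                new Anno("Sentence", 0, 20),
                new Anno("Token", 6, 10),
                new Anno("NamedEntity", 0, 5));
        System.out.println(sorted(annos));
        System.out.println(abbreviateMiddle("abcdefghij", 7));
    }
}
```

With an ordering like this, two annotations that share the same offsets and type still compare as equal, which is where the indexed-status tie-breaker and the anchor disambiguation suffixes mentioned above would come in.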
