oscerd opened a new pull request, #23083: URL: https://github.com/apache/camel/pull/23083

## Summary

**Final AWS batch** of span decorators for `camel-telemetry`. Closes out AWS coverage for [CAMEL-23387](https://issues.apache.org/jira/browse/CAMEL-23387) by adding decorators for the **AI/ML** group: text-to-speech (Polly), image AI (Rekognition), OCR (Textract), speech-to-text (Transcribe), translation (Translate), NLP (Comprehend) and vector search (S3 Vectors).

After this PR, all 36 AWS components in `components/camel-aws/` that have a Camel scheme will have a corresponding `SpanDecorator`. The only remaining follow-up on CAMEL-23387 is the Google Cloud decorators mentioned in the original ticket, which is in scope for a separate JIRA.

## Changes

New `SpanDecorator` implementations under `org.apache.camel.telemetry.decorators`:

- **`AwsPollySpanDecorator`** (`aws2-polly`): Text-to-speech. Tags: `operation`, `voiceId`, `outputFormat`, `engine`, `languageCode`. Lexicon content (PLS XML), the synthesized audio's S3 destination (bucket/key), the SNS topic ARN for notifications, and the `requestCharacters` response counter are not surfaced.
- **`AwsRekognitionSpanDecorator`** (`aws2-rekognition`): Image/video AI. Tags: `operation`, `collectionId`, `jobId`, `jobName`, `faceId`. Image data (binary), the KMS key ID, large config objects (operations/output/human-loop config) and bulk facial-attribute / feature collections are not surfaced.
- **`AwsTextractSpanDecorator`** (`aws2-textract`): Document OCR. Tags: `operation`, `s3Bucket`, `s3Object`, `jobId`. The S3 object version, pagination tokens and feature-type collection are not surfaced.
- **`AwsTranscribeSpanDecorator`** (`aws2-transcribe`): Speech-to-text. Tags: `transcriptionJobName`, `languageCode`, `mediaFormat`, `mediaUri`. The `Transcribe2Constants` interface does not define an `OPERATION` header (operations are configured via the URI), so no `operation` tag is emitted; the span name from the URI already conveys the action. Vocabulary phrase lists, tag maps and the resource ARN are not surfaced.
- **`AwsTranslateSpanDecorator`** (`aws2-translate`): Translation. Tags: `operation`, `sourceLanguage`, `targetLanguage`. Custom-terminology name collections are not surfaced.
- **`AwsComprehendSpanDecorator`** (`aws2-comprehend`): NLP. Tags: `operation`, `languageCode`, `endpointArn` (custom-classifier endpoint ARN, an input identifier). Detection results (detected language, sentiment, scores) live on the OUT message and are not visible in `beforeTracingEvent`, so they are not surfaced.
- **`AwsS3VectorsSpanDecorator`** (`aws2-s3-vectors`): Vector search. Tags: `operation`, `vectorBucketName`, `vectorIndexName`, `vectorId`. Vector embedding data and query vectors (floats), metadata maps, similarity thresholds, distance metrics and response payloads (similarity scores, result counts, index status, bucket ARN) are not surfaced.

All seven decorators extend `AbstractSpanDecorator` (these are producer-only or producer+polling-consumer components without messaging-style ordering semantics) and are registered alphabetically in `META-INF/services/org.apache.camel.telemetry.SpanDecorator`. Unit tests cover header-to-tag extraction for each decorator.

Header constants are mirrored from each component's `*Constants` interface (with a Javadoc reference back to the source), matching the convention used by previous batches and `AzureServiceBusSpanDecorator`. This avoids creating hard dependencies from `camel-telemetry` to the AWS component modules.

### Tag selection rationale

Same two rules applied across batches 3 through 6:

1. **Never emit values that may contain secrets, large payloads or PII**: image bytes, audio bytes, vector embeddings, lexicon content, vocabulary phrases, encrypted vector metadata.
2. **Prefer the request _target_ over the response payload**: `voiceId`, `s3Bucket/s3Object`, `transcriptionJobName`, `vectorIndexName`, `collectionId` etc.
   Response data (detected sentiment in Comprehend, similarity scores in S3 Vectors, request character counts in Polly) is response-shaped and not visible in `beforeTracingEvent`.

In addition to the two rules above, this batch follows the IAM-principal-minimization principle established in earlier review fixes (KMS `keyId`, CloudTrail `username`, IAM `userName`, EKS `roleArn`): no `userId` from Rekognition collections, no `kmsKeyId` from Rekognition, no `resourceArn` from Transcribe.

## Test plan

- [x] `mvn test` in `components/camel-telemetry` passes (133 tests, including 44 AWS decorator tests covering 36 components total, which is all AWS coverage on CAMEL-23387)
- [x] Module-specific build (`mvn -DskipTests install`) succeeds
- [x] No code style or formatter changes required

## Coverage on CAMEL-23387 (AWS, complete after this PR)

| Batch | PR | Components |
|---|---|---|
| 1 | #23038 (merged) | SQS, SNS, Kinesis, S3 |
| 2 | #23040 (merged) | DDB, DDB Streams, Lambda, EventBridge, SES, MQ, Kinesis Firehose, Bedrock |
| 3 | #23045 (merged) | Athena, CloudWatch, KMS, MSK, Step Functions, Timestream, Redshift Data, CloudTrail |
| 4 | #23077 (merged) | STS, IAM, Secrets Manager, Parameter Store, Security Hub, Config |
| 5 | #23081 (merged) | EC2, ECS, EKS |
| 6 | this PR | Polly, Rekognition, Textract, Transcribe, Translate, Comprehend, S3 Vectors |

## Follow-ups still pending

- **Google Cloud decorators** mentioned in CAMEL-23387's description: separate scope, separate JIRA. Not in scope for this PR.
- **Note on `aws-xray`**: the original "Compute & Tracing" follow-up listed in earlier PRs mentioned `camel-aws-xray`, but that module was **deprecated and removed** in commit ba9f8c5340a. It was a tracer integration with its own `SegmentDecorator` system, not a producer-style component, so there is nothing to add a `SpanDecorator` for.
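
For reviewers who want a concrete picture of the header-to-tag extraction and the allow-list approach described under "Changes" and "Tag selection rationale", the following is a minimal, dependency-free sketch. It is not the code in this PR and does not use the `camel-telemetry` API: the class name `HeaderTagSketch`, the method `extractTags`, and the `Example*` header names are all hypothetical, chosen only for illustration. It assumes the headers are available as a plain map before the request is sent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HeaderTagSketch {

    // Hypothetical header names, for illustration only. A real decorator would
    // mirror the values from the component's *Constants interface.
    static final String HEADER_OPERATION = "ExampleOperation";
    static final String HEADER_VOICE_ID = "ExampleVoiceId";
    static final String HEADER_LEXICON_CONTENT = "ExampleLexiconContent";

    // Allow-list mapping header name -> tag name. Anything not listed here
    // (payloads, possible secrets, PII) is never copied onto the span.
    static final Map<String, String> TAGGED_HEADERS = new LinkedHashMap<>();
    static {
        TAGGED_HEADERS.put(HEADER_OPERATION, "operation");
        TAGGED_HEADERS.put(HEADER_VOICE_ID, "voiceId");
        // HEADER_LEXICON_CONTENT is deliberately absent: it may be large.
    }

    // Copies only allow-listed, non-null request headers into a tag map.
    static Map<String, String> extractTags(Map<String, Object> headers) {
        Map<String, String> tags = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : TAGGED_HEADERS.entrySet()) {
            Object value = headers.get(entry.getKey());
            if (value != null) {
                tags.put(entry.getValue(), value.toString());
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        Map<String, Object> headers = new LinkedHashMap<>();
        headers.put(HEADER_OPERATION, "synthesizeSpeech"); // sample value
        headers.put(HEADER_VOICE_ID, "Joanna");            // sample value
        headers.put(HEADER_LEXICON_CONTENT, "<lexicon>...</lexicon>");
        // The lexicon header is not on the allow-list, so it is not printed.
        System.out.println(extractTags(headers));
    }
}
```

Iterating over the allow-list rather than over the incoming headers means a newly introduced header is excluded by default, which matches the "never emit by default" intent of the first rule.
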

After this PR merges, the AWS portion of CAMEL-23387 is fully complete and the JIRA can be closed pending the Google Cloud follow-up decision.

---

_Claude Code on behalf of Andrea Cosentino_
