[
https://issues.apache.org/jira/browse/TIKA-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18048177#comment-18048177
]
ASF GitHub Bot commented on TIKA-4604:
--------------------------------------
nddipiazza opened a new pull request, #2502:
URL: https://github.com/apache/tika/pull/2502
## JIRA Ticket
https://issues.apache.org/jira/browse/TIKA-4604
## Summary
Adds a new Atlassian JWT fetcher plugin for Apache Tika pipes that enables
fetching content from Atlassian products (Confluence, Jira) using JWT
authentication.
## Changes
- **Created plugin module**: `tika-pipes-atlassian-jwt` under `tika-pipes/tika-pipes-plugins/`
- **Implemented classes**:
  - `AtlassianJwtFetcher` - Main fetcher with JWT authentication
  - `AtlassianJwtFetcherFactory` - Factory for creating fetcher instances
  - `AtlassianJwtFetcherConfig` - Configuration class with JSON support
  - `AtlassianJwtGenerator` - JWT token generation utility
  - `AtlassianJwtPipesPlugin` - PF4J plugin wrapper
- **Configuration files**:
  - `pom.xml` with dependencies (nimbus-jose-jwt, guava, jackson)
  - `plugin.properties` for PF4J
  - `assembly.xml` for ZIP packaging
- **Updated parent `pom.xml`** to include new module
## Architecture
The implementation follows Apache Tika's plugin pattern:
- Extends `AbstractTikaExtension`
- Uses `ExtensionConfig` for JSON configuration
- Implements `Fetcher` interface with `Metadata` parameters
- Static `build()` method for instantiation
- Proper initialization pattern
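
As a rough illustration of the static `build()` idea listed above, here is a minimal, self-contained sketch. `FetcherConfigSketch` and `FetcherSketch` are hypothetical stand-ins invented for this example; they are not Tika's `Fetcher`, `ExtensionConfig`, or the classes added in this PR, and the validation rules are assumptions.

```java
import java.util.Objects;

// Hypothetical config holder; field names mirror the JSON keys in this PR,
// but this class is NOT AtlassianJwtFetcherConfig.
final class FetcherConfigSketch {
    final String sharedSecret;
    final String issuer;
    final int connectTimeout;
    final int socketTimeout;

    FetcherConfigSketch(String sharedSecret, String issuer,
                        int connectTimeout, int socketTimeout) {
        this.sharedSecret = sharedSecret;
        this.issuer = issuer;
        this.connectTimeout = connectTimeout;
        this.socketTimeout = socketTimeout;
    }
}

// Hypothetical fetcher; shows only the "private constructor + static build()" shape.
final class FetcherSketch {
    private final String id;
    private final FetcherConfigSketch config;

    private FetcherSketch(String id, FetcherConfigSketch config) {
        this.id = id;
        this.config = config;
    }

    // Validates the configuration before constructing, so a misconfigured
    // fetcher is rejected when it is built rather than on the first fetch.
    static FetcherSketch build(String id, FetcherConfigSketch config) {
        Objects.requireNonNull(id, "id");
        Objects.requireNonNull(config, "config");
        requireText(config.sharedSecret, "sharedSecret");
        requireText(config.issuer, "issuer");
        if (config.connectTimeout <= 0 || config.socketTimeout <= 0) {
            throw new IllegalArgumentException("timeouts must be positive");
        }
        return new FetcherSketch(id, config);
    }

    private static void requireText(String value, String name) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(name + " is required");
        }
    }

    String id() {
        return id;
    }

    FetcherConfigSketch config() {
        return config;
    }
}
```

Keeping the constructor private means every instance has passed through the same validation path.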
## Configuration Example
```json
{
  "fetchers": {
    "atlassian-jwt-fetcher": {
      "my-confluence": {
        "sharedSecret": "your-shared-secret",
        "issuer": "your-app-key",
        "subject": "[email protected]",
        "jwtExpiresInSeconds": 3600,
        "connectTimeout": 30000,
        "socketTimeout": 120000
      }
    }
  }
}
```
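
For readers unfamiliar with how settings like `sharedSecret`, `issuer`, `subject`, and `jwtExpiresInSeconds` typically end up in a token, the following is a minimal sketch using only the JDK. It assumes HS256 signing and the standard `iss`, `sub`, `iat`, and `exp` claims. It is not the PR's `AtlassianJwtGenerator` (which depends on nimbus-jose-jwt), `JwtSketch` is a hypothetical name, and any further claims Atlassian may require are omitted.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.time.Instant;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for illustration only; not part of the plugin.
final class JwtSketch {

    private static final Base64.Encoder B64URL = Base64.getUrlEncoder().withoutPadding();

    // Builds a compact JWT: base64url(header) + "." + base64url(payload) + "." + base64url(signature).
    // Assumes HS256 (HMAC-SHA256 keyed with the shared secret).
    static String sign(String sharedSecret, String issuer, String subject,
                       Instant issuedAt, long expiresInSeconds) {
        String header = "{\"alg\":\"HS256\",\"typ\":\"JWT\"}";
        long iat = issuedAt.getEpochSecond();
        String payload = "{\"iss\":\"" + escape(issuer) + "\","
                + "\"sub\":\"" + escape(subject) + "\","
                + "\"iat\":" + iat + ","
                + "\"exp\":" + (iat + expiresInSeconds) + "}";
        String signingInput = encode(header) + "." + encode(payload);
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(
                    sharedSecret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] signature = mac.doFinal(signingInput.getBytes(StandardCharsets.UTF_8));
            return signingInput + "." + B64URL.encodeToString(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 signing failed", e);
        }
    }

    private static String encode(String json) {
        return B64URL.encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }

    // Escapes only backslash and double quote; a real implementation
    // would use a JSON library instead of string concatenation.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

A call such as `JwtSketch.sign("your-shared-secret", "your-app-key", "some-user", Instant.now(), 3600)` yields a three-segment token whose `exp` is one hour after `iat`. In production code a maintained library such as nimbus-jose-jwt is the safer choice, which is what the `pom.xml` in this PR declares.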
## Testing
✅ Code compiles successfully:
```bash
mvn clean compile -DskipTests -pl tika-pipes/tika-pipes-plugins/tika-pipes-atlassian-jwt -am
```
✅ Code formatted with spotless
## Source
Ported from:
https://github.com/nddipiazza/tika-pipes/tree/main/tika-pipes-fetchers/tika-fetcher-atlassian-jwt
> Add Atlassian fetcher plugin with JWT authentication
> ----------------------------------------------------
>
> Key: TIKA-4604
> URL: https://issues.apache.org/jira/browse/TIKA-4604
> Project: Tika
> Issue Type: New Feature
> Reporter: Nicholas DiPiazza
> Assignee: Nicholas DiPiazza
> Priority: Major
>
> h2. Overview
> Port the Atlassian fetcher from the external tika-pipes repository as a new
> Tika plugin. This fetcher enables fetching content from Atlassian products
> (Confluence, Jira) using JWT authentication.
> h2. Implementation Details
> * Port code from:
> https://github.com/nddipiazza/tika-pipes/tree/main/tika-pipes-fetchers/tika-fetcher-atlassian-jwt
> * Create new plugin module:
> *tika-pipes/tika-pipes-plugins/tika-fetcher-atlassian-jwt*
> * Implement as a standard Tika pipes plugin (following plugin architecture)
> * Support JWT authentication for Atlassian products
> * Include appropriate dependencies and configuration
> h2. Features
> * Fetch content from Confluence spaces and pages
> * Fetch content from Jira issues and projects
> * JWT token-based authentication
> * Configurable endpoint URLs
> * Error handling and retry logic
> h2. Acceptance Criteria
> * Atlassian fetcher integrated as a Tika plugin
> * Plugin follows standard Tika plugin architecture
> * Unit tests added for fetcher functionality
> * Documentation updated with usage examples
> * All existing tests pass
> * Plugin can be loaded dynamically by tika-grpc
> h2. Reference
> * External implementation:
> https://github.com/nddipiazza/tika-pipes/tree/main/tika-pipes-fetchers/tika-fetcher-atlassian-jwt
> * Fetchers README:
> https://github.com/nddipiazza/tika-pipes/blob/main/tika-pipes-fetchers/README.md
--
This message was sent by Atlassian Jira
(v8.20.10#820010)