[
https://issues.apache.org/jira/browse/TIKA-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18048175#comment-18048175
]
Nicholas DiPiazza commented on TIKA-4604:
-----------------------------------------
Work-in-progress commit pushed to branch TIKA-4604-atlassian-fetcher:
h3. Completed:
* Created plugin directory structure
* Added pom.xml with all dependencies
* Created plugin.properties and assembly.xml
* Implemented AtlassianJwtPipesPlugin
* Implemented AtlassianJwtFetcherFactory
* Refactored AtlassianJwtFetcherConfig to follow the Apache Tika pattern
* Updated parent pom.xml to include new module
* Fixed checkstyle issues
h3. Remaining Work:
The main AtlassianJwtFetcher class still needs significant refactoring to move
from the external tika-pipes architecture to Apache Tika's pattern:
* Extend AbstractTikaExtension
* Add static build() method
* Change fetch() signature to use Metadata instead of Maps
* Update all method signatures throughout
* Add initialize() method
* Remove old plugin architecture references
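As a rough illustration of the target shape listed above, here is a minimal, self-contained sketch. It is not the code on the branch and it does not use the real Tika classes: Metadata, AbstractTikaExtension and AtlassianJwtFetcherConfig below are simplified hypothetical stand-ins, and the constructor arguments, the "sketch:sourceUrl" key and the example path are assumptions made only for illustration. The real signatures should be taken from the Tika source and the commit message.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class FetcherRefactorSketch {

    /** Hypothetical stand-in for Tika's Metadata; NOT the real class. */
    static class Metadata {
        private final Map<String, String> values = new LinkedHashMap<>();
        void set(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    /** Hypothetical stand-in for the base class named in the list; NOT the real class. */
    abstract static class AbstractTikaExtension {
        private final String id;
        AbstractTikaExtension(String id) { this.id = id; }
        String getId() { return id; }
    }

    /** Hypothetical, heavily reduced config holder (assumed single field). */
    static class AtlassianJwtFetcherConfig {
        final String baseUrl;
        AtlassianJwtFetcherConfig(String baseUrl) { this.baseUrl = baseUrl; }
    }

    static class AtlassianJwtFetcher extends AbstractTikaExtension {
        private final AtlassianJwtFetcherConfig config;
        private boolean initialized;

        private AtlassianJwtFetcher(String id, AtlassianJwtFetcherConfig config) {
            super(id);
            this.config = config;
        }

        /** Static factory: construct, then initialize, so callers never see a half-built fetcher. */
        static AtlassianJwtFetcher build(String id, AtlassianJwtFetcherConfig config) {
            AtlassianJwtFetcher fetcher = new AtlassianJwtFetcher(id, config);
            fetcher.initialize();
            return fetcher;
        }

        /** Validates configuration once, up front. */
        void initialize() {
            if (config.baseUrl == null || config.baseUrl.isEmpty()) {
                throw new IllegalArgumentException("baseUrl is required");
            }
            initialized = true;
        }

        /** fetch() records what it learned in a Metadata object instead of a raw Map. */
        InputStream fetch(String fetchKey, Metadata metadata) {
            if (!initialized) {
                throw new IllegalStateException("fetcher not initialized");
            }
            String url = config.baseUrl + fetchKey;
            metadata.set("sketch:sourceUrl", url); // assumed key, for illustration only
            // No network call here: the sketch returns placeholder bytes.
            return new ByteArrayInputStream(url.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) {
        AtlassianJwtFetcherConfig config =
                new AtlassianJwtFetcherConfig("https://example.invalid/wiki");
        AtlassianJwtFetcher fetcher = AtlassianJwtFetcher.build("atlassian-1", config);
        Metadata metadata = new Metadata();
        fetcher.fetch("/rest/api/content/123", metadata); // illustrative path only
        System.out.println(metadata.get("sketch:sourceUrl"));
    }
}
```

The sketch runs initialize() inside build(), so a missing base URL fails at construction time rather than on the first fetch; whether the real class should do the same is a judgment call for the refactoring.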
Estimated time to complete: 3-4 hours of focused refactoring work
Branch: https://github.com/apache/tika/tree/TIKA-4604-atlassian-fetcher
See the detailed completion guide in the commit message for step-by-step instructions.
> Add Atlassian fetcher plugin with JWT authentication
> ----------------------------------------------------
>
> Key: TIKA-4604
> URL: https://issues.apache.org/jira/browse/TIKA-4604
> Project: Tika
> Issue Type: New Feature
> Reporter: Nicholas DiPiazza
> Assignee: Nicholas DiPiazza
> Priority: Major
>
> h2. Overview
> Port the Atlassian fetcher from the external tika-pipes repository as a new
> Tika plugin. This fetcher enables fetching content from Atlassian products
> (Confluence, Jira) using JWT authentication.
> h2. Implementation Details
> * Port code from:
> https://github.com/nddipiazza/tika-pipes/tree/main/tika-pipes-fetchers/tika-fetcher-atlassian-jwt
> * Create new plugin module:
> *tika-pipes/tika-pipes-plugins/tika-fetcher-atlassian-jwt*
> * Implement as a standard Tika pipes plugin (following plugin architecture)
> * Support JWT authentication for Atlassian products
> * Include appropriate dependencies and configuration
> h2. Features
> * Fetch content from Confluence spaces and pages
> * Fetch content from Jira issues and projects
> * JWT token-based authentication
> * Configurable endpoint URLs
> * Error handling and retry logic
> h2. Acceptance Criteria
> * Atlassian fetcher integrated as a Tika plugin
> * Plugin follows standard Tika plugin architecture
> * Unit tests added for fetcher functionality
> * Documentation updated with usage examples
> * All existing tests pass
> * Plugin can be loaded dynamically by tika-grpc
> h2. Reference
> * External implementation:
> https://github.com/nddipiazza/tika-pipes/tree/main/tika-pipes-fetchers/tika-fetcher-atlassian-jwt
> * Fetchers README:
> https://github.com/nddipiazza/tika-pipes/blob/main/tika-pipes-fetchers/README.md