[ 
https://issues.apache.org/jira/browse/TIKA-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18048175#comment-18048175
 ] 

Nicholas DiPiazza commented on TIKA-4604:
-----------------------------------------

Work-in-progress commit pushed to branch TIKA-4604-atlassian-fetcher:

h3. Completed:
* Created plugin directory structure
* Added pom.xml with all dependencies
* Created plugin.properties and assembly.xml
* Implemented AtlassianJwtPipesPlugin
* Implemented AtlassianJwtFetcherFactory
* Refactored AtlassianJwtFetcherConfig to Apache Tika pattern
* Updated parent pom.xml to include new module
* Fixed checkstyle issues

h3. Remaining Work:
The main AtlassianJwtFetcher class still needs significant refactoring to move 
from the external tika-pipes plugin architecture to Apache Tika's plugin pattern:

* Extend AbstractTikaExtension
* Add static build() method
* Change fetch() signature to use Metadata instead of Maps
* Update all method signatures throughout
* Add initialize() method
* Remove old plugin architecture references
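
As a rough illustration of the target shape described in the list above, here is a minimal, self-contained sketch. Every type in it (SketchMetadata, SketchExtension, SketchFetcherConfig, SketchAtlassianFetcher) is a hypothetical stand-in invented for this example; none of them is the real Tika API, and the real fetch() would return content rather than a URL string. It only shows the pattern of a static build() that calls initialize(), and a fetch() that takes a metadata object instead of Maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for Tika's Metadata class (not the real API).
final class SketchMetadata {
    private final Map<String, String> values = new LinkedHashMap<>();

    void set(String name, String value) {
        values.put(name, value);
    }

    String get(String name) {
        return values.get(name);
    }
}

// Hypothetical stand-in for the AbstractTikaExtension base class.
abstract class SketchExtension {
    private final String id;

    protected SketchExtension(String id) {
        this.id = Objects.requireNonNull(id, "id");
    }

    String getId() {
        return id;
    }
}

// Hypothetical config holder; the field name is illustrative only.
final class SketchFetcherConfig {
    final String baseUrl;

    SketchFetcherConfig(String baseUrl) {
        this.baseUrl = baseUrl;
    }
}

final class SketchAtlassianFetcher extends SketchExtension {
    private final SketchFetcherConfig config;
    private boolean initialized;

    private SketchAtlassianFetcher(String id, SketchFetcherConfig config) {
        super(id);
        this.config = Objects.requireNonNull(config, "config");
    }

    // Static factory: construct the fetcher, then validate it via initialize().
    static SketchAtlassianFetcher build(String id, SketchFetcherConfig config) {
        SketchAtlassianFetcher fetcher = new SketchAtlassianFetcher(id, config);
        fetcher.initialize();
        return fetcher;
    }

    // Validates configuration once, before any fetch() call.
    void initialize() {
        if (config.baseUrl == null || config.baseUrl.isEmpty()) {
            throw new IllegalArgumentException("baseUrl is required");
        }
        initialized = true;
    }

    // Takes a metadata object rather than a Map. This sketch performs no I/O:
    // it only resolves the URL, records it in the metadata, and returns it.
    String fetch(String fetchKey, SketchMetadata metadata) {
        if (!initialized) {
            throw new IllegalStateException("initialize() has not been called");
        }
        String base = config.baseUrl.endsWith("/")
                ? config.baseUrl.substring(0, config.baseUrl.length() - 1)
                : config.baseUrl;
        String path = fetchKey.startsWith("/") ? fetchKey : "/" + fetchKey;
        String url = base + path;
        metadata.set("sketch:resolvedUrl", url);
        return url;
    }
}
```

Making the constructor private and routing construction through build() means a caller cannot obtain a fetcher that has skipped initialize(); whether the real plugin should enforce that is a design choice for the refactoring, not something this sketch settles.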

Estimated time to complete: 3-4 hours of focused refactoring work

Branch: https://github.com/apache/tika/tree/TIKA-4604-atlassian-fetcher

See the detailed completion guide in the commit message for step-by-step instructions.

> Add Atlassian fetcher plugin with JWT authentication
> ----------------------------------------------------
>
>                 Key: TIKA-4604
>                 URL: https://issues.apache.org/jira/browse/TIKA-4604
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: Nicholas DiPiazza
>            Assignee: Nicholas DiPiazza
>            Priority: Major
>
> h2. Overview
> Port the Atlassian fetcher from the external tika-pipes repository as a new 
> Tika plugin. This fetcher enables fetching content from Atlassian products 
> (Confluence, Jira) using JWT authentication.
> h2. Implementation Details
> * Port code from: 
> https://github.com/nddipiazza/tika-pipes/tree/main/tika-pipes-fetchers/tika-fetcher-atlassian-jwt
> * Create new plugin module: 
> *tika-pipes/tika-pipes-plugins/tika-fetcher-atlassian-jwt*
> * Implement as a standard Tika pipes plugin (following plugin architecture)
> * Support JWT authentication for Atlassian products
> * Include appropriate dependencies and configuration
> h2. Features
> * Fetch content from Confluence spaces and pages
> * Fetch content from Jira issues and projects
> * JWT token-based authentication
> * Configurable endpoint URLs
> * Error handling and retry logic
> h2. Acceptance Criteria
> * Atlassian fetcher integrated as a Tika plugin
> * Plugin follows standard Tika plugin architecture
> * Unit tests added for fetcher functionality
> * Documentation updated with usage examples
> * All existing tests pass
> * Plugin can be loaded dynamically by tika-grpc
> h2. Reference
> * External implementation: 
> https://github.com/nddipiazza/tika-pipes/tree/main/tika-pipes-fetchers/tika-fetcher-atlassian-jwt
> * Fetchers README: 
> https://github.com/nddipiazza/tika-pipes/blob/main/tika-pipes-fetchers/README.md



--
This message was sent by Atlassian Jira
(v8.20.10#820010)