[ 
https://issues.apache.org/jira/browse/TIKA-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18047898#comment-18047898
 ] 

Nicholas DiPiazza commented on TIKA-4595:
-----------------------------------------

Implementation complete and PR created: https://github.com/apache/tika/pull/2488

h3. What was implemented:

*Socket Protocol Extensions:*
* Added 8 new commands to the PipesClient.COMMANDS enum:
** SAVE_FETCHER, DELETE_FETCHER, LIST_FETCHERS, GET_FETCHER
** SAVE_EMITTER, DELETE_EMITTER, LIST_EMITTERS, GET_EMITTER
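
For reviewers who want a quick mental model of the eight commands, here is a minimal sketch. It is not the actual PipesClient.COMMANDS enum (which contains other commands as well); the type name ComponentCommand and both helper methods are hypothetical and only illustrate how the commands group into mutating vs. read-only and fetcher vs. emitter.

```java
// Hypothetical stand-in for the eight new constants; not the actual Tika enum.
enum ComponentCommand {
    SAVE_FETCHER, DELETE_FETCHER, LIST_FETCHERS, GET_FETCHER,
    SAVE_EMITTER, DELETE_EMITTER, LIST_EMITTERS, GET_EMITTER;

    /** True for commands that change stored configuration (save or delete). */
    boolean isMutating() {
        return name().startsWith("SAVE_") || name().startsWith("DELETE_");
    }

    /** True for fetcher commands, false for emitter commands. */
    boolean targetsFetcher() {
        return name().contains("FETCHER");
    }
}
```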

*PipesClient Public API:*
* saveFetcher(ExtensionConfig) - Create or update fetcher configuration
* deleteFetcher(String) - Remove a fetcher by ID
* listFetchers() - Get all available fetcher IDs
* getFetcherConfig(String) - Retrieve specific fetcher configuration
* Equivalent methods for emitters: saveEmitter, deleteEmitter, listEmitters, getEmitterConfig

*PipesServer Request Handlers:*
* Implemented handlers for all 8 commands
* Proper error handling with detailed error messages
* Serialization/deserialization of ExtensionConfig objects
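
To illustrate the handler pattern described above, here is a rough sketch of a command dispatcher that returns a descriptive error instead of propagating an exception. This is not the PipesServer code: FetcherRequestHandler, Command, and Reply are hypothetical names, configurations are kept as raw JSON strings rather than ExtensionConfig objects, and the socket framing is left out entirely.

```java
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these types are Tika classes.
final class FetcherRequestHandler {
    enum Command { SAVE_FETCHER, DELETE_FETCHER, LIST_FETCHERS, GET_FETCHER }

    /** Reply sent back to the client: a success flag plus a payload or an error message. */
    static final class Reply {
        final boolean ok;
        final String body;

        Reply(boolean ok, String body) {
            this.ok = ok;
            this.body = body;
        }
    }

    // Fetcher configs keyed by ID; the JSON string stands in for ExtensionConfig.
    private final Map<String, String> configsById = new ConcurrentHashMap<>();

    Reply handle(Command command, String id, String json) {
        try {
            switch (command) {
                case SAVE_FETCHER:
                    requireId(id);
                    if (json == null) {
                        throw new IllegalArgumentException("config must not be null");
                    }
                    configsById.put(id, json); // create or update
                    return new Reply(true, id);
                case DELETE_FETCHER:
                    requireId(id);
                    if (configsById.remove(id) == null) {
                        return new Reply(false, "No fetcher with id '" + id + "'");
                    }
                    return new Reply(true, id);
                case LIST_FETCHERS:
                    // Sorted, comma-separated IDs keep the reply deterministic.
                    return new Reply(true, String.join(",", new TreeSet<>(configsById.keySet())));
                case GET_FETCHER:
                    requireId(id);
                    String found = configsById.get(id);
                    if (found == null) {
                        return new Reply(false, "No fetcher with id '" + id + "'");
                    }
                    return new Reply(true, found);
                default:
                    throw new IllegalArgumentException("Unsupported command: " + command);
            }
        } catch (RuntimeException e) {
            // Report which command failed and why, rather than killing the handler.
            return new Reply(false, command + " failed: " + e.getMessage());
        }
    }

    private static void requireId(String id) {
        if (id == null || id.isEmpty()) {
            throw new IllegalArgumentException("fetcher id must not be empty");
        }
    }
}
```

In this sketch a failed request always produces a Reply with ok set to false and a message naming the command, so the caller can surface the problem without inspecting server logs.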

*Core Infrastructure:*
* Added deleteComponent() to AbstractComponentManager
* Added getComponentConfig() to AbstractComponentManager
* Added wrapper methods to FetcherManager (deleteFetcher, getConfig)
* Added wrapper methods to EmitterManager (deleteEmitter, getConfig)
* Added remove() method to ConfigStore interface
* Updated InMemoryConfigStore and LoggingConfigStore implementations
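
As a rough illustration of what adding remove() to a config store involves, here is a simplified in-memory sketch. It is not the Tika ConfigStore interface: SketchConfigStore and InMemorySketchStore are hypothetical names, the method set is assumed, and the value type is generic so that the example does not depend on ExtensionConfig.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified store; the real ConfigStore interface may differ.
interface SketchConfigStore<V> {
    void put(String id, V config);

    V get(String id);

    Set<String> keySet();

    /** Removes the config for the given ID and returns it, or null if the ID was unknown. */
    V remove(String id);
}

final class InMemorySketchStore<V> implements SketchConfigStore<V> {
    private final Map<String, V> configs = new ConcurrentHashMap<>();

    @Override
    public void put(String id, V config) {
        configs.put(id, config); // overwrites any existing entry for the same ID
    }

    @Override
    public V get(String id) {
        return configs.get(id);
    }

    @Override
    public Set<String> keySet() {
        // Return a snapshot so callers cannot mutate the store through the view.
        return new HashSet<>(configs.keySet());
    }

    @Override
    public V remove(String id) {
        return configs.remove(id);
    }
}
```

Returning the removed value (or null) lets a caller such as a manager's delete method distinguish "deleted" from "was never there" without a separate lookup.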

h3. Usage Example:

{code:java}
PipesClient client = new PipesClient(pipesConfig, tikaConfigPath);

// Create a new S3 fetcher dynamically
ExtensionConfig s3Config = new ExtensionConfig(
    "my-s3-fetcher",        // ID
    "s3-fetcher",           // Plugin type
    "{\"bucket\": \"my-bucket\", \"region\": \"us-east-1\"}"  // JSON config
);
client.saveFetcher(s3Config);

// List all fetchers (static from config + dynamic)
Set<String> fetchers = client.listFetchers();

// Delete when no longer needed
client.deleteFetcher("my-s3-fetcher");
{code}

h3. Testing:
* All existing tests pass
* No breaking changes
* Maintains backward compatibility with static configuration

Ready for review and merge!


> Add dynamic fetcher management API to PipesClient
> -------------------------------------------------
>
>                 Key: TIKA-4595
>                 URL: https://issues.apache.org/jira/browse/TIKA-4595
>             Project: Tika
>          Issue Type: New Feature
>          Components: tika-pipes
>            Reporter: Nicholas DiPiazza
>            Priority: Major
>
> h2. Overview
> Add API to PipesClient for dynamically creating, updating, and deleting 
> fetchers at runtime through PipesServer's ConfigStore.
> h2. Current State
> * PipesServer already has ConfigStore infrastructure
> * FetcherManager and EmitterManager support runtime modifications
> * But PipesClient has no API to expose these capabilities to users
> h2. Desired Architecture
> {noformat}
> PipesClient API
>     ↓
> PipesServer (forked process)
>     ↓
> ConfigStore (memory, Ignite, etc.)
> {noformat}
> h2. Requirements
> # PipesClient provides public API for fetcher CRUD operations
> # All operations are sent to PipesServer via socket protocol  
> # PipesServer handles requests and updates ConfigStore
> # Static fetchers from tika-config.xml/json loaded at startup
> # Dynamic fetchers managed through ConfigStore
> # Both static and dynamic fetchers available for use
> h2. Benefits
> * Users can add/modify fetchers without restarting
> * Supports multi-tenant scenarios with isolated fetcher configs
> * Enables programmatic fetcher configuration
> * Maintains backwards compatibility with static config
> h2. Implementation Tasks
> See linked sub-tasks for detailed implementation steps.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
