This is an automated email from the ASF dual-hosted git repository.
ndipiazza pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git
The following commit(s) were added to refs/heads/main by this push:
new 4f25bb083 TIKA-4583: Add Apache Ignite ConfigStore implementation
(#2470)
4f25bb083 is described below
commit 4f25bb08389a520f23f24bbe546ac58380a16975
Author: Nicholas DiPiazza <[email protected]>
AuthorDate: Fri Dec 26 12:00:09 2025 -0600
TIKA-4583: Add Apache Ignite ConfigStore implementation (#2470)
* TIKA-4583: Add Apache Ignite ConfigStore implementation
- Added init() method to ConfigStore interface for initialization support
- Created new Maven sub-module: tika-ignite-config-store
- Implemented IgniteConfigStore using Apache Ignite distributed cache
- Provides distributed configuration storage for Tika Pipes clustering
- Supports REPLICATED and PARTITIONED cache modes
- Thread-safe implementation with comprehensive error handling
- Added test suite for IgniteConfigStore
- Updated parent pom.xml to include new module
- Added comprehensive README with usage examples
* TIKA-4583: Call init() on ConfigStore in AbstractComponentManager
- Added init() call in AbstractComponentManager constructor
- Ensures ConfigStore is properly initialized before use
- Wraps initialization exception in RuntimeException for clarity
* TIKA-4583: Add configurable ConfigStore support to tika-grpc
- Added configStoreType field to PipesConfig
- Created ConfigStoreFactory to instantiate ConfigStore by type
- Updated TikaGrpcServerImpl to use ConfigStoreFactory
- Added tika-ignite-config-store as optional dependency to tika-grpc
- Created sample configuration showing Ignite usage
- Updated README with distributed configuration documentation
Allows users to toggle between in-memory and Ignite-based distributed
configuration storage by setting configStoreType in tika config:
- 'memory' (default): local in-memory storage
- 'ignite': Apache Ignite distributed cache for clustering
- fully qualified class name: custom ConfigStore implementation
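
The three documented values of configStoreType can be pictured with a minimal sketch. The class name ConfigStoreTypeSketch, the Kind enum, and the rule that a null or blank value falls back to 'memory' are illustrative assumptions, not the ConfigStoreFactory code from this commit:

```java
/** Hypothetical sketch: classifies a configStoreType value; not Tika's actual factory. */
final class ConfigStoreTypeSketch {

    /** How a configured type would be resolved. */
    enum Kind { MEMORY, IGNITE, CUSTOM_CLASS }

    private ConfigStoreTypeSketch() {
    }

    /**
     * Maps the configStoreType string to one of the three documented options.
     * Treating null or blank as the default is an assumption of this sketch.
     */
    static Kind classify(String configStoreType) {
        if (configStoreType == null || configStoreType.isBlank()) {
            return Kind.MEMORY; // 'memory' is the documented default
        }
        switch (configStoreType.trim()) {
            case "memory":
                return Kind.MEMORY;
            case "ignite":
                return Kind.IGNITE;
            default:
                // Anything else is taken to be a fully qualified class name.
                return Kind.CUSTOM_CLASS;
        }
    }
}
```

A caller would branch on the returned Kind to pick the in-memory store, the Ignite store, or a custom class.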
* TIKA-4583: Refactor ConfigStore to use PF4J plugin system
- ConfigStore now extends TikaExtension interface
- ConfigStoreFactory converted to PF4J-based factory interface
- Created IgniteConfigStoreFactory with @Extension annotation
- IgniteConfigStore now loaded via plugin discovery
- Updated InMemoryConfigStore and LoggingConfigStore with
getExtensionConfig()
- TikaGrpcServerImpl now uses plugin manager to load ConfigStore
Benefits:
- Proper plugin architecture following Tika patterns
- ConfigStore implementations auto-discovered via PF4J
- No hard-coded class names or reflection needed
- Consistent with Fetcher/Emitter factory pattern
* TIKA-4583: Implement JSON configuration parsing for IgniteConfigStore
- Created IgniteConfigStoreConfig class following HttpFetcherConfig pattern
- Parses JSON from ExtensionConfig to configure cache settings
- Added configStoreParams field to PipesConfig
- Updated TikaGrpcServerImpl to pass params to factory
- Removed TODO comment - configuration now fully implemented
Configuration options supported:
- cacheName: Name of the Ignite cache
- cacheMode: REPLICATED or PARTITIONED
- igniteInstanceName: Name of Ignite instance
- autoClose: Whether to auto-close on shutdown
Example configuration:
{
"pipes": {
"configStoreType": "ignite",
"configStoreParams": {
"cacheName": "my-cache",
"cacheMode": "REPLICATED"
}
}
}
* TIKA-4583: Update documentation with JSON configuration examples
- Added JSON configuration examples to README
- Documented all configStoreParams options
- Clarified difference between JSON and Java API usage
- Shows complete example with cache mode and instance name
* TIKA-4583: Add comprehensive Kubernetes deployment documentation
Added detailed guide for deploying tika-grpc with Ignite clustering on
Kubernetes:
- Ignite XML configuration with Kubernetes IP finder
- Complete RBAC setup (ServiceAccount, Role, RoleBinding)
- Headless service for pod discovery
- LoadBalancer service for external access
- StatefulSet with proper health checks and resource limits
- ConfigMap for Tika configuration
- Dockerfile example with Ignite plugin
- Troubleshooting guide for common issues
- Network policy considerations
- Pod anti-affinity recommendations
- Monitoring and verification steps
The guide ensures graceful pod-to-pod communication using:
- TcpDiscoveryKubernetesIpFinder for discovery
- Headless service for stable network identities
- Proper RBAC permissions for pod discovery
- StatefulSet for stable pod names and ordering
* TIKA-4583: Add PF4J plugin.properties and Plugin class
- Created IgniteConfigStorePlugin extending org.pf4j.Plugin
- Added plugin.properties with plugin metadata
- Added pf4j dependency to pom.xml
- Plugin now properly discoverable by PF4J plugin manager
Plugin metadata:
- plugin.id: tika-ignite-config-store-plugin
- plugin.class: org.apache.tika.pipes.plugin.ignite.IgniteConfigStorePlugin
- plugin.version: 4.0.0-SNAPSHOT
- plugin.provider: Apache Tika
This enables proper plugin discovery and lifecycle management
through the PF4J framework, consistent with other Tika plugins.
* TIKA-4583: Remove unnecessary Plugin class - use PF4J default
The plugin.class property is optional in PF4J. When not specified,
PF4J uses org.pf4j.Plugin as a default wrapper.
Since we don't need custom plugin lifecycle logic (start/stop/delete),
we can simplify by removing IgniteConfigStorePlugin and only keeping:
- plugin.properties (required for plugin metadata)
- @Extension annotation on IgniteConfigStoreFactory (for discovery)
This is cleaner and reduces boilerplate code while maintaining
full functionality.
* TIKA-4583: Introduce Ignite-based ConfigStore with JSON configuration
support
* TIKA-4587: Add pf4j development mode support to TikaPluginManager
Enables plugin development without packaging as ZIP files by supporting
pf4j's DEVELOPMENT runtime mode. This allows developers to point
plugin-roots directly at unpackaged plugin directories (e.g.,
target/classes)
for faster iteration during development.
Features:
- Configure via system property: -Dtika.plugin.dev.mode=true
- Configure via environment variable: TIKA_PLUGIN_DEV_MODE=true
- Skips ZIP extraction when in development mode
- Logs mode on startup for visibility
- Defaults to DEPLOYMENT mode for backward compatibility
This aligns with pf4j best practices documented at:
https://pf4j.org/doc/development-mode.html
JIRA: https://issues.apache.org/jira/browse/TIKA-4587
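
A minimal sketch of how the two switches could be resolved into a pf4j runtime mode. The property and variable names come from this commit and "development"/"deployment" are the values pf4j accepts for pf4j.mode. The class name DevModeSketch and the precedence (system property before environment variable) are assumptions, not the TikaPluginManager code:

```java
/** Hypothetical sketch: resolves the dev-mode switches; not Tika's actual TikaPluginManager. */
final class DevModeSketch {

    static final String DEV_MODE_PROPERTY = "tika.plugin.dev.mode";
    static final String DEV_MODE_ENV = "TIKA_PLUGIN_DEV_MODE";

    private DevModeSketch() {
    }

    /**
     * Pure function so the rule can be checked without touching the real environment.
     * Assumption: an explicitly set system property wins over the environment variable.
     */
    static boolean isDevMode(String systemProperty, String envVariable) {
        if (systemProperty != null) {
            return Boolean.parseBoolean(systemProperty.trim());
        }
        return envVariable != null && Boolean.parseBoolean(envVariable.trim());
    }

    /** Value to place in the pf4j.mode system property. */
    static String pf4jMode(String systemProperty, String envVariable) {
        return isDevMode(systemProperty, envVariable) ? "development" : "deployment";
    }

    /** Reads the real process settings; defaults to deployment when neither is set. */
    static String resolveFromProcess() {
        return pf4jMode(System.getProperty(DEV_MODE_PROPERTY), System.getenv(DEV_MODE_ENV));
    }
}
```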
* TIKA-4587: Add development mode documentation to tika-grpc README
Adds comprehensive guide on using pf4j development mode for plugin
development:
- How to enable development mode (system property or env var)
- Configuration examples with plugin-roots pointing to target/classes
- Complete development workflow
- Multiple plugins example
- Troubleshooting tips
- Switching between dev and production modes
* TIKA-4587: Add IntelliJ IDEA setup guide for loading all plugins in dev
mode
Adds comprehensive IntelliJ configuration example:
- Complete dev-config.json with ALL plugin class directories
- Step-by-step IntelliJ Run Configuration setup
- VM options and environment variable configuration
- Hot reload workflow for fast iteration
- Shell script to auto-generate config with all plugins
* TIKA-4587: Clarify that plugin-roots is in Tika config JSON
Added explicit note that plugin-roots is configured in the Tika
configuration JSON file (e.g., tika-config.json, dev-config.json)
* Add Distributed Configuration with Apache Ignite section to tika-grpc
README
Restored documentation about:
- Ignite-based distributed configuration storage
- ConfigStore types (memory, ignite, custom)
- Maven dependency for tika-ignite-config-store
- Running with Ignite
- Cluster behavior and configuration sharing
* TIKA-4587: Fix development mode to prevent scanning subdirectories
Override createPluginRepository() to prevent pf4j from scanning
subdirectories
in development mode. In dev mode, each path in plugin-roots should be
treated
as a complete plugin directory (target/classes), not as a container of
plugins.
This fixes the error:
'No PluginDescriptorFinder for plugin .../target/classes/org'
The default DevelopmentPluginRepository scans for subdirectories, but we
want
each configured path to be the plugin root itself.
* TIKA-4587: Fix ignite plugin compilation and improve dev mode
- Fixed ignite plugin compilation by ensuring tika-pipes-core is built first
- Set pf4j.mode system property in TikaPluginManager.load() to properly
enable pf4j's DEVELOPMENT mode
- Override createPluginDescriptorFinder() to use
PropertiesPluginDescriptorFinder
in development mode (looks for plugin.properties instead of MANIFEST.MF)
- Add Maven dev profile in tika-grpc for easy development server startup
- Add dev-config.json with all 13 plugin directories
Maven dev profile usage:
cd tika-pipes/tika-pipes-plugins && mvn compile
cd ../../tika-grpc && mvn compile exec:java -Pdev
Successfully loads plugins from target/classes in development mode!
* TIKA-4587: Add development mode support for tika-grpc with plugin
hot-reloading
- Add dev profile to tika-grpc/pom.xml with plugin dependencies
- Include all plugin modules in dev profile to make classes available at
runtime
- Set tika.plugin.dev.mode=true system property to enable PF4J development
mode
- Add run-dev.sh convenience script for starting server in dev mode
- Rename test-config.json to test-dev-config.json for clarity
- Allows running tika-grpc with plugins loaded from target/classes
directories
- Enables rapid development without building plugin ZIPs
* Fix README: Remove invalid ${project.basedir} examples and use correct
relative paths
- Remove duplicate/confusing sections with Maven property placeholders
- Update examples to use relative paths (../tika-pipes/...) that actually
work
- Update run-dev.sh to reference correct config file (dev-tika-config.json)
- Simplify documentation and remove redundant content
- Add note that paths are relative to tika-grpc directory
* Expand IntelliJ development guide in README
- Add detailed step-by-step IntelliJ IDEA setup instructions
- Document complete development workflow for making plugin changes
- Explain why server restart is required (PF4J loads plugins at startup)
- Add tips for faster development (build shortcuts, debug mode, keyboard
shortcuts)
- Include common issues and solutions section
- Add alternative Maven exec approach
- Clarify fast iteration cycle: ~20-25 seconds vs 2-3 minutes for ZIP
packaging
* Update TikaGrpcServer run configuration to use development config file
* Remove commented code for properties-based plugin descriptor finder in
development mode
* Remove unused path for Ignite plugin in development configuration, not
yet merged
* Fix TikaPluginManager to configure PF4J mode in constructor
- Move configurePf4jRuntimeMode() call to happen before super() via helper
method
- Ensures pf4j.mode system property is set based on tika.plugin.dev.mode
- Explicitly set deployment mode when dev mode is false to ensure clean
state
- Fixes testDevelopmentModeViaSystemProperty test failure
- Allows direct constructor usage (not just via load() methods) to work
correctly
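
The "before super() via helper method" idiom can be sketched as follows. Java evaluates the arguments of a super(...) call before the superclass constructor body runs, so a static helper used as an argument can set the system property first. BaseManager and DevAwareManager are hypothetical stand-ins, not pf4j or Tika classes:

```java
/** Hypothetical sketch of running configuration code before a superclass constructor. */
final class ModeBeforeSuperSketch {

    private ModeBeforeSuperSketch() {
    }

    /** Stands in for a superclass that reads pf4j.mode while it is being constructed. */
    static class BaseManager {
        final String modeSeenAtConstruction;

        BaseManager(String requestedMode) {
            // The property is already set, because the helper ran while
            // the constructor argument was being evaluated.
            this.modeSeenAtConstruction = System.getProperty("pf4j.mode");
        }
    }

    /** Subclass that must configure the mode before the superclass looks at it. */
    static class DevAwareManager extends BaseManager {

        DevAwareManager(boolean devMode) {
            super(configureMode(devMode));
        }

        /** Must be static: instance members are not usable before super() returns. */
        private static String configureMode(boolean devMode) {
            String mode = devMode ? "development" : "deployment";
            // Deployment is set explicitly too, so a stale value cannot leak through.
            System.setProperty("pf4j.mode", mode);
            return mode;
        }
    }
}
```

Setting the deployment value explicitly mirrors the "clean state" point above: a previous dev-mode run in the same JVM cannot affect a later deployment-mode instance.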
* TIKA-4583 - Add Ignite configuration and dependencies for Tika pipes
* TIKA-4583 - Add Maven options for module accessibility in development mode
* TIKA-4583 - add .gitignore entries for Ignite working directories
* TIKA-4583 - clean up code comments by extracting them into well-named methods
* TIKA-4583: Add Docker and Kubernetes deployment documentation
- Add Docker Hub image instructions for apache/tika-grpc
- Add simple Kubernetes Deployment example
- Add StatefulSet deployment with Apache Ignite ConfigStore
- Document Ignite cluster configuration for pod-to-pod communication
- Add RBAC setup for Kubernetes API discovery
- Include HPA, monitoring, and troubleshooting sections
- Document required ports and environment variables for Ignite
* TIKA-4583: Apply Copilot review suggestions
1. Remove H2 version override - use parent POM version (2.4.240)
2. Remove SCM tag from ignite pom.xml
3. Simplify lambda using method reference in IgniteConfigStore.keySet()
4. Add closed flag to prevent reinitialization after close()
5. Fix tika-config-ignite.json to use string format for configStoreParams
6. Add validation for invalid cache modes with proper error messages
7. Fix ClassCastException in ConfigStoreFactory - try constructor with
ExtensionConfig first
Co-authored-by: copilot-pull-request-reviewer
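
Items 4 and 6 can be illustrated with a minimal sketch. The class IgniteStoreGuardsSketch, its Mode enum, the case-insensitive parsing, and the exact exception types and messages are assumptions for illustration, not the IgniteConfigStore code from this commit:

```java
import java.util.Locale;

/** Hypothetical sketch of a closed flag and cache-mode validation. */
final class IgniteStoreGuardsSketch {

    /** The two cache modes named in this commit. */
    enum Mode { REPLICATED, PARTITIONED }

    private boolean initialized;
    private boolean closed;

    /** Rejects unknown cache modes with a message that names the valid values. */
    static Mode parseCacheMode(String raw) {
        if (raw == null || raw.isBlank()) {
            throw new IllegalArgumentException(
                    "cacheMode is empty; expected REPLICATED or PARTITIONED");
        }
        try {
            // Accepting lower-case input is an assumption of this sketch.
            return Mode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                    "Invalid cacheMode '" + raw + "'; expected REPLICATED or PARTITIONED", e);
        }
    }

    /** Refuses to start again once close() has been called. */
    synchronized void init() {
        if (closed) {
            throw new IllegalStateException("Store was closed and cannot be reinitialized");
        }
        initialized = true;
    }

    synchronized void close() {
        closed = true;
        initialized = false;
    }

    synchronized boolean isInitialized() {
        return initialized;
    }
}
```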
* TIKA-4583: Restore H2 1.4.197 version for Ignite compatibility
Apache Ignite 2.16.0 requires H2 1.4.x and is not compatible with H2 2.x.
The newer H2 2.4.240 removed classes like org.h2.value.ValueByte that
Ignite depends on, causing NoClassDefFoundError.
This reverts the H2 version override removal from the previous commit,
with a comment explaining why the override is necessary.
* TIKA-4583: Add README with H2 security disclaimer
Document the H2 1.4.197 dependency requirement and associated security
considerations. Explain why the older version is necessary (Ignite
compatibility) and provide risk mitigation context and alternatives
for security-sensitive environments.
---
.gitignore | 5 +-
tika-grpc/README.md | 380 +++++++++++++++++++++
tika-grpc/dev-tika-config.json | 5 +
tika-grpc/pom.xml | 30 +-
tika-grpc/run-dev.sh | 6 +
.../apache/tika/pipes/grpc/TikaGrpcServerImpl.java | 18 +-
.../src/test/resources/tika-config-ignite.json | 24 ++
tika-parent/pom.xml | 18 +-
.../tika/pipes/core/AbstractComponentManager.java | 5 +
.../org/apache/tika/pipes/core/PipesConfig.java | 28 ++
.../apache/tika/pipes/core/config/ConfigStore.java | 14 +-
.../tika/pipes/core/config/ConfigStoreFactory.java | 131 +++++++
.../pipes/core/config/InMemoryConfigStore.java | 10 +
.../tika/pipes/core/config/LoggingConfigStore.java | 6 +
tika-pipes/tika-pipes-plugins/pom.xml | 1 +
.../tika-pipes-plugins/tika-pipes-ignite/README.md | 68 ++++
.../tika-pipes-plugins/tika-pipes-ignite/pom.xml | 184 ++++++++++
.../src/main/assembly/assembly.xml | 55 +++
.../tika/pipes/ignite/ExtensionConfigDTO.java | 77 +++++
.../tika/pipes/ignite/IgniteConfigStore.java | 183 ++++++++++
.../pipes/ignite/IgniteConfigStoreFactory.java} | 48 ++-
.../ignite/config/IgniteConfigStoreConfig.java | 111 ++++++
.../pipes/plugin/ignite/IgnitePipesPlugin.java | 48 +++
.../src/main/resources/plugin.properties | 22 ++
.../tika/pipes/ignite/IgniteConfigStoreTest.java | 207 +++++++++++
25 files changed, 1645 insertions(+), 39 deletions(-)
diff --git a/.gitignore b/.gitignore
index 011a1f3a0..9b651f244 100644
--- a/.gitignore
+++ b/.gitignore
@@ -15,4 +15,7 @@ nb-configuration.xml
*.DS_Store
*.tmp-inception
*.snap
-.*.swp
\ No newline at end of file
+.*.swp
+
+tika-pipes/tika-pipes-plugins/tika-pipes-ignite/ignite
+tika-grpc/ignite
diff --git a/tika-grpc/README.md b/tika-grpc/README.md
index a4ab950e3..6ffa865bc 100644
--- a/tika-grpc/README.md
+++ b/tika-grpc/README.md
@@ -345,3 +345,383 @@ For production deployments, use packaged ZIP files:
- [pf4j Development Mode
Documentation](https://pf4j.org/doc/development-mode.html)
- [JIRA TIKA-4587](https://issues.apache.org/jira/browse/TIKA-4587) -
Development mode implementation
+
+## Docker and Kubernetes Deployment
+
+### Docker Image
+
+The official Tika gRPC Docker images are published to Docker Hub at
`apache/tika-grpc`.
+
+**Pull the latest image:**
+```bash
+docker pull apache/tika-grpc:latest
+```
+
+**Run with default configuration:**
+```bash
+docker run -p 50052:50052 apache/tika-grpc:latest
+```
+
+**Run with custom configuration:**
+```bash
+docker run -p 50052:50052 \
+ -v $(pwd)/my-config.json:/config/tika-config.json \
+ apache/tika-grpc:latest --config /config/tika-config.json
+```
+
+### Kubernetes Deployment
+
+#### Simple Deployment
+
+For simple deployments without distributed configuration:
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: tika-grpc
+spec:
+  replicas: 3
+  selector:
+    matchLabels:
+      app: tika-grpc
+  template:
+    metadata:
+      labels:
+        app: tika-grpc
+    spec:
+      containers:
+      - name: tika-grpc
+        image: apache/tika-grpc:latest
+        ports:
+        - containerPort: 50052
+          name: grpc
+        resources:
+          requests:
+            memory: "2Gi"
+            cpu: "1000m"
+          limits:
+            memory: "4Gi"
+            cpu: "2000m"
+        volumeMounts:
+        - name: config
+          mountPath: /config
+      volumes:
+      - name: config
+        configMap:
+          name: tika-grpc-config
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: tika-grpc
+spec:
+  selector:
+    app: tika-grpc
+  ports:
+  - port: 50052
+    targetPort: 50052
+    name: grpc
+  type: ClusterIP
+```
+
+#### Deployment with Apache Ignite ConfigStore
+
+For distributed deployments with shared configuration using Apache Ignite:
+
+**1. Create ConfigMap with Tika configuration:**
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: tika-grpc-config
+data:
+  tika-config.json: |
+    {
+      "pipes": {
+        "configStoreType": "ignite",
+        "configStoreParams": "{\"cacheName\":\"tika-config-cache\",\"cacheMode\":\"REPLICATED\",\"igniteInstanceName\":\"TikaCluster\"}"
+      },
+      "fetchers": [
+        {
+          "s3": {
+            "myS3Fetcher": {
+              "region": "us-east-1",
+              "bucket": "my-bucket"
+            }
+          }
+        }
+      ],
+      "emitters": [
+        {
+          "s3": {
+            "myS3Emitter": {
+              "region": "us-east-1",
+              "bucket": "my-output-bucket"
+            }
+          }
+        }
+      ]
+    }
+```
+
+**2. Create StatefulSet for Ignite cluster discovery:**
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: tika-grpc-ignite
+  labels:
+    app: tika-grpc
+spec:
+  clusterIP: None # Headless service for Ignite discovery
+  selector:
+    app: tika-grpc
+  ports:
+  - port: 47100
+    name: ignite-comm
+  - port: 47500
+    name: ignite-disco
+  - port: 50052
+    name: grpc
+---
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: tika-grpc
+spec:
+  serviceName: tika-grpc-ignite
+  replicas: 3
+  selector:
+    matchLabels:
+      app: tika-grpc
+  template:
+    metadata:
+      labels:
+        app: tika-grpc
+    spec:
+      containers:
+      - name: tika-grpc
+        image: apache/tika-grpc:latest
+        ports:
+        - containerPort: 50052
+          name: grpc
+        - containerPort: 47100
+          name: ignite-comm
+        - containerPort: 47500
+          name: ignite-disco
+        env:
+        - name: JAVA_OPTS
+          value: >-
+            -Xmx2g
+            -Xms2g
+            --add-opens=java.base/java.nio=ALL-UNNAMED
+            --add-opens=java.base/sun.nio.ch=ALL-UNNAMED
+            --add-opens=java.base/java.lang=ALL-UNNAMED
+            --add-opens=java.base/java.util=ALL-UNNAMED
+            --add-opens=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED
+        - name: IGNITE_KUBERNETES_NAMESPACE
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.namespace
+        - name: IGNITE_KUBERNETES_SERVICE_NAME
+          value: "tika-grpc-ignite"
+        resources:
+          requests:
+            memory: "2Gi"
+            cpu: "1000m"
+          limits:
+            memory: "4Gi"
+            cpu: "2000m"
+        volumeMounts:
+        - name: config
+          mountPath: /config
+        readinessProbe:
+          exec:
+            command:
+            - grpc_health_probe
+            - -addr=:50052
+          initialDelaySeconds: 10
+          periodSeconds: 5
+        livenessProbe:
+          exec:
+            command:
+            - grpc_health_probe
+            - -addr=:50052
+          initialDelaySeconds: 30
+          periodSeconds: 10
+      volumes:
+      - name: config
+        configMap:
+          name: tika-grpc-config
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: tika-grpc
+spec:
+  selector:
+    app: tika-grpc
+  ports:
+  - port: 50052
+    targetPort: 50052
+    name: grpc
+  type: LoadBalancer # Or ClusterIP for internal-only access
+```
+
+**3. RBAC permissions for Kubernetes API access (Ignite discovery):**
+
+```yaml
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: tika-grpc
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: tika-grpc
+rules:
+- apiGroups: [""]
+  resources: ["pods"]
+  verbs: ["get", "list"]
+- apiGroups: [""]
+  resources: ["endpoints"]
+  verbs: ["get", "list"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: tika-grpc
+subjects:
+- kind: ServiceAccount
+  name: tika-grpc
+roleRef:
+  kind: Role
+  name: tika-grpc
+  apiGroup: rbac.authorization.k8s.io
+```
+
+Then update the StatefulSet to use the service account:
+
+```yaml
+spec:
+  template:
+    spec:
+      serviceAccountName: tika-grpc
+      containers:
+      # ... rest of container spec
+```
+
+### Ignite Cluster Configuration
+
+When using the Apache Ignite ConfigStore in Kubernetes, the Ignite nodes
automatically discover each other using the Kubernetes API. Here's how it works:
+
+1. **Headless Service** (`clusterIP: None`) allows each pod to have its own
DNS entry
+2. **StatefulSet** ensures predictable pod names: `tika-grpc-0`,
`tika-grpc-1`, `tika-grpc-2`
+3. **Environment Variables** tell Ignite how to discover other nodes:
+ - `IGNITE_KUBERNETES_NAMESPACE`: Current namespace
+ - `IGNITE_KUBERNETES_SERVICE_NAME`: Service name for discovery
+4. **RBAC** grants permission to query the Kubernetes API for pod discovery
+
+**Key Ports:**
+- **50052**: gRPC API for Tika operations
+- **47100**: Ignite communication port
+- **47500**: Ignite discovery port
+
+**Benefits of Ignite ConfigStore in Kubernetes:**
+- **Shared Configuration**: All pods share fetcher/emitter configurations
+- **Configuration Updates**: Update config on one pod, all pods see the change
+- **High Availability**: Config survives pod restarts
+- **Automatic Discovery**: Pods automatically find each other
+
+### Horizontal Pod Autoscaling
+
+Scale based on CPU/memory or custom metrics:
+
+```yaml
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: tika-grpc
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: StatefulSet
+    name: tika-grpc
+  minReplicas: 3
+  maxReplicas: 10
+  metrics:
+  - type: Resource
+    resource:
+      name: cpu
+      target:
+        type: Utilization
+        averageUtilization: 70
+  - type: Resource
+    resource:
+      name: memory
+      target:
+        type: Utilization
+        averageUtilization: 80
+```
+
+### Monitoring and Observability
+
+**Prometheus metrics endpoint:**
+
+Add to the StatefulSet's pod template metadata:
+```yaml
+annotations:
+  prometheus.io/scrape: "true"
+  prometheus.io/port: "8081"
+  prometheus.io/path: "/metrics"
+```
+
+**Logging:**
+
+Configure structured JSON logging for better observability:
+```yaml
+env:
+- name: LOG_LEVEL
+  value: "INFO"
+- name: LOG_FORMAT
+  value: "json"
+```
+
+### Best Practices
+
+1. **Resource Limits**: Always set memory/CPU requests and limits
+2. **Readiness/Liveness Probes**: Use gRPC health checks
+3. **Pod Disruption Budget**: Ensure high availability during updates
+4. **Network Policies**: Restrict access to gRPC and Ignite ports
+5. **Persistent Storage**: For file-based fetchers/emitters, use
PersistentVolumeClaims
+6. **Configuration Management**: Use ConfigMaps for configuration and Secrets for sensitive data
+7. **Ignite Memory**: Allocate sufficient heap for Ignite cache (typically
2-4GB)
+
+### Troubleshooting
+
+**Pods can't discover each other:**
+- Check RBAC permissions
+- Verify that the headless service exists
+- Check that the environment variables are set
+- Review Ignite logs: `kubectl logs tika-grpc-0 | grep Ignite`
+
+**Out of Memory errors:**
+- Increase JVM heap size via `JAVA_OPTS`
+- Increase pod memory limits
+- Adjust Ignite data region sizes
+
+**Slow gRPC responses:**
+- Scale horizontally with HPA
+- Check resource utilization
+- Review parser performance
+- Consider adding a caching layer
+
+### References
+
+- [Apache Ignite Kubernetes
Documentation](https://ignite.apache.org/docs/latest/installation/kubernetes/amazon-eks-deployment)
+- [Kubernetes
StatefulSets](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/)
+- [gRPC Health Checking
Protocol](https://github.com/grpc/grpc/blob/master/doc/health-checking.md)
diff --git a/tika-grpc/dev-tika-config.json b/tika-grpc/dev-tika-config.json
index 107d7cb0f..6261687e1 100644
--- a/tika-grpc/dev-tika-config.json
+++ b/tika-grpc/dev-tika-config.json
@@ -5,6 +5,7 @@
"../tika-pipes/tika-pipes-plugins/tika-pipes-file-system/target/classes",
"../tika-pipes/tika-pipes-plugins/tika-pipes-gcs/target/classes",
"../tika-pipes/tika-pipes-plugins/tika-pipes-http/target/classes",
+ "../tika-pipes/tika-pipes-plugins/tika-pipes-ignite/target/classes",
"../tika-pipes/tika-pipes-plugins/tika-pipes-jdbc/target/classes",
"../tika-pipes/tika-pipes-plugins/tika-pipes-json/target/classes",
"../tika-pipes/tika-pipes-plugins/tika-pipes-kafka/target/classes",
@@ -13,6 +14,10 @@
"../tika-pipes/tika-pipes-plugins/tika-pipes-s3/target/classes",
"../tika-pipes/tika-pipes-plugins/tika-pipes-solr/target/classes"
],
+ "pipes": {
+ "configStoreType": "ignite",
+ "configStoreParams":
"{\"cacheName\":\"tika-config-cache\",\"cacheMode\":\"REPLICATED\",\"igniteInstanceName\":\"TikaGrpcIgnite\",\"autoClose\":true}"
+ },
"fetchers": [
{
"fs": {
diff --git a/tika-grpc/pom.xml b/tika-grpc/pom.xml
index d487c6acb..a30fb0c3f 100644
--- a/tika-grpc/pom.xml
+++ b/tika-grpc/pom.xml
@@ -228,6 +228,12 @@
<artifactId>tika-pipes-file-system</artifactId>
<version>${project.version}</version>
</dependency>
+ <dependency>
+ <groupId>org.apache.tika</groupId>
+ <artifactId>tika-pipes-ignite</artifactId>
+ <version>${project.version}</version>
+ <optional>true</optional>
+ </dependency>
<dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-pipes-file-system</artifactId>
@@ -483,7 +489,7 @@
</plugin>
</plugins>
</build>
-
+
<profiles>
<!-- Development profile for running tika-grpc with plugin development
mode -->
<profile>
@@ -530,6 +536,21 @@
<artifactId>tika-pipes-ignite</artifactId>
<version>${project.version}</version>
</dependency>
+ <dependency>
+ <groupId>org.apache.ignite</groupId>
+ <artifactId>ignite-core</artifactId>
+ <version>2.16.0</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.ignite</groupId>
+ <artifactId>ignite-indexing</artifactId>
+ <version>2.16.0</version>
+ </dependency>
+ <dependency>
+ <groupId>com.h2database</groupId>
+ <artifactId>h2</artifactId>
+ <version>1.4.197</version>
+ </dependency>
<dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-pipes-solr</artifactId>
@@ -569,6 +590,13 @@
<argument>--config</argument>
<argument>${config.file}</argument>
</arguments>
+ <additionalJvmArgs>
+
<additionalJvmArg>--add-opens=java.base/java.nio=ALL-UNNAMED</additionalJvmArg>
+
<additionalJvmArg>--add-opens=java.base/sun.nio.ch=ALL-UNNAMED</additionalJvmArg>
+
<additionalJvmArg>--add-opens=java.base/java.lang=ALL-UNNAMED</additionalJvmArg>
+
<additionalJvmArg>--add-opens=java.base/java.util=ALL-UNNAMED</additionalJvmArg>
+
<additionalJvmArg>--add-opens=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED</additionalJvmArg>
+ </additionalJvmArgs>
</configuration>
</plugin>
</plugins>
diff --git a/tika-grpc/run-dev.sh b/tika-grpc/run-dev.sh
index 604e70029..94929ee58 100755
--- a/tika-grpc/run-dev.sh
+++ b/tika-grpc/run-dev.sh
@@ -17,4 +17,10 @@ echo "Plugins will be loaded from target/classes directories"
echo "Press Ctrl+C to stop the server"
echo ""
+export MAVEN_OPTS="--add-opens=java.base/java.nio=ALL-UNNAMED \
+--add-opens=java.base/sun.nio.ch=ALL-UNNAMED \
+--add-opens=java.base/java.lang=ALL-UNNAMED \
+--add-opens=java.base/java.util=ALL-UNNAMED \
+--add-opens=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED"
+
mvn exec:java -Pdev -Dconfig.file="$CONFIG_FILE"
diff --git
a/tika-grpc/src/main/java/org/apache/tika/pipes/grpc/TikaGrpcServerImpl.java
b/tika-grpc/src/main/java/org/apache/tika/pipes/grpc/TikaGrpcServerImpl.java
index c7c3dd9bf..de318e3fb 100644
--- a/tika-grpc/src/main/java/org/apache/tika/pipes/grpc/TikaGrpcServerImpl.java
+++ b/tika-grpc/src/main/java/org/apache/tika/pipes/grpc/TikaGrpcServerImpl.java
@@ -63,6 +63,8 @@ import org.apache.tika.pipes.api.fetcher.FetchKey;
import org.apache.tika.pipes.api.fetcher.Fetcher;
import org.apache.tika.pipes.core.PipesClient;
import org.apache.tika.pipes.core.PipesConfig;
+import org.apache.tika.pipes.core.config.ConfigStore;
+import org.apache.tika.pipes.core.config.ConfigStoreFactory;
import org.apache.tika.pipes.core.fetcher.FetcherManager;
import org.apache.tika.plugins.ExtensionConfig;
import org.apache.tika.plugins.TikaPluginManager;
@@ -108,7 +110,21 @@ class TikaGrpcServerImpl extends TikaGrpc.TikaImplBase {
pluginManager = new org.pf4j.DefaultPluginManager();
}
- fetcherManager = FetcherManager.load(pluginManager, tikaJsonConfig,
true);
+ ConfigStore configStore = createConfigStore();
+
+ fetcherManager = FetcherManager.load(pluginManager, tikaJsonConfig,
true, configStore);
+ }
+
+ private ConfigStore createConfigStore() throws TikaConfigException {
+ String configStoreType = pipesConfig.getConfigStoreType();
+ String configStoreParams = pipesConfig.getConfigStoreParams();
+ ExtensionConfig storeConfig = new ExtensionConfig(
+ configStoreType, configStoreType, configStoreParams);
+
+ return ConfigStoreFactory.createConfigStore(
+ pluginManager,
+ configStoreType,
+ storeConfig);
}
@Override
diff --git a/tika-grpc/src/test/resources/tika-config-ignite.json
b/tika-grpc/src/test/resources/tika-config-ignite.json
new file mode 100644
index 000000000..7f813c728
--- /dev/null
+++ b/tika-grpc/src/test/resources/tika-config-ignite.json
@@ -0,0 +1,24 @@
+{
+ "pipes": {
+ "configStoreType": "ignite",
+ "configStoreParams":
"{\"cacheName\":\"my-tika-cache\",\"cacheMode\":\"REPLICATED\",\"igniteInstanceName\":\"MyTikaCluster\",\"autoClose\":true}"
+ },
+ "fetchers": [
+ {
+ "id": "fs",
+ "name": "file-system",
+ "params": {
+ "basePath": "/tmp/input"
+ }
+ }
+ ],
+ "emitters": [
+ {
+ "id": "fs",
+ "name": "file-system",
+ "params": {
+ "basePath": "/tmp/output"
+ }
+ }
+ ]
+}
diff --git a/tika-parent/pom.xml b/tika-parent/pom.xml
index 4868e3b05..8f72140f4 100644
--- a/tika-parent/pom.xml
+++ b/tika-parent/pom.xml
@@ -1491,19 +1491,19 @@
<version>${puppycrawl.version}</version>
</dependency>
</dependencies>
+ <configuration>
+
<configLocation>${maven.multiModuleProjectDirectory}/tika-parent/checkstyle.xml</configLocation>
+ <consoleOutput>false</consoleOutput>
+ <includeTestSourceDirectory>true</includeTestSourceDirectory>
+
<testSourceDirectories>${project.basedir}/src/test/java</testSourceDirectories>
+ <violationSeverity>warn</violationSeverity>
+ <failOnViolation>true</failOnViolation>
+ <failsOnError>true</failsOnError>
+ </configuration>
<executions>
<execution>
<id>validate</id>
<phase>process-classes</phase>
- <configuration>
- <configLocation>checkstyle.xml</configLocation>
- <consoleOutput>false</consoleOutput>
- <includeTestSourceDirectory>true</includeTestSourceDirectory>
-
<testSourceDirectories>${project.basedir}/src/test/java</testSourceDirectories>
- <violationSeverity>warn</violationSeverity>
- <failOnViolation>true</failOnViolation>
- <failsOnError>true</failsOnError>
- </configuration>
<goals>
<goal>check</goal>
</goals>
diff --git
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/AbstractComponentManager.java
b/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/AbstractComponentManager.java
index 4bfe20383..859d8dbb2 100644
---
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/AbstractComponentManager.java
+++
b/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/AbstractComponentManager.java
@@ -68,6 +68,11 @@ public abstract class AbstractComponentManager<T extends
TikaExtension,
ConfigStore configStore) {
this.pluginManager = pluginManager;
this.configStore = configStore;
+ try {
+ configStore.init();
+ } catch (Exception e) {
+ throw new RuntimeException("Failed to initialize ConfigStore", e);
+ }
componentConfigs.forEach(configStore::put);
this.allowRuntimeModifications = allowRuntimeModifications;
}
diff --git
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/PipesConfig.java
b/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/PipesConfig.java
index c70f16b1d..8daae166f 100644
---
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/PipesConfig.java
+++
b/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/PipesConfig.java
@@ -87,6 +87,18 @@ public class PipesConfig {
private ArrayList<String> forkedJvmArgs = new ArrayList<>();
private String javaPath = "java";
+
+ /**
+ * Type of ConfigStore to use for distributed state management.
+ * Options: "memory" (default), "ignite"
+ */
+ private String configStoreType = "memory";
+
+ /**
+ * JSON configuration parameters for the ConfigStore.
+ * The structure depends on the configStoreType selected.
+ */
+ private String configStoreParams = "{}";
/**
* Loads PipesConfig from the "pipes" section of the JSON configuration.
@@ -348,4 +360,20 @@ public class PipesConfig {
public void setStopOnlyOnFatal(boolean stopOnlyOnFatal) {
this.stopOnlyOnFatal = stopOnlyOnFatal;
}
+
+ public String getConfigStoreType() {
+ return configStoreType;
+ }
+
+ public void setConfigStoreType(String configStoreType) {
+ this.configStoreType = configStoreType;
+ }
+
+ public String getConfigStoreParams() {
+ return configStoreParams;
+ }
+
+ public void setConfigStoreParams(String configStoreParams) {
+ this.configStoreParams = configStoreParams;
+ }
}
diff --git
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/ConfigStore.java
b/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/ConfigStore.java
index 2f6c4c164..73d73ff7e 100644
---
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/ConfigStore.java
+++
b/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/ConfigStore.java
@@ -19,6 +19,7 @@ package org.apache.tika.pipes.core.config;
import java.util.Set;
import org.apache.tika.plugins.ExtensionConfig;
+import org.apache.tika.plugins.TikaExtension;
/**
* Interface for storing and retrieving component configurations.
@@ -32,7 +33,18 @@ import org.apache.tika.plugins.ExtensionConfig;
* <b>Performance considerations:</b> The {@link #keySet()} method should be an inexpensive operation
* as it may be called in error message generation and other scenarios where performance matters.
*/
-public interface ConfigStore {
+public interface ConfigStore extends TikaExtension {
+
+ /**
+ * Initializes the configuration store.
+ * This method should be called once before using the store.
+ * Implementations may use this to establish connections, initialize caches, etc.
+ *
+ * @throws Exception if initialization fails
+ */
+ default void init() throws Exception {
+ // Default implementation does nothing
+ }
/**
* Stores a configuration.
diff --git
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/ConfigStoreFactory.java
b/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/ConfigStoreFactory.java
new file mode 100644
index 000000000..1393ace0b
--- /dev/null
+++
b/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/ConfigStoreFactory.java
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.tika.pipes.core.config;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.pf4j.PluginManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.tika.exception.TikaConfigException;
+import org.apache.tika.plugins.ExtensionConfig;
+import org.apache.tika.plugins.TikaExtensionFactory;
+
+/**
+ * Factory interface for creating ConfigStore instances.
+ * Implementations should be annotated with @Extension to be discovered by PF4J.
+ */
+public interface ConfigStoreFactory extends TikaExtensionFactory<ConfigStore> {
+
+ Logger LOG = LoggerFactory.getLogger(ConfigStoreFactory.class);
+
+ /**
+ * Creates a ConfigStore instance based on configuration.
+ *
+ * @param pluginManager the plugin manager
+ * @param configStoreType the type of ConfigStore to create
+ * @param extensionConfig optional configuration for the store
+ * @return a ConfigStore instance
+ * @throws TikaConfigException if the store cannot be created
+ */
+ static ConfigStore createConfigStore(PluginManager pluginManager, String configStoreType,
+ ExtensionConfig extensionConfig)
+ throws TikaConfigException {
+ if (configStoreType == null || configStoreType.isEmpty() || "memory".equalsIgnoreCase(configStoreType)) {
+ LOG.info("Creating InMemoryConfigStore");
+ InMemoryConfigStore store = new InMemoryConfigStore();
+ if (extensionConfig != null) {
+ store.setExtensionConfig(extensionConfig);
+ }
+ return store;
+ }
+
+ Map<String, ConfigStoreFactory> factoryMap = loadAllConfigStoreFactoryExtensions(pluginManager);
+
+ ConfigStoreFactory factory = factoryMap.get(configStoreType);
+ if (factory != null) {
+ return configStoreByConfigByFactoryName(configStoreType, extensionConfig, factory);
+ }
+ return configStoreByFullyQualifiedClassName(configStoreType, extensionConfig, factoryMap);
+ }
+
+ private static ConfigStore configStoreByConfigByFactoryName(String configStoreType, ExtensionConfig extensionConfig, ConfigStoreFactory factory) throws TikaConfigException {
+ LOG.info("Creating ConfigStore using factory: {}", factory.getName());
+ try {
+ ExtensionConfig config = extensionConfig != null ? extensionConfig :
+ new ExtensionConfig(configStoreType, configStoreType, "{}");
+ return factory.buildExtension(config);
+ } catch (IOException e) {
+ throw new TikaConfigException("Failed to create ConfigStore: " + configStoreType, e);
+ }
+ }
+
+ private static ConfigStore configStoreByFullyQualifiedClassName(String configStoreType,
+ ExtensionConfig extensionConfig,
+ Map<String, ConfigStoreFactory> factoryMap) throws TikaConfigException {
+ try {
+ LOG.info("Creating ConfigStore from class: {}", configStoreType);
+ Class<?> storeClass = Class.forName(configStoreType);
+ if (!ConfigStore.class.isAssignableFrom(storeClass)) {
+ throw new TikaConfigException(
+ "Class " + configStoreType + " does not implement ConfigStore interface");
+ }
+
+ // Try constructor with ExtensionConfig parameter first
+ ConfigStore store;
+ if (extensionConfig != null) {
+ try {
+ store = (ConfigStore) storeClass
+ .getDeclaredConstructor(ExtensionConfig.class)
+ .newInstance(extensionConfig);
+ return store;
+ } catch (NoSuchMethodException e) {
+ // Fall through to no-arg constructor
+ }
+ }
+
+ // Use no-arg constructor
+ store = (ConfigStore) storeClass.getDeclaredConstructor().newInstance();
+
+ // Set extension config if the store implements the method
+ if (extensionConfig != null && store instanceof InMemoryConfigStore) {
+ ((InMemoryConfigStore) store).setExtensionConfig(extensionConfig);
+ }
+
+ return store;
+ } catch (ClassNotFoundException e) {
+ throw new TikaConfigException(
+ "Unknown ConfigStore type: " + configStoreType +
+ ". Available types: memory, " + String.join(", ", factoryMap.keySet()), e);
+ } catch (Exception e) {
+ throw new TikaConfigException("Failed to instantiate ConfigStore: " + configStoreType, e);
+ }
+ }
+
+ private static Map<String, ConfigStoreFactory> loadAllConfigStoreFactoryExtensions(PluginManager pluginManager) {
+ List<ConfigStoreFactory> factories = pluginManager.getExtensions(ConfigStoreFactory.class);
+ Map<String, ConfigStoreFactory> factoryMap = new HashMap<>();
+ for (ConfigStoreFactory factory : factories) {
+ factoryMap.put(factory.getName(), factory);
+ }
+ return factoryMap;
+ }
+}
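The lookup order in `createConfigStore` above can be illustrated with a small standalone sketch. It is not part of the commit and uses no Tika or PF4J types: `Store`, `NamedStore`, `ReflectiveStore` and `resolve` are hypothetical stand-ins, and a plain `Map` of suppliers replaces the PF4J-discovered factories. The point it tries to show is that `memory` is matched case-insensitively, while factory names are matched exactly before the value is treated as a class name.

```java
import java.util.Map;
import java.util.function.Supplier;

public class ConfigStoreResolutionSketch {

    /** Hypothetical stand-in for ConfigStore; kind() only labels which path built it. */
    public interface Store {
        String kind();
    }

    /** Store returned by the built-in "memory" path or by a registered factory. */
    public record NamedStore(String kind) implements Store { }

    /** Store that is only reachable through its fully qualified class name. */
    public static final class ReflectiveStore implements Store {
        public ReflectiveStore() { }

        @Override
        public String kind() {
            return "class";
        }
    }

    public static Store resolve(String type, Map<String, Supplier<Store>> factories) {
        // 1. null, empty, or "memory" (any case) selects the in-memory store.
        if (type == null || type.isEmpty() || "memory".equalsIgnoreCase(type)) {
            return new NamedStore("memory");
        }
        // 2. Exact, case-sensitive match against the registered factory names.
        Supplier<Store> factory = factories.get(type);
        if (factory != null) {
            return factory.get();
        }
        // 3. Otherwise treat the value as a class name and use its no-arg constructor.
        try {
            Class<?> clazz = Class.forName(type);
            if (!Store.class.isAssignableFrom(clazz)) {
                throw new IllegalArgumentException(type + " does not implement Store");
            }
            return (Store) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Unknown store type: " + type, e);
        }
    }

    public static void main(String[] args) {
        Map<String, Supplier<Store>> factories =
                Map.of("ignite", () -> new NamedStore("factory:ignite"));
        System.out.println(resolve(null, factories).kind());
        System.out.println(resolve("Memory", factories).kind());
        System.out.println(resolve("ignite", factories).kind());
        System.out.println(resolve(ReflectiveStore.class.getName(), factories).kind());
    }
}
```

Under these assumptions a value such as `IGNITE` skips the factory map and falls through to the class-name path, where it fails; the real factory reports that case as an unknown ConfigStore type.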
diff --git
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/InMemoryConfigStore.java
b/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/InMemoryConfigStore.java
index e746eb4c7..7b6dab100 100644
---
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/InMemoryConfigStore.java
+++
b/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/InMemoryConfigStore.java
@@ -28,6 +28,16 @@ import org.apache.tika.plugins.ExtensionConfig;
public class InMemoryConfigStore implements ConfigStore {
private final ConcurrentHashMap<String, ExtensionConfig> store = new ConcurrentHashMap<>();
+ private ExtensionConfig extensionConfig;
+
+ @Override
+ public ExtensionConfig getExtensionConfig() {
+ return extensionConfig;
+ }
+
+ public void setExtensionConfig(ExtensionConfig extensionConfig) {
+ this.extensionConfig = extensionConfig;
+ }
@Override
public void put(String id, ExtensionConfig config) {
diff --git
a/tika-pipes/tika-pipes-core/src/test/java/org/apache/tika/pipes/core/config/LoggingConfigStore.java
b/tika-pipes/tika-pipes-core/src/test/java/org/apache/tika/pipes/core/config/LoggingConfigStore.java
index fdf6a4190..8ef299b35 100644
---
a/tika-pipes/tika-pipes-core/src/test/java/org/apache/tika/pipes/core/config/LoggingConfigStore.java
+++
b/tika-pipes/tika-pipes-core/src/test/java/org/apache/tika/pipes/core/config/LoggingConfigStore.java
@@ -35,6 +35,12 @@ public class LoggingConfigStore implements ConfigStore {
private static final Logger LOG = LoggerFactory.getLogger(LoggingConfigStore.class);
private final Map<String, ExtensionConfig> store = new HashMap<>();
+ private ExtensionConfig extensionConfig;
+
+ @Override
+ public ExtensionConfig getExtensionConfig() {
+ return extensionConfig;
+ }
@Override
public void put(String id, ExtensionConfig config) {
diff --git a/tika-pipes/tika-pipes-plugins/pom.xml
b/tika-pipes/tika-pipes-plugins/pom.xml
index abc9314f6..d33378351 100644
--- a/tika-pipes/tika-pipes-plugins/pom.xml
+++ b/tika-pipes/tika-pipes-plugins/pom.xml
@@ -37,6 +37,7 @@
<module>tika-pipes-file-system</module>
<module>tika-pipes-gcs</module>
<module>tika-pipes-http</module>
+ <module>tika-pipes-ignite</module>
<module>tika-pipes-jdbc</module>
<module>tika-pipes-json</module>
<module>tika-pipes-kafka</module>
diff --git a/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/README.md
b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/README.md
new file mode 100644
index 000000000..f2761a2c8
--- /dev/null
+++ b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/README.md
@@ -0,0 +1,68 @@
+# Apache Tika Pipes - Ignite ConfigStore Plugin
+
+This plugin provides distributed configuration storage using Apache Ignite for Tika Pipes.
+
+## Features
+
+- Distributed configuration storage across multiple Tika servers
+- Support for REPLICATED and PARTITIONED cache modes
+- Automatic cluster discovery and coordination
+- Thread-safe configuration updates
+
+## Configuration
+
+### Using Factory Name
+
+```json
+{
+ "pipes": {
+ "configStoreType": "ignite",
+ "configStoreParams": "{\"cacheName\":\"my-tika-cache\",\"cacheMode\":\"REPLICATED\",\"igniteInstanceName\":\"MyTikaCluster\"}"
+ }
+}
+```
+
+### Configuration Parameters
+
+- **cacheName** (optional): Name of the Ignite cache. Default: `tika-config-cache`
+- **cacheMode** (optional): Either `REPLICATED` or `PARTITIONED`. Default: `REPLICATED`
+- **igniteInstanceName** (optional): Name for the Ignite instance. Default: `TikaIgnite`
+- **autoClose** (optional): Whether to automatically close Ignite on shutdown. Default: `true`
+
+### Cache Modes
+
+- **REPLICATED**: All nodes have a full copy of the data. Best for small datasets that need fast reads.
+- **PARTITIONED**: Data is distributed across nodes. Better for large datasets and scalability.
+
+## Important Security Note
+
+⚠️ **H2 Database Dependency**
+
+Apache Ignite 2.16.0 requires H2 database version 1.4.197, which contains known security vulnerabilities (CVEs).
+This is a hard requirement due to Ignite's internal SQL engine dependencies - newer H2 2.x versions are incompatible.
+
+**Risk Mitigation:**
+- The H2 database is used internally by Ignite and is NOT exposed externally
+- The ConfigStore only stores Tika pipeline configuration, not user data
+- H2 is embedded and does not accept external connections
+- No user SQL queries are executed against H2
+
+**Alternatives for Security-Sensitive Environments:**
+- Use the in-memory ConfigStore for single-node deployments
+- Consider migrating to Apache Ignite with the Calcite SQL engine (available since Ignite 2.13)
+- Wait for Apache Ignite 3.x which removes the H2 dependency entirely
+
+If you have strict security requirements, we recommend evaluating these alternatives or implementing
+additional network isolation to ensure the Ignite cluster is not accessible from untrusted networks.
+
+## Usage Example
+
+```java
+// Configuration is automatically loaded from tika-config.json
+// The Ignite cluster will form automatically across all nodes
+// with the same igniteInstanceName
+```
+
+## Development
+
+See the main tika-grpc README for information about running in development mode with plugin hot-reloading.
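The README above passes `configStoreParams` as a JSON object encoded inside a JSON string, which is easy to get wrong by hand. The following standalone sketch (not part of the commit) shows one way to produce that escaped value. `quote` and `toJsonObject` are hypothetical helpers written for this illustration; they escape only backslashes and double quotes, so a real JSON library would be the safer choice for arbitrary values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class ConfigStoreParamsSketch {

    /** Wraps a value in double quotes, escaping only backslashes and double quotes. */
    public static String quote(String raw) {
        return "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders a flat string-to-string map as a compact JSON object. */
    public static String toJsonObject(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the keys in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("cacheName", "my-tika-cache");
        params.put("cacheMode", "REPLICATED");
        params.put("igniteInstanceName", "MyTikaCluster");

        // The inner object is quoted a second time because the value is a string.
        String inner = toJsonObject(params);
        System.out.println("\"configStoreParams\": " + quote(inner));
    }
}
```

The second call to `quote` is what produces the backslash-escaped form shown in the README example.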
diff --git a/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/pom.xml
b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/pom.xml
new file mode 100644
index 000000000..da790ceb0
--- /dev/null
+++ b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/pom.xml
@@ -0,0 +1,184 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
+ <parent>
+ <artifactId>tika-pipes-plugins</artifactId>
+ <groupId>org.apache.tika</groupId>
+ <version>4.0.0-SNAPSHOT</version>
+ </parent>
+ <modelVersion>4.0.0</modelVersion>
+
+ <artifactId>tika-pipes-ignite</artifactId>
+ <name>Apache Tika Pipes Apache Ignite</name>
+ <packaging>jar</packaging>
+
+ <properties>
+ <ignite.version>2.16.0</ignite.version>
+ <!-- Ignite 2.16.0 requires H2 1.4.x - not compatible with 2.x -->
+ <h2.version>1.4.197</h2.version>
+ <plugin.excluded.artifactIds>tika-core,tika-pipes-api,tika-pipes-core,tika-serialization,tika-plugins-core</plugin.excluded.artifactIds>
+ <plugin.excluded.groupIds>org.apache.logging.log4j,org.slf4j</plugin.excluded.groupIds>
+ </properties>
+
+ <dependencies>
+ <dependency>
+ <groupId>${project.groupId}</groupId>
+ <artifactId>tika-pipes-core</artifactId>
+ <version>${project.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>${project.groupId}</groupId>
+ <artifactId>tika-pipes-api</artifactId>
+ <version>${project.version}</version>
+ <scope>provided</scope>
+ </dependency>
+ <dependency>
+ <groupId>${project.groupId}</groupId>
+ <artifactId>tika-core</artifactId>
+ <version>${project.version}</version>
+ <scope>provided</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.logging.log4j</groupId>
+ <artifactId>log4j-slf4j2-impl</artifactId>
+ <scope>provided</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.ignite</groupId>
+ <artifactId>ignite-core</artifactId>
+ <version>${ignite.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.ignite</groupId>
+ <artifactId>ignite-indexing</artifactId>
+ <version>${ignite.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.ignite</groupId>
+ <artifactId>ignite-spring</artifactId>
+ <version>${ignite.version}</version>
+ <exclusions>
+ <exclusion>
+ <groupId>org.springframework</groupId>
+ <artifactId>spring-core</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.springframework</groupId>
+ <artifactId>spring-beans</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.springframework</groupId>
+ <artifactId>spring-context</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+ <dependency>
+ <groupId>com.h2database</groupId>
+ <artifactId>h2</artifactId>
+ <version>${h2.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>${project.groupId}</groupId>
+ <artifactId>tika-core</artifactId>
+ <version>${project.version}</version>
+ <type>test-jar</type>
+ <scope>test</scope>
+ </dependency>
+ </dependencies>
+
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-jar-plugin</artifactId>
+ <configuration>
+ <archive>
+ <manifestEntries>
+ <Automatic-Module-Name>org.apache.tika.pipes.ignite</Automatic-Module-Name>
+ </manifestEntries>
+ </archive>
+ </configuration>
+ </plugin>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-dependency-plugin</artifactId>
+ <executions>
+ <execution>
+ <id>copy-dependencies</id>
+ <phase>package</phase>
+ <goals>
+ <goal>copy-dependencies</goal>
+ </goals>
+ <configuration>
+ <outputDirectory>${project.build.directory}/lib</outputDirectory>
+ <includeScope>runtime</includeScope>
+ <excludeArtifactIds>${plugin.excluded.artifactIds}</excludeArtifactIds>
+ <excludeGroupIds>${plugin.excluded.groupIds}</excludeGroupIds>
+ </configuration>
+ </execution>
+ </executions>
+ </plugin>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-assembly-plugin</artifactId>
+ <configuration>
+ <descriptors>
+ <descriptor>src/main/assembly/assembly.xml</descriptor>
+ </descriptors>
+ <finalName>${project.artifactId}-${project.version}</finalName>
+ <appendAssemblyId>false</appendAssemblyId>
+ </configuration>
+ <executions>
+ <execution>
+ <id>make-assembly</id>
+ <phase>package</phase>
+ <goals>
+ <goal>single</goal>
+ </goals>
+ </execution>
+ </executions>
+ </plugin>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-compiler-plugin</artifactId>
+ <configuration>
+ <annotationProcessors>
+ <annotationProcessor>org.pf4j.processor.ExtensionAnnotationProcessor</annotationProcessor>
+ </annotationProcessors>
+ </configuration>
+ </plugin>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-surefire-plugin</artifactId>
+ <configuration>
+ <argLine>
+ --add-opens java.base/java.nio=ALL-UNNAMED
+ --add-opens java.base/java.util=ALL-UNNAMED
+ --add-opens java.base/java.lang=ALL-UNNAMED
+ --add-opens java.base/java.io=ALL-UNNAMED
+ --add-opens java.base/sun.nio.ch=ALL-UNNAMED
+ --add-opens java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED
+ --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED
+ </argLine>
+ </configuration>
+ </plugin>
+ </plugins>
+ </build>
+</project>
diff --git
a/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/assembly/assembly.xml
b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/assembly/assembly.xml
new file mode 100644
index 000000000..ea0f8b4a1
--- /dev/null
+++
b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/assembly/assembly.xml
@@ -0,0 +1,55 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<assembly xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xmlns="http://maven.apache.org/ASSEMBLY/2.0.0"
+ xsi:schemaLocation="http://maven.apache.org/ASSEMBLY/2.0.0
+ http://maven.apache.org/xsd/assembly-2.0.0.xsd">
+ <id>dependencies-zip</id>
+ <formats>
+ <format>zip</format>
+ </formats>
+ <includeBaseDirectory>false</includeBaseDirectory>
+ <fileSets>
+ <fileSet>
+ <directory>${project.build.directory}/lib</directory>
+ <outputDirectory>/lib</outputDirectory>
+ </fileSet>
+ <fileSet>
+ <directory>${project.build.directory}</directory>
+ <outputDirectory>/lib</outputDirectory>
+ <includes>
+ <include>${project.artifactId}-${project.version}.jar</include>
+ </includes>
+ </fileSet>
+ <fileSet>
+ <directory>${project.build.directory}</directory>
+ <outputDirectory>/</outputDirectory>
+ <includes>
+ <include>classes/META-INF/extensions.idx</include>
+ <include>classes/META-INF/MANIFEST.MF</include>
+ </includes>
+ </fileSet>
+ <fileSet>
+ <directory>${project.basedir}/src/main/resources</directory>
+ <outputDirectory>/</outputDirectory>
+ <includes>
+ <include>plugin.properties</include>
+ </includes>
+ </fileSet>
+ </fileSets>
+</assembly>
diff --git
a/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/ExtensionConfigDTO.java
b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/ExtensionConfigDTO.java
new file mode 100644
index 000000000..26e8cdb05
--- /dev/null
+++
b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/ExtensionConfigDTO.java
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.tika.pipes.ignite;
+
+import java.io.Serializable;
+
+import org.apache.tika.plugins.ExtensionConfig;
+
+/**
+ * Serializable wrapper for ExtensionConfig to work with Ignite's binary serialization.
+ * Since ExtensionConfig is a Java record with final fields, it cannot be directly
+ * serialized by Ignite. This DTO provides mutable fields that Ignite can work with.
+ */
+public class ExtensionConfigDTO implements Serializable {
+ private static final long serialVersionUID = 1L;
+
+ private String id;
+ private String name;
+ private String json;
+
+ public ExtensionConfigDTO() {
+ }
+
+ public ExtensionConfigDTO(String id, String name, String json) {
+ this.id = id;
+ this.name = name;
+ this.json = json;
+ }
+
+ public ExtensionConfigDTO(ExtensionConfig config) {
+ this.id = config.id();
+ this.name = config.name();
+ this.json = config.json();
+ }
+
+ public ExtensionConfig toExtensionConfig() {
+ return new ExtensionConfig(id, name, json);
+ }
+
+ public String getId() {
+ return id;
+ }
+
+ public void setId(String id) {
+ this.id = id;
+ }
+
+ public String getName() {
+ return name;
+ }
+
+ public void setName(String name) {
+ this.name = name;
+ }
+
+ public String getJson() {
+ return json;
+ }
+
+ public void setJson(String json) {
+ this.json = json;
+ }
+}
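The record-to-DTO conversion above can be exercised without a running cluster. The sketch below is not part of the commit: `Config` is a hypothetical local copy of the `ExtensionConfig` record shape, `ConfigDto` mirrors `ExtensionConfigDTO`, and plain `java.io` serialization stands in for Ignite's own marshaller, which may behave differently. It only checks that the three fields survive a write and read cycle.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class DtoRoundTripSketch {

    /** Hypothetical stand-in for org.apache.tika.plugins.ExtensionConfig. */
    public record Config(String id, String name, String json) { }

    /** Mutable, Serializable mirror of the record, same shape as ExtensionConfigDTO. */
    public static class ConfigDto implements Serializable {
        private static final long serialVersionUID = 1L;

        private String id;
        private String name;
        private String json;

        public ConfigDto() { }

        public ConfigDto(Config config) {
            this.id = config.id();
            this.name = config.name();
            this.json = config.json();
        }

        public Config toConfig() {
            return new Config(id, name, json);
        }
    }

    /** Converts to the DTO, serializes it to bytes, reads it back, and rebuilds the record. */
    public static Config roundTrip(Config original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new ConfigDto(original));
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return ((ConfigDto) in.readObject()).toConfig();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Config original = new Config("fetcher-1", "file-system-fetcher", "{\"basePath\":\"/data\"}");
        Config copy = roundTrip(original);
        // Records compare by component values, so equals() is enough here.
        System.out.println(original.equals(copy));
    }
}
```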
diff --git
a/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/IgniteConfigStore.java
b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/IgniteConfigStore.java
new file mode 100644
index 000000000..e2f8a6c74
--- /dev/null
+++
b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/IgniteConfigStore.java
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.tika.pipes.ignite;
+
+import java.util.Set;
+
+import org.apache.ignite.Ignite;
+import org.apache.ignite.IgniteCache;
+import org.apache.ignite.Ignition;
+import org.apache.ignite.cache.CacheMode;
+import org.apache.ignite.configuration.CacheConfiguration;
+import org.apache.ignite.configuration.IgniteConfiguration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.tika.exception.TikaConfigException;
+import org.apache.tika.pipes.core.config.ConfigStore;
+import org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig;
+import org.apache.tika.plugins.ExtensionConfig;
+
+/**
+ * Apache Ignite-based implementation of {@link ConfigStore}.
+ * Provides distributed configuration storage for Tika Pipes clustering.
+ * <p>
+ * This implementation is thread-safe and suitable for multi-instance deployments
+ * where configurations need to be shared across multiple servers.
+ * <p>
+ * Configuration options:
+ * <ul>
+ * <li>cacheName - Name of the Ignite cache (default: "tika-config-store")</li>
+ * <li>cacheMode - Cache replication mode: PARTITIONED or REPLICATED (default: REPLICATED)</li>
+ * <li>igniteInstanceName - Name of the Ignite instance (default: "TikaIgniteConfigStore")</li>
+ * </ul>
+ */
+public class IgniteConfigStore implements ConfigStore {
+
+ private static final Logger LOG = LoggerFactory.getLogger(IgniteConfigStore.class);
+ private static final String DEFAULT_CACHE_NAME = "tika-config-store";
+ private static final String DEFAULT_INSTANCE_NAME = "TikaIgniteConfigStore";
+
+ private Ignite ignite;
+ private IgniteCache<String, ExtensionConfigDTO> cache;
+ private String cacheName = DEFAULT_CACHE_NAME;
+ private CacheMode cacheMode = CacheMode.REPLICATED;
+ private String igniteInstanceName = DEFAULT_INSTANCE_NAME;
+ private boolean autoClose = true;
+ private ExtensionConfig extensionConfig;
+ private boolean closed = false;
+
+ public IgniteConfigStore() {
+ }
+
+ public IgniteConfigStore(ExtensionConfig extensionConfig) throws TikaConfigException {
+ this.extensionConfig = extensionConfig;
+
+ IgniteConfigStoreConfig config = IgniteConfigStoreConfig.load(extensionConfig.json());
+ this.cacheName = config.getCacheName();
+ this.cacheMode = config.getCacheModeEnum();
+ this.igniteInstanceName = config.getIgniteInstanceName();
+ this.autoClose = config.isAutoClose();
+ }
+
+ public IgniteConfigStore(String cacheName) {
+ this.cacheName = cacheName;
+ }
+
+ @Override
+ public ExtensionConfig getExtensionConfig() {
+ return extensionConfig;
+ }
+
+ @Override
+ public void init() throws Exception {
+ if (closed) {
+ throw new IllegalStateException("Cannot reinitialize IgniteConfigStore after it has been closed");
+ }
+ if (ignite != null) {
+ LOG.warn("IgniteConfigStore already initialized");
+ return;
+ }
+
+ LOG.info("Initializing IgniteConfigStore with cache: {}, mode: {}, instance: {}",
+ cacheName, cacheMode, igniteInstanceName);
+
+ IgniteConfiguration cfg = new IgniteConfiguration();
+ cfg.setIgniteInstanceName(igniteInstanceName);
+ cfg.setClientMode(false);
+
+ ignite = Ignition.start(cfg);
+
+ CacheConfiguration<String, ExtensionConfigDTO> cacheCfg = new CacheConfiguration<>(cacheName);
+ cacheCfg.setCacheMode(cacheMode);
+ cacheCfg.setBackups(cacheMode == CacheMode.PARTITIONED ? 1 : 0);
+
+ cache = ignite.getOrCreateCache(cacheCfg);
+ LOG.info("IgniteConfigStore initialized successfully");
+ }
+
+ @Override
+ public void put(String id, ExtensionConfig config) {
+ if (cache == null) {
+ throw new IllegalStateException("IgniteConfigStore not initialized. Call init() first.");
+ }
+ cache.put(id, new ExtensionConfigDTO(config));
+ }
+
+ @Override
+ public ExtensionConfig get(String id) {
+ if (cache == null) {
+ throw new IllegalStateException("IgniteConfigStore not initialized. Call init() first.");
+ }
+ ExtensionConfigDTO dto = cache.get(id);
+ return dto != null ? dto.toExtensionConfig() : null;
+ }
+
+ @Override
+ public boolean containsKey(String id) {
+ if (cache == null) {
+ throw new IllegalStateException("IgniteConfigStore not initialized. Call init() first.");
+ }
+ return cache.containsKey(id);
+ }
+
+ @Override
+ public Set<String> keySet() {
+ if (cache == null) {
+ throw new IllegalStateException("IgniteConfigStore not initialized. Call init() first.");
+ }
+ return Set.copyOf(cache.query(new org.apache.ignite.cache.query.ScanQuery<String, ExtensionConfigDTO>())
+ .getAll()
+ .stream()
+ .map(javax.cache.Cache.Entry::getKey)
+ .toList());
+ }
+
+ @Override
+ public int size() {
+ if (cache == null) {
+ throw new IllegalStateException("IgniteConfigStore not initialized. Call init() first.");
+ }
+ return cache.size();
+ }
+
+ public void close() {
+ if (ignite != null && autoClose) {
+ LOG.info("Closing IgniteConfigStore");
+ ignite.close();
+ ignite = null;
+ cache = null;
+ closed = true;
+ }
+ }
+
+ public void setCacheName(String cacheName) {
+ this.cacheName = cacheName;
+ }
+
+ public void setCacheMode(CacheMode cacheMode) {
+ this.cacheMode = cacheMode;
+ }
+
+ public void setIgniteInstanceName(String igniteInstanceName) {
+ this.igniteInstanceName = igniteInstanceName;
+ }
+
+ public void setAutoClose(boolean autoClose) {
+ this.autoClose = autoClose;
+ }
+}
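The `init()`/`close()` guards in `IgniteConfigStore` above form a small state machine. The sketch below (not part of the commit) reproduces only that guard logic with a plain `Object` standing in for the Ignite node, so it can run without Ignite on the classpath. `StoreLifecycleSketch` and its members are hypothetical names chosen for this illustration.

```java
public class StoreLifecycleSketch {

    private final boolean autoClose;
    private Object resource;   // stands in for the Ignite node handle
    private boolean closed;
    private int starts;        // counts how often the resource was actually started

    public StoreLifecycleSketch(boolean autoClose) {
        this.autoClose = autoClose;
    }

    public void init() {
        if (closed) {
            throw new IllegalStateException("Cannot reinitialize after close");
        }
        if (resource != null) {
            return; // a second init() is a no-op
        }
        resource = new Object();
        starts++;
    }

    public void close() {
        // With autoClose disabled, close() leaves the store running and reusable.
        if (resource != null && autoClose) {
            resource = null;
            closed = true;
        }
    }

    public boolean isUsable() {
        return resource != null;
    }

    public int starts() {
        return starts;
    }

    public static void main(String[] args) {
        StoreLifecycleSketch store = new StoreLifecycleSketch(true);
        store.init();
        store.init();
        System.out.println(store.starts());
        store.close();
        System.out.println(store.isUsable());
    }
}
```

One consequence of this guard structure is that a store created with `autoClose` set to `false` stays usable after `close()`, so whoever owns the underlying node is expected to shut it down.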
diff --git
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/InMemoryConfigStore.java
b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/IgniteConfigStoreFactory.java
similarity index 50%
copy from
tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/InMemoryConfigStore.java
copy to
tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/IgniteConfigStoreFactory.java
index e746eb4c7..c527b5c67 100644
---
a/tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/InMemoryConfigStore.java
+++
b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/IgniteConfigStoreFactory.java
@@ -14,43 +14,39 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-package org.apache.tika.pipes.core.config;
+package org.apache.tika.pipes.ignite;
-import java.util.Set;
-import java.util.concurrent.ConcurrentHashMap;
+import java.io.IOException;
+import org.pf4j.Extension;
+
+import org.apache.tika.exception.TikaConfigException;
+import org.apache.tika.pipes.core.config.ConfigStore;
+import org.apache.tika.pipes.core.config.ConfigStoreFactory;
import org.apache.tika.plugins.ExtensionConfig;
/**
- * Default in-memory implementation of {@link ConfigStore} using a {@link ConcurrentHashMap}.
- * Thread-safe and suitable for single-instance deployments.
+ * Factory for creating Ignite-based ConfigStore instances.
*/
-public class InMemoryConfigStore implements ConfigStore {
-
- private final ConcurrentHashMap<String, ExtensionConfig> store = new ConcurrentHashMap<>();
-
- @Override
- public void put(String id, ExtensionConfig config) {
- store.put(id, config);
- }
+@Extension
+public class IgniteConfigStoreFactory implements ConfigStoreFactory {
- @Override
- public ExtensionConfig get(String id) {
- return store.get(id);
- }
-
- @Override
- public boolean containsKey(String id) {
- return store.containsKey(id);
- }
+ private static final String NAME = "ignite";
@Override
- public Set<String> keySet() {
- return Set.copyOf(store.keySet());
+ public String getName() {
+ return NAME;
}
@Override
- public int size() {
- return store.size();
+ public ConfigStore buildExtension(ExtensionConfig extensionConfig)
+ throws IOException, TikaConfigException {
+ try {
+ return new IgniteConfigStore(extensionConfig);
+ } catch (TikaConfigException e) {
+ throw e;
+ } catch (Exception e) {
+ throw new TikaConfigException("Failed to create IgniteConfigStore", e);
+ }
}
}
diff --git a/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/config/IgniteConfigStoreConfig.java b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/config/IgniteConfigStoreConfig.java
new file mode 100644
index 000000000..6ec7699ec
--- /dev/null
+++ b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/config/IgniteConfigStoreConfig.java
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.tika.pipes.ignite.config;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.ignite.cache.CacheMode;
+
+import org.apache.tika.exception.TikaConfigException;
+
+/**
+ * Configuration for IgniteConfigStore.
+ *
+ * Example JSON configuration:
+ * <pre>
+ * {
+ * "cacheName": "my-tika-cache",
+ * "cacheMode": "REPLICATED",
+ * "igniteInstanceName": "MyTikaCluster",
+ * "autoClose": true
+ * }
+ * </pre>
+ */
+public class IgniteConfigStoreConfig {
+
+ private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
+
+ public static IgniteConfigStoreConfig load(final String json) throws TikaConfigException {
+ try {
+ if (json == null || json.trim().isEmpty() || "{}".equals(json.trim())) {
+ return new IgniteConfigStoreConfig();
+ }
+ return OBJECT_MAPPER.readValue(json, IgniteConfigStoreConfig.class);
+ } catch (JsonProcessingException e) {
+ throw new TikaConfigException("Failed to parse IgniteConfigStoreConfig from JSON", e);
+ }
+ }
+
+ private String cacheName = "tika-config-store";
+ private String cacheMode = "REPLICATED";
+ private String igniteInstanceName = "TikaIgniteConfigStore";
+ private boolean autoClose = true;
+
+ public String getCacheName() {
+ return cacheName;
+ }
+
+ public IgniteConfigStoreConfig setCacheName(String cacheName) {
+ this.cacheName = cacheName;
+ return this;
+ }
+
+ public String getCacheMode() {
+ return cacheMode;
+ }
+
+ public IgniteConfigStoreConfig setCacheMode(String cacheMode) {
+ this.cacheMode = cacheMode;
+ return this;
+ }
+
+ public CacheMode getCacheModeEnum() {
+ if (cacheMode == null || cacheMode.trim().isEmpty()) {
+ return CacheMode.REPLICATED;
+ }
+
+ if ("PARTITIONED".equalsIgnoreCase(cacheMode)) {
+ return CacheMode.PARTITIONED;
+ }
+
+ if ("REPLICATED".equalsIgnoreCase(cacheMode)) {
+ return CacheMode.REPLICATED;
+ }
+
+ throw new IllegalArgumentException(
+ "Unsupported cacheMode: '" + cacheMode
+ + "'. Supported values are PARTITIONED and REPLICATED.");
+ }
+
+ public String getIgniteInstanceName() {
+ return igniteInstanceName;
+ }
+
+ public IgniteConfigStoreConfig setIgniteInstanceName(String igniteInstanceName) {
+ this.igniteInstanceName = igniteInstanceName;
+ return this;
+ }
+
+ public boolean isAutoClose() {
+ return autoClose;
+ }
+
+ public IgniteConfigStoreConfig setAutoClose(boolean autoClose) {
+ this.autoClose = autoClose;
+ return this;
+ }
+}
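
The cacheMode handling in getCacheModeEnum() above can be exercised without an Ignite dependency. The following is a minimal sketch, not code from this commit: the class name CacheModeSketch and the local Mode enum (a stand-in for org.apache.ignite.cache.CacheMode) are assumptions made so the example compiles on its own. It mirrors the logic shown in the diff: a null or blank value falls back to REPLICATED, matching is case-insensitive, and anything else is rejected.

```java
public class CacheModeSketch {

    // Hypothetical stand-in for org.apache.ignite.cache.CacheMode,
    // used only so this sketch has no external dependencies.
    public enum Mode { REPLICATED, PARTITIONED }

    // Mirrors the branches of getCacheModeEnum() in the diff above.
    public static Mode resolve(String cacheMode) {
        // Only the blank check trims the input.
        if (cacheMode == null || cacheMode.trim().isEmpty()) {
            return Mode.REPLICATED;
        }
        if ("PARTITIONED".equalsIgnoreCase(cacheMode)) {
            return Mode.PARTITIONED;
        }
        if ("REPLICATED".equalsIgnoreCase(cacheMode)) {
            return Mode.REPLICATED;
        }
        throw new IllegalArgumentException(
                "Unsupported cacheMode: '" + cacheMode
                        + "'. Supported values are PARTITIONED and REPLICATED.");
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));          // prints REPLICATED
        System.out.println(resolve("partitioned")); // prints PARTITIONED
        try {
            // Surrounding whitespace is not trimmed before comparison,
            // so this value is rejected.
            resolve(" REPLICATED ");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

One consequence worth noting: because the comparison does not trim, a configured value such as " REPLICATED " is treated as unsupported rather than normalized.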
diff --git a/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/plugin/ignite/IgnitePipesPlugin.java b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/plugin/ignite/IgnitePipesPlugin.java
new file mode 100644
index 000000000..07b2f68d0
--- /dev/null
+++ b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/plugin/ignite/IgnitePipesPlugin.java
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.tika.pipes.plugin.ignite;
+
+import org.pf4j.Plugin;
+import org.pf4j.PluginWrapper;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class IgnitePipesPlugin extends Plugin {
+ private static final Logger LOG = LoggerFactory.getLogger(IgnitePipesPlugin.class);
+
+ public IgnitePipesPlugin(PluginWrapper wrapper) {
+ super(wrapper);
+ }
+
+ @Override
+ public void start() {
+ LOG.info("Starting Ignite Config Store Plugin");
+ super.start();
+ }
+
+ @Override
+ public void stop() {
+ LOG.info("Stopping Ignite Config Store Plugin");
+ super.stop();
+ }
+
+ @Override
+ public void delete() {
+ LOG.info("Deleting Ignite Config Store Plugin");
+ super.delete();
+ }
+}
diff --git a/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/resources/plugin.properties b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/resources/plugin.properties
new file mode 100644
index 000000000..da2660f1a
--- /dev/null
+++ b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/resources/plugin.properties
@@ -0,0 +1,22 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+plugin.id=tika-pipes-ignite-plugin
+plugin.class=org.apache.tika.pipes.plugin.ignite.IgnitePipesPlugin
+plugin.version=4.0.0-SNAPSHOT
+plugin.provider=Apache Tika
+plugin.description=Pipes for Apache Ignite Config Store
+
diff --git a/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/test/java/org/apache/tika/pipes/ignite/IgniteConfigStoreTest.java b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/test/java/org/apache/tika/pipes/ignite/IgniteConfigStoreTest.java
new file mode 100644
index 000000000..bc72ad0b5
--- /dev/null
+++ b/tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/test/java/org/apache/tika/pipes/ignite/IgniteConfigStoreTest.java
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.tika.pipes.ignite;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertFalse;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertNull;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+
+import org.junit.jupiter.api.AfterEach;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+
+import org.apache.tika.plugins.ExtensionConfig;
+
+public class IgniteConfigStoreTest {
+
+ private IgniteConfigStore store;
+
+ @BeforeEach
+ public void setUp() throws Exception {
+ store = new IgniteConfigStore();
+ store.setIgniteInstanceName("TestIgniteInstance-" + System.currentTimeMillis());
+ store.init();
+ }
+
+ @AfterEach
+ public void tearDown() {
+ if (store != null) {
+ store.close();
+ }
+ }
+
+ @Test
+ public void testPutAndGet() {
+ ExtensionConfig config = new ExtensionConfig("id1", "type1", "{\"key\":\"value\"}");
+
+ store.put("id1", config);
+
+ ExtensionConfig retrieved = store.get("id1");
+ assertNotNull(retrieved);
+ assertEquals("id1", retrieved.id());
+ assertEquals("type1", retrieved.name());
+ assertEquals("{\"key\":\"value\"}", retrieved.json());
+ }
+
+ @Test
+ public void testContainsKey() {
+ ExtensionConfig config = new ExtensionConfig("id1", "type1", "{}");
+
+ assertFalse(store.containsKey("id1"));
+
+ store.put("id1", config);
+
+ assertTrue(store.containsKey("id1"));
+ assertFalse(store.containsKey("nonexistent"));
+ }
+
+ @Test
+ public void testSize() {
+ assertEquals(0, store.size());
+
+ store.put("id1", new ExtensionConfig("id1", "type1", "{}"));
+ assertEquals(1, store.size());
+
+ store.put("id2", new ExtensionConfig("id2", "type2", "{}"));
+ assertEquals(2, store.size());
+
+ store.put("id1", new ExtensionConfig("id1", "type1", "{\"updated\":true}"));
+ assertEquals(2, store.size());
+ }
+
+ @Test
+ public void testKeySet() {
+ assertTrue(store.keySet().isEmpty());
+
+ store.put("id1", new ExtensionConfig("id1", "type1", "{}"));
+ store.put("id2", new ExtensionConfig("id2", "type2", "{}"));
+
+ assertEquals(2, store.keySet().size());
+ assertTrue(store.keySet().contains("id1"));
+ assertTrue(store.keySet().contains("id2"));
+ assertFalse(store.keySet().contains("id3"));
+ }
+
+ @Test
+ public void testGetNonExistent() {
+ assertNull(store.get("nonexistent"));
+ }
+
+ @Test
+ public void testUpdateExisting() {
+ ExtensionConfig config1 = new ExtensionConfig("id1", "type1", "{\"version\":1}");
+ ExtensionConfig config2 = new ExtensionConfig("id1", "type1", "{\"version\":2}");
+
+ store.put("id1", config1);
+ assertEquals("{\"version\":1}", store.get("id1").json());
+
+ store.put("id1", config2);
+ assertEquals("{\"version\":2}", store.get("id1").json());
+ assertEquals(1, store.size());
+ }
+
+ @Test
+ public void testMultipleConfigs() {
+ for (int i = 0; i < 10; i++) {
+ String id = "config" + i;
+ ExtensionConfig config = new ExtensionConfig(id, "type" + i, "{\"index\":" + i + "}");
+ store.put(id, config);
+ }
+
+ assertEquals(10, store.size());
+
+ for (int i = 0; i < 10; i++) {
+ String id = "config" + i;
+ ExtensionConfig config = store.get(id);
+ assertNotNull(config);
+ assertEquals(id, config.id());
+ assertEquals("type" + i, config.name());
+ }
+ }
+
+ @Test
+ public void testUninitializedStore() {
+ IgniteConfigStore uninitializedStore = new IgniteConfigStore();
+
+ assertThrows(IllegalStateException.class, () -> {
+ uninitializedStore.put("id1", new ExtensionConfig("id1", "type1", "{}"));
+ });
+
+ assertThrows(IllegalStateException.class, () -> {
+ uninitializedStore.get("id1");
+ });
+
+ assertThrows(IllegalStateException.class, () -> {
+ uninitializedStore.containsKey("id1");
+ });
+
+ assertThrows(IllegalStateException.class, () -> {
+ uninitializedStore.size();
+ });
+
+ assertThrows(IllegalStateException.class, () -> {
+ uninitializedStore.keySet();
+ });
+ }
+
+ @Test
+ public void testThreadSafety() throws InterruptedException {
+ int numThreads = 10;
+ int numOperationsPerThread = 100;
+
+ Thread[] threads = new Thread[numThreads];
+ for (int i = 0; i < numThreads; i++) {
+ final int threadId = i;
+ threads[i] = new Thread(() -> {
+ for (int j = 0; j < numOperationsPerThread; j++) {
+ String id = "thread" + threadId + "_config" + j;
+ ExtensionConfig config = new ExtensionConfig(id, "type", "{}");
+ store.put(id, config);
+ assertNotNull(store.get(id));
+ }
+ });
+ threads[i].start();
+ }
+
+ for (Thread thread : threads) {
+ thread.join();
+ }
+
+ assertEquals(numThreads * numOperationsPerThread, store.size());
+ }
+
+ @Test
+ public void testCustomCacheName() throws Exception {
+ IgniteConfigStore customStore = new IgniteConfigStore("custom-cache");
+ customStore.setIgniteInstanceName("CustomInstance-" + System.currentTimeMillis());
+
+ try {
+ customStore.init();
+
+ ExtensionConfig config = new ExtensionConfig("id1", "type1", "{}");
+ customStore.put("id1", config);
+
+ assertNotNull(customStore.get("id1"));
+ assertEquals("id1", customStore.get("id1").id());
+ } finally {
+ customStore.close();
+ }
+ }
+}
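
A note on testThreadSafety above: it calls assertNotNull inside plain Thread workers. An AssertionError thrown in a worker thread is not rethrown on the JUnit thread by join(), so a failed in-thread assertion would only surface indirectly, if at all, through the final size check. The following is a minimal sketch of one way to propagate worker failures, not code from this commit. The class name ConcurrentPutSketch and the method runWriters are hypothetical, and a Map<String, String> stands in for the config store so the example runs without Ignite.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentPutSketch {

    // Runs numThreads writers against the store and returns its final size.
    // A failure inside any worker is rethrown to the caller via Future.get().
    public static int runWriters(Map<String, String> store, int numThreads, int opsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < numThreads; i++) {
                final int threadId = i;
                futures.add(pool.submit(() -> {
                    for (int j = 0; j < opsPerThread; j++) {
                        String id = "thread" + threadId + "_config" + j;
                        store.put(id, "{}");
                        if (store.get(id) == null) {
                            throw new IllegalStateException("missing entry: " + id);
                        }
                    }
                }));
            }
            // get() blocks until the worker finishes and wraps any
            // worker exception in an ExecutionException.
            for (Future<?> future : futures) {
                future.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdownNow();
        }
        return store.size();
    }

    public static void main(String[] args) {
        // Stand-in store; the real test would pass an initialized IgniteConfigStore.
        int size = runWriters(new ConcurrentHashMap<>(), 10, 100);
        System.out.println(size); // prints 1000
    }
}
```

The design choice here is to let Future.get() carry worker exceptions back to the calling thread, so a test built this way fails at the point of the first worker error rather than relying on a size mismatch.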