Copilot commented on code in PR #2470: URL: https://github.com/apache/tika/pull/2470#discussion_r2648429020
########## tika-pipes/tika-pipes-plugins/tika-pipes-ignite/pom.xml: ########## @@ -0,0 +1,187 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. +--> +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd"> + <parent> + <artifactId>tika-pipes-plugins</artifactId> + <groupId>org.apache.tika</groupId> + <version>4.0.0-SNAPSHOT</version> + </parent> + <modelVersion>4.0.0</modelVersion> + + <artifactId>tika-pipes-ignite</artifactId> + <name>Apache Tika Pipes Apache Ignite</name> + <packaging>jar</packaging> + + <properties> + <ignite.version>2.16.0</ignite.version> + <h2.version>1.4.197</h2.version> + <plugin.excluded.artifactIds>tika-core,tika-pipes-api,tika-pipes-core,tika-serialization,tika-plugins-core</plugin.excluded.artifactIds> + <plugin.excluded.groupIds>org.apache.logging.log4j,org.slf4j</plugin.excluded.groupIds> + </properties> + + <dependencies> + <dependency> + <groupId>${project.groupId}</groupId> + <artifactId>tika-pipes-core</artifactId> + <version>${project.version}</version> + </dependency> + <dependency> + <groupId>${project.groupId}</groupId> + <artifactId>tika-pipes-api</artifactId> + <version>${project.version}</version> + <scope>provided</scope> + </dependency> + <dependency> + <groupId>${project.groupId}</groupId> + <artifactId>tika-core</artifactId> + <version>${project.version}</version> + <scope>provided</scope> + </dependency> + <dependency> + <groupId>org.apache.logging.log4j</groupId> + <artifactId>log4j-slf4j2-impl</artifactId> + <scope>provided</scope> + </dependency> + <dependency> + <groupId>org.apache.ignite</groupId> + <artifactId>ignite-core</artifactId> + <version>${ignite.version}</version> + </dependency> + <dependency> + <groupId>org.apache.ignite</groupId> + <artifactId>ignite-indexing</artifactId> + <version>${ignite.version}</version> + </dependency> + <dependency> + <groupId>org.apache.ignite</groupId> + <artifactId>ignite-spring</artifactId> + <version>${ignite.version}</version> + <exclusions> + <exclusion> + <groupId>org.springframework</groupId> + <artifactId>spring-core</artifactId> + </exclusion> + <exclusion> + <groupId>org.springframework</groupId> + <artifactId>spring-beans</artifactId> + </exclusion> + <exclusion> + <groupId>org.springframework</groupId> + <artifactId>spring-context</artifactId> + </exclusion> + </exclusions> + </dependency> + <dependency> + <groupId>com.h2database</groupId> + <artifactId>h2</artifactId> + <version>${h2.version}</version> + </dependency> + <dependency> + <groupId>${project.groupId}</groupId> + <artifactId>tika-core</artifactId> + <version>${project.version}</version> + <type>test-jar</type> + <scope>test</scope> + </dependency> + </dependencies> + + <build> + <plugins> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-jar-plugin</artifactId> + <configuration> + <archive> + <manifestEntries> + <Automatic-Module-Name>org.apache.tika.pipes.ignite</Automatic-Module-Name> + </manifestEntries> + </archive> + </configuration> + </plugin> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-dependency-plugin</artifactId> + <executions> + <execution> + <id>copy-dependencies</id> + <phase>package</phase> + <goals> + <goal>copy-dependencies</goal> + </goals> + <configuration> + <outputDirectory>${project.build.directory}/lib</outputDirectory> + <includeScope>runtime</includeScope> + <excludeArtifactIds>${plugin.excluded.artifactIds}</excludeArtifactIds> + <excludeGroupIds>${plugin.excluded.groupIds}</excludeGroupIds> + </configuration> + </execution> + </executions> + </plugin> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-assembly-plugin</artifactId> + <configuration> + <descriptors> + <descriptor>src/main/assembly/assembly.xml</descriptor> + </descriptors> + <finalName>${project.artifactId}-${project.version}</finalName> + <appendAssemblyId>false</appendAssemblyId> + </configuration> + <executions> + <execution> + <id>make-assembly</id> + <phase>package</phase> + <goals> + <goal>single</goal> + </goals> + </execution> + </executions> + </plugin> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-compiler-plugin</artifactId> + <configuration> + <annotationProcessors> + <annotationProcessor>org.pf4j.processor.ExtensionAnnotationProcessor</annotationProcessor> + </annotationProcessors> + </configuration> + </plugin> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-surefire-plugin</artifactId> + <configuration> + <argLine> + --add-opens java.base/java.nio=ALL-UNNAMED + --add-opens java.base/java.util=ALL-UNNAMED + --add-opens java.base/java.lang=ALL-UNNAMED + --add-opens java.base/java.io=ALL-UNNAMED + --add-opens java.base/sun.nio.ch=ALL-UNNAMED + --add-opens java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED + --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED + </argLine> + </configuration> + </plugin> + </plugins> + </build> + + <scm> + <tag>3.0.0-rc1</tag> + </scm> Review Comment: The SCM tag 3.0.0-rc1 appears to be incorrect for a module in version 4.0.0-SNAPSHOT. This tag should either match the current development version or be removed to inherit from the parent POM. Other sibling modules have the same tag, which suggests this might be a legacy value that should be cleaned up. ```suggestion ``` ########## tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/ConfigStoreFactory.java: ########## @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.tika.pipes.core.config; + +import java.io.IOException; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.pf4j.PluginManager; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.tika.exception.TikaConfigException; +import org.apache.tika.plugins.ExtensionConfig; +import org.apache.tika.plugins.TikaExtensionFactory; + +/** + * Factory interface for creating ConfigStore instances. + * Implementations should be annotated with @Extension to be discovered by PF4J. + */ +public interface ConfigStoreFactory extends TikaExtensionFactory<ConfigStore> { + + Logger LOG = LoggerFactory.getLogger(ConfigStoreFactory.class); + + /** + * Creates a ConfigStore instance based on configuration. + * + * @param pluginManager the plugin manager + * @param configStoreType the type of ConfigStore to create + * @param extensionConfig optional configuration for the store + * @return a ConfigStore instance + * @throws TikaConfigException if the store cannot be created + */ + static ConfigStore createConfigStore(PluginManager pluginManager, String configStoreType, + ExtensionConfig extensionConfig) + throws TikaConfigException { + if (configStoreType == null || configStoreType.isEmpty() || "memory".equalsIgnoreCase(configStoreType)) { + LOG.info("Creating InMemoryConfigStore"); + InMemoryConfigStore store = new InMemoryConfigStore(); + if (extensionConfig != null) { + store.setExtensionConfig(extensionConfig); + } + return store; + } + + Map<String, ConfigStoreFactory> factoryMap = loadAllConfigStoreFactoryExtensions(pluginManager); + + ConfigStoreFactory factory = factoryMap.get(configStoreType); + if (factory != null) { + return configStoreByConfigByFactoryName(configStoreType, extensionConfig, factory); + } + return configStoreByFullyQualifiedClassName(configStoreType, extensionConfig, factoryMap); + } + + private static ConfigStore configStoreByConfigByFactoryName(String configStoreType, ExtensionConfig extensionConfig, ConfigStoreFactory factory) throws TikaConfigException { + LOG.info("Creating ConfigStore using factory: {}", factory.getName()); + try { + ExtensionConfig config = extensionConfig != null ? extensionConfig : + new ExtensionConfig(configStoreType, configStoreType, "{}"); + return factory.buildExtension(config); + } catch (IOException e) { + throw new TikaConfigException("Failed to create ConfigStore: " + configStoreType, e); + } + } + + private static ConfigStore configStoreByFullyQualifiedClassName(String configStoreType, + ExtensionConfig extensionConfig, + Map<String, ConfigStoreFactory> factoryMap) throws TikaConfigException { + try { + LOG.info("Creating ConfigStore from class: {}", configStoreType); + Class<?> storeClass = Class.forName(configStoreType); + if (!ConfigStore.class.isAssignableFrom(storeClass)) { + throw new TikaConfigException( + "Class " + configStoreType + " does not implement ConfigStore interface"); + } + ConfigStore store = (ConfigStore) storeClass.getDeclaredConstructor().newInstance(); + if (extensionConfig != null) { + ((InMemoryConfigStore) store).setExtensionConfig(extensionConfig); + } Review Comment: This cast to InMemoryConfigStore will fail for any ConfigStore implementation that is not InMemoryConfigStore. Since this method is meant to handle fully-qualified class names that could instantiate any ConfigStore implementation, this is a bug. The code should check if the store is an instance of InMemoryConfigStore before casting, or use reflection to check if the store has a setExtensionConfig method and call it dynamically. ########## tika-pipes/tika-pipes-plugins/tika-pipes-ignite/pom.xml: ########## @@ -0,0 +1,187 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. +--> +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd"> + <parent> + <artifactId>tika-pipes-plugins</artifactId> + <groupId>org.apache.tika</groupId> + <version>4.0.0-SNAPSHOT</version> + </parent> + <modelVersion>4.0.0</modelVersion> + + <artifactId>tika-pipes-ignite</artifactId> + <name>Apache Tika Pipes Apache Ignite</name> + <packaging>jar</packaging> + + <properties> + <ignite.version>2.16.0</ignite.version> + <h2.version>1.4.197</h2.version> Review Comment: The H2 database version 1.4.197 specified here differs from the parent POM version 2.4.240. This inconsistency can lead to dependency conflicts and unexpected behavior. Consider using the version from the parent POM by removing this property override and relying on the inherited h2.version property. ```suggestion ``` ########## tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/IgniteConfigStore.java: ########## @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.tika.pipes.ignite; + +import java.util.Set; + +import org.apache.ignite.Ignite; +import org.apache.ignite.IgniteCache; +import org.apache.ignite.Ignition; +import org.apache.ignite.cache.CacheMode; +import org.apache.ignite.configuration.CacheConfiguration; +import org.apache.ignite.configuration.IgniteConfiguration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.tika.exception.TikaConfigException; +import org.apache.tika.pipes.core.config.ConfigStore; +import org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig; +import org.apache.tika.plugins.ExtensionConfig; + +/** + * Apache Ignite-based implementation of {@link ConfigStore}. + * Provides distributed configuration storage for Tika Pipes clustering. + * <p> + * This implementation is thread-safe and suitable for multi-instance deployments + * where configurations need to be shared across multiple servers. + * <p> + * Configuration options: + * <ul> + * <li>cacheName - Name of the Ignite cache (default: "tika-config-store")</li> + * <li>cacheMode - Cache replication mode: PARTITIONED or REPLICATED (default: REPLICATED)</li> + * <li>igniteInstanceName - Name of the Ignite instance (default: "TikaIgniteConfigStore")</li> + * </ul> + */ +public class IgniteConfigStore implements ConfigStore { + + private static final Logger LOG = LoggerFactory.getLogger(IgniteConfigStore.class); + private static final String DEFAULT_CACHE_NAME = "tika-config-store"; + private static final String DEFAULT_INSTANCE_NAME = "TikaIgniteConfigStore"; + + private Ignite ignite; + private IgniteCache<String, ExtensionConfigDTO> cache; + private String cacheName = DEFAULT_CACHE_NAME; + private CacheMode cacheMode = CacheMode.REPLICATED; + private String igniteInstanceName = DEFAULT_INSTANCE_NAME; + private boolean autoClose = true; + private ExtensionConfig extensionConfig; + + public IgniteConfigStore() { + } + + public IgniteConfigStore(ExtensionConfig extensionConfig) throws TikaConfigException { + this.extensionConfig = extensionConfig; + + IgniteConfigStoreConfig config = IgniteConfigStoreConfig.load(extensionConfig.json()); + this.cacheName = config.getCacheName(); + this.cacheMode = config.getCacheModeEnum(); + this.igniteInstanceName = config.getIgniteInstanceName(); + this.autoClose = config.isAutoClose(); + } + + public IgniteConfigStore(String cacheName) { + this.cacheName = cacheName; + } + + @Override + public ExtensionConfig getExtensionConfig() { + return extensionConfig; + } + + @Override + public void init() throws Exception { + if (ignite != null) { + LOG.warn("IgniteConfigStore already initialized"); + return; + } Review Comment: The init method allows reinitialization after close has been called. After calling close(), the ignite field is set to null, but calling init() again would succeed. Consider adding a check to prevent reinitialization after close, or document that this behavior is intentional. ########## tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/IgniteConfigStore.java: ########## @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.tika.pipes.ignite; + +import java.util.Set; + +import org.apache.ignite.Ignite; +import org.apache.ignite.IgniteCache; +import org.apache.ignite.Ignition; +import org.apache.ignite.cache.CacheMode; +import org.apache.ignite.configuration.CacheConfiguration; +import org.apache.ignite.configuration.IgniteConfiguration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.tika.exception.TikaConfigException; +import org.apache.tika.pipes.core.config.ConfigStore; +import org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig; +import org.apache.tika.plugins.ExtensionConfig; + +/** + * Apache Ignite-based implementation of {@link ConfigStore}. + * Provides distributed configuration storage for Tika Pipes clustering. + * <p> + * This implementation is thread-safe and suitable for multi-instance deployments + * where configurations need to be shared across multiple servers. + * <p> + * Configuration options: + * <ul> + * <li>cacheName - Name of the Ignite cache (default: "tika-config-store")</li> + * <li>cacheMode - Cache replication mode: PARTITIONED or REPLICATED (default: REPLICATED)</li> + * <li>igniteInstanceName - Name of the Ignite instance (default: "TikaIgniteConfigStore")</li> + * </ul> + */ +public class IgniteConfigStore implements ConfigStore { + + private static final Logger LOG = LoggerFactory.getLogger(IgniteConfigStore.class); + private static final String DEFAULT_CACHE_NAME = "tika-config-store"; + private static final String DEFAULT_INSTANCE_NAME = "TikaIgniteConfigStore"; + + private Ignite ignite; + private IgniteCache<String, ExtensionConfigDTO> cache; + private String cacheName = DEFAULT_CACHE_NAME; + private CacheMode cacheMode = CacheMode.REPLICATED; + private String igniteInstanceName = DEFAULT_INSTANCE_NAME; + private boolean autoClose = true; + private ExtensionConfig extensionConfig; + + public IgniteConfigStore() { + } + + public IgniteConfigStore(ExtensionConfig extensionConfig) throws TikaConfigException { + this.extensionConfig = extensionConfig; + + IgniteConfigStoreConfig config = IgniteConfigStoreConfig.load(extensionConfig.json()); + this.cacheName = config.getCacheName(); + this.cacheMode = config.getCacheModeEnum(); + this.igniteInstanceName = config.getIgniteInstanceName(); + this.autoClose = config.isAutoClose(); + } + + public IgniteConfigStore(String cacheName) { + this.cacheName = cacheName; + } + + @Override + public ExtensionConfig getExtensionConfig() { + return extensionConfig; + } + + @Override + public void init() throws Exception { + if (ignite != null) { + LOG.warn("IgniteConfigStore already initialized"); + return; + } + + LOG.info("Initializing IgniteConfigStore with cache: {}, mode: {}, instance: {}", + cacheName, cacheMode, igniteInstanceName); + + IgniteConfiguration cfg = new IgniteConfiguration(); + cfg.setIgniteInstanceName(igniteInstanceName); + cfg.setClientMode(false); + + ignite = Ignition.start(cfg); + + CacheConfiguration<String, ExtensionConfigDTO> cacheCfg = new CacheConfiguration<>(cacheName); + cacheCfg.setCacheMode(cacheMode); + cacheCfg.setBackups(cacheMode == CacheMode.PARTITIONED ? 1 : 0); + + cache = ignite.getOrCreateCache(cacheCfg); + LOG.info("IgniteConfigStore initialized successfully"); + } + + @Override + public void put(String id, ExtensionConfig config) { + if (cache == null) { + throw new IllegalStateException("IgniteConfigStore not initialized. Call init() first."); + } + cache.put(id, new ExtensionConfigDTO(config)); + } + + @Override + public ExtensionConfig get(String id) { + if (cache == null) { + throw new IllegalStateException("IgniteConfigStore not initialized. Call init() first."); + } + ExtensionConfigDTO dto = cache.get(id); + return dto != null ? dto.toExtensionConfig() : null; + } + + @Override + public boolean containsKey(String id) { + if (cache == null) { + throw new IllegalStateException("IgniteConfigStore not initialized. Call init() first."); + } + return cache.containsKey(id); + } + + @Override + public Set<String> keySet() { + if (cache == null) { + throw new IllegalStateException("IgniteConfigStore not initialized. Call init() first."); + } + return Set.copyOf(cache.query(new org.apache.ignite.cache.query.ScanQuery<String, ExtensionConfigDTO>()) + .getAll() + .stream() + .map(entry -> entry.getKey()) Review Comment: The lambda can be simplified by removing the redundant map operation. Instead of .map(entry -> entry.getKey()), use the method reference .map(javax.cache.Cache.Entry::getKey) for better readability and conciseness. ```suggestion .map(javax.cache.Cache.Entry::getKey) ``` ########## tika-pipes/tika-pipes-core/src/main/java/org/apache/tika/pipes/core/config/ConfigStoreFactory.java: ########## @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.tika.pipes.core.config; + +import java.io.IOException; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.pf4j.PluginManager; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.tika.exception.TikaConfigException; +import org.apache.tika.plugins.ExtensionConfig; +import org.apache.tika.plugins.TikaExtensionFactory; + +/** + * Factory interface for creating ConfigStore instances. + * Implementations should be annotated with @Extension to be discovered by PF4J. + */ +public interface ConfigStoreFactory extends TikaExtensionFactory<ConfigStore> { + + Logger LOG = LoggerFactory.getLogger(ConfigStoreFactory.class); + + /** + * Creates a ConfigStore instance based on configuration. + * + * @param pluginManager the plugin manager + * @param configStoreType the type of ConfigStore to create + * @param extensionConfig optional configuration for the store + * @return a ConfigStore instance + * @throws TikaConfigException if the store cannot be created + */ + static ConfigStore createConfigStore(PluginManager pluginManager, String configStoreType, + ExtensionConfig extensionConfig) + throws TikaConfigException { + if (configStoreType == null || configStoreType.isEmpty() || "memory".equalsIgnoreCase(configStoreType)) { + LOG.info("Creating InMemoryConfigStore"); + InMemoryConfigStore store = new InMemoryConfigStore(); + if (extensionConfig != null) { + store.setExtensionConfig(extensionConfig); + } + return store; + } + + Map<String, ConfigStoreFactory> factoryMap = loadAllConfigStoreFactoryExtensions(pluginManager); + + ConfigStoreFactory factory = factoryMap.get(configStoreType); + if (factory != null) { + return configStoreByConfigByFactoryName(configStoreType, extensionConfig, factory); + } + return configStoreByFullyQualifiedClassName(configStoreType, extensionConfig, factoryMap); + } + + private static ConfigStore configStoreByConfigByFactoryName(String configStoreType, ExtensionConfig extensionConfig, ConfigStoreFactory factory) throws TikaConfigException { + LOG.info("Creating ConfigStore using factory: {}", factory.getName()); + try { + ExtensionConfig config = extensionConfig != null ? extensionConfig : + new ExtensionConfig(configStoreType, configStoreType, "{}"); + return factory.buildExtension(config); + } catch (IOException e) { + throw new TikaConfigException("Failed to create ConfigStore: " + configStoreType, e); + } + } + + private static ConfigStore configStoreByFullyQualifiedClassName(String configStoreType, + ExtensionConfig extensionConfig, + Map<String, ConfigStoreFactory> factoryMap) throws TikaConfigException { + try { + LOG.info("Creating ConfigStore from class: {}", configStoreType); + Class<?> storeClass = Class.forName(configStoreType); + if (!ConfigStore.class.isAssignableFrom(storeClass)) { + throw new TikaConfigException( + "Class " + configStoreType + " does not implement ConfigStore interface"); + } + ConfigStore store = (ConfigStore) storeClass.getDeclaredConstructor().newInstance(); + if (extensionConfig != null) { + ((InMemoryConfigStore) store).setExtensionConfig(extensionConfig); + } + return store; + } catch (ClassNotFoundException e) { + throw new TikaConfigException( + "Unknown ConfigStore type: " + configStoreType + + ". Available types: memory, " + String.join(", ", factoryMap.keySet()), e); + } catch (Exception e) { + throw new TikaConfigException("Failed to instantiate ConfigStore: " + configStoreType, e); + } + } + + private static Map<String, ConfigStoreFactory> loadAllConfigStoreFactoryExtensions(PluginManager pluginManager) { + List<ConfigStoreFactory> factories = pluginManager.getExtensions(ConfigStoreFactory.class); + Map<String, ConfigStoreFactory> factoryMap = new HashMap<>(); + for (ConfigStoreFactory factory : factories) { + factoryMap.put(factory.getName(), factory); + } + return factoryMap; + } +} Review Comment: The new ConfigStoreFactory class and its static createConfigStore method lack test coverage. Given that tests exist for ConfigStore implementations in ConfigStoreTest.java, tests should be added for the factory methods to ensure proper creation of ConfigStore instances, error handling, and the fully-qualified class name fallback mechanism. ########## tika-pipes/tika-pipes-plugins/tika-pipes-ignite/src/main/java/org/apache/tika/pipes/ignite/config/IgniteConfigStoreConfig.java: ########## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.tika.pipes.ignite.config; + +import com.fasterxml.jackson.core.JsonProcessingException; +import com.fasterxml.jackson.databind.ObjectMapper; +import org.apache.ignite.cache.CacheMode; + +import org.apache.tika.exception.TikaConfigException; + +/** + * Configuration for IgniteConfigStore. + * + * Example JSON configuration: + * <pre> + * { + * "cacheName": "my-tika-cache", + * "cacheMode": "REPLICATED", + * "igniteInstanceName": "MyTikaCluster", + * "autoClose": true + * } + * </pre> + */ +public class IgniteConfigStoreConfig { + + private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper(); + + public static IgniteConfigStoreConfig load(final String json) throws TikaConfigException { + try { + if (json == null || json.trim().isEmpty() || "{}".equals(json.trim())) { + return new IgniteConfigStoreConfig(); + } + return OBJECT_MAPPER.readValue(json, IgniteConfigStoreConfig.class); + } catch (JsonProcessingException e) { + throw new TikaConfigException("Failed to parse IgniteConfigStoreConfig from JSON", e); + } + } + + private String cacheName = "tika-config-store"; + private String cacheMode = "REPLICATED"; + private String igniteInstanceName = "TikaIgniteConfigStore"; + private boolean autoClose = true; + + public String getCacheName() { + return cacheName; + } + + public IgniteConfigStoreConfig setCacheName(String cacheName) { + this.cacheName = cacheName; + return this; + } + + public String getCacheMode() { + return cacheMode; + } + + public IgniteConfigStoreConfig setCacheMode(String cacheMode) { + this.cacheMode = cacheMode; + return this; + } + + public CacheMode getCacheModeEnum() { + if ("PARTITIONED".equalsIgnoreCase(cacheMode)) { + return CacheMode.PARTITIONED; + } + return CacheMode.REPLICATED; Review Comment: The getCacheModeEnum method only handles PARTITIONED and REPLICATED cache modes, defaulting to REPLICATED for any other value. This could silently ignore invalid cache mode values like typos. Consider validating the cacheMode value and throwing an exception for invalid modes, or at minimum logging a warning when falling back to the default. ```suggestion if (cacheMode == null || cacheMode.trim().isEmpty()) { // Preserve default behavior: fall back to REPLICATED when not specified return CacheMode.REPLICATED; } if ("PARTITIONED".equalsIgnoreCase(cacheMode)) { return CacheMode.PARTITIONED; } if ("REPLICATED".equalsIgnoreCase(cacheMode)) { return CacheMode.REPLICATED; } throw new IllegalArgumentException( "Unsupported cacheMode: '" + cacheMode + "'. Supported values are PARTITIONED and REPLICATED."); ``` ########## tika-grpc/src/test/resources/tika-config-ignite.json: ########## @@ -0,0 +1,29 @@ +{ + "pipes": { + "configStoreType": "ignite", + "configStoreParams": { + "cacheName": "my-tika-cache", + "cacheMode": "REPLICATED", + "igniteInstanceName": "MyTikaCluster", + "autoClose": true + } Review Comment: The configStoreParams is defined as a JSON object here, but in dev-tika-config.json it's defined as a JSON string. This inconsistency may cause confusion. The PipesConfig expects configStoreParams to be a String (as shown in the code), so this JSON object format will be serialized to a string when this config is loaded. Consider documenting which format is expected or providing consistent examples. ```suggestion "configStoreParams": "{\"cacheName\":\"my-tika-cache\",\"cacheMode\":\"REPLICATED\",\"igniteInstanceName\":\"MyTikaCluster\",\"autoClose\":true}", ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
