Tim Allison created TIKA-4539:
---------------------------------
Summary: Lombok and jackson for configs in 4.x?
Key: TIKA-4539
URL: https://issues.apache.org/jira/browse/TIKA-4539
Project: Tika
Issue Type: Improvement
Reporter: Tim Allison
For tika-server, gRPC, and some other use cases, it would be useful to have an
"initialization" config for parsers and other components, plus an "update"
capability.
I hacked something out for the PDFParserConfig where we literally store the
method names for what was updated. This is really unpleasant.
If we're willing to use Lombok (I had to install a plugin in IntelliJ, but it was
easy), we could create a ConfigBase like so:
{noformat}
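import java.io.IOException;

import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.databind.ObjectMapper;

public class ConfigBase {

    static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();

    // JSON snapshot of the basis config, taken lazily on the first update.
    // Later changes to the basis are not reflected once this is cached.
    @JsonIgnore
    private byte[] baseJson;

    protected <T extends ConfigBase> T update(T basis, String update, Class<T> clazz)
            throws IOException {
        if (baseJson == null) {
            baseJson = OBJECT_MAPPER.writeValueAsBytes(basis);
        }
        // Deserialize a fresh copy so the basis itself is never modified
        T base = OBJECT_MAPPER.readValue(baseJson, clazz);
        // Overlay only the fields present in the update JSON
        return OBJECT_MAPPER
                .readerForUpdating(base)
                .readValue(update);
    }
}{noformat}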
Then, for the PDFParserConfig, there are a lot of annotations, but we don't need
to write constructors or setters/getters by hand.
{noformat}
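import java.io.IOException;

import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.NoArgsConstructor;

@EqualsAndHashCode(callSuper = true)
@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
public class PDFParserConfig extends ConfigBase {

    @Builder.Default
    private String color = "blue";

    @Builder.Default
    private int length = -5;

    @Builder.Default
    private int width = 10;

    // Returns an updated copy; this instance is left as-is
    public PDFParserConfig cloneAndUpdate(String json) throws IOException {
        return update(this, json, PDFParserConfig.class);
    }
} {noformat}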
Then we can see results:
{noformat}
@Test
public void testOne() throws Exception {
    System.out.println("default: " + new PDFParserConfig());
    // Lombok's @Builder generates the static builder() factory
    PDFParserConfig basis = PDFParserConfig.builder().color("white").build();
    String json = """
            {
                "color":"green"
            }
            """;
    System.out.println("BASIS before: " + basis);
    System.out.println(basis.cloneAndUpdate(json));
    System.out.println("BASIS after: " + basis);
} {noformat}
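For comparison, here is a rough sketch of the contract the test above is meant to show (the update yields a new object and the basis is left untouched), written against the plain JDK with no Lombok or Jackson. Every name in it (ConfigUpdateSketch, SketchConfig, the Map-based update) is hypothetical and only for illustration; it is not Tika code and not the proposed implementation.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical, JDK-only illustration of "basis config + per-request update".
public class ConfigUpdateSketch {

    // Immutable stand-in for a parser config with the three example fields.
    public record SketchConfig(String color, int length, int width) {

        private static final Set<String> KEYS = Set.of("color", "length", "width");

        // Returns a new config: keys present in the update override the basis,
        // missing keys keep the basis value. The basis is never mutated.
        public SketchConfig cloneAndUpdate(Map<String, Object> update) {
            for (String key : update.keySet()) {
                if (!KEYS.contains(key)) {
                    throw new IllegalArgumentException("Unknown config key: " + key);
                }
            }
            return new SketchConfig(
                    (String) update.getOrDefault("color", color),
                    (Integer) update.getOrDefault("length", length),
                    (Integer) update.getOrDefault("width", width));
        }
    }

    public static void main(String[] args) {
        SketchConfig basis = new SketchConfig("white", -5, 10);
        SketchConfig updated = basis.cloneAndUpdate(Map.of("color", "green"));
        System.out.println("BASIS:   " + basis);
        System.out.println("UPDATED: " + updated);
    }
}
```

The hand-written version needs one line per field in cloneAndUpdate, which is the boilerplate the Lombok and Jackson approach would avoid.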