Caleb Cushing created TIKA-3429:
-----------------------------------
Summary: Performance problems partially caused by tika eagerly
loading configuration
Key: TIKA-3429
URL: https://issues.apache.org/jira/browse/TIKA-3429
Project: Tika
Issue Type: New Feature
Reporter: Caleb Cushing
referencing
https://github.com/spring-projects/spring-boot/issues/26709#issuecomment-851953515
{quote}
the tika configuration (eagerly loading a 7K lines XML file)
{quote}
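For illustration, here is a minimal sketch of deferring an expensive load until first use. The names `LazyConfigDemo`, `Lazy`, and `expensiveLoad` are hypothetical stand-ins, not Tika classes, and this is not a description of how Tika loads its configuration. It only shows the general memoizing-supplier technique that lazy loading would rely on.

```java
import java.util.function.Supplier;

/** Hypothetical demo class; nothing here is part of Tika or Spring Boot. */
final class LazyConfigDemo {

    /** Wraps a loader so the value is computed at most once, on the first get(). */
    static final class Lazy<T> implements Supplier<T> {
        private final Supplier<T> loader;
        private volatile T value; // volatile makes the double-checked locking below safe

        Lazy(Supplier<T> loader) {
            this.loader = loader;
        }

        @Override
        public T get() {
            T result = value;
            if (result == null) {
                synchronized (this) {
                    result = value;
                    if (result == null) {
                        // Assumes the loader never returns null; a null would be reloaded.
                        result = loader.get();
                        value = result;
                    }
                }
            }
            return result;
        }
    }

    /** Counts how often the "expensive" load actually ran. */
    static int loads = 0;

    /** Stand-in for parsing a large configuration file. */
    static String expensiveLoad() {
        loads++;
        return "parsed-config";
    }

    public static void main(String[] args) {
        Lazy<String> config = new Lazy<>(LazyConfigDemo::expensiveLoad);
        // Argument validation can fail fast without paying for the load.
        if (args.length == 0) {
            System.out.println("missing arguments, loads so far: " + loads);
            return;
        }
        System.out.println(config.get() + ", loads: " + loads);
    }
}
```

With this shape, a run that errors out on missing options never triggers the load at all, which is the case described in this report.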
Here's the text of that Spring Boot issue:
I'm not sure the problem is Spring Boot, but I'm having trouble finding it.
The jar currently takes 3 seconds (9 if I leave out tiered) to run on my
system, just to error out due to missing options and do nothing.
https://github.com/xenoterracide/brix/tree/8e3d86bcf773e564cc24b51572b0bbd8bb60b73f
{code}
time java -Xverify:none -XX:TieredStopAtLevel=1 -jar modules/app/build/libs/app-0.1.0.jar
# brix -> ccushing/copy-5-1
Missing required parameters: '<language>', '<moduleType>', '<project>'
Usage: <main class> [--repo=<repo>] [--workdir=<workdir>] <language>
                    <moduleType> <project> [COMMAND]
      <language>            The programming language you're generating code
                              for. Directory under --dir
      <moduleType>          The type of code you're generating e.g controller,
                              also the name of the config file without the
                              extension.
      <project>             The name of the project you're generating code for.
                              The name of the module to be created within the
                              project.
      --repo=<repo>         Repository path from the current working directory.
                              Templates and configs are looked up relative to
                              here. If the config isn't found here, then we
                              will search ~/.config/brix
      --workdir=<workdir>   The working directory you want your destination
                              paths to be relative to. Defaults to current
                              working directory
                              Default:
Commands:
  run
java -Xverify:none -XX:TieredStopAtLevel=1 -jar  3.15s user 0.26s system 142% cpu 2.386 total
{code}
Since it's a CLI app, lazy init isn't helpful. This is worded like a question,
one that really would not be suitable for Stack Overflow. (I hate that SO is
the support forum for things now. It's terrible because of the attitude that
the objective is not to help people, and it's bad at getting answers for
harder problems. Spring should get a Discourse or something again.) But I also
know I had a Tika CLI app in the past that loaded in less than 1s without
tiered, so I'm also concerned it's a Spring Boot bug. I'm going to connect a
profiler later to see what I can find, but I'm not sure that will do it.
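Before reaching for a profiler, a crude timing harness can narrow down where the startup time goes. The sketch below is an assumption-laden illustration: `StartupTimer` and `timeNanos` are hypothetical names, and the empty lambda stands in for whatever initialization step is under suspicion. It uses only the standard `System.nanoTime()` and `RuntimeMXBean.getUptime()` calls.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical helper for rough startup measurements; not part of any library. */
final class StartupTimer {

    /** Runs one step and returns its elapsed wall-clock time in nanoseconds. */
    static long timeNanos(Runnable step) {
        long start = System.nanoTime();
        step.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Milliseconds since the JVM started, measured as main() begins.
        long jvmUptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.println("JVM uptime at main(): " + jvmUptimeMs + " ms");

        // Replace the empty body with the initialization step to be measured.
        long elapsed = timeNanos(() -> { });
        System.out.println("init step took " + (elapsed / 1_000_000) + " ms");
    }
}
```

Loading the management classes has a small cost of its own, so the numbers are only good for comparing steps against each other.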
{code}
Fedora 33
5.11.16-200.fc33.x86_64
14:08:34 up 3 days, 2:04, 1 user, load average: 0.79, 1.10, 1.66
              total        used        free      shared  buff/cache   available
Mem:            15G         11G        1.0G        1.4G        3.0G        2.3G
Swap:           12G        1.5G         10G
{code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)