[
https://issues.apache.org/jira/browse/TIKA-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355195#comment-17355195
]
Caleb Cushing edited comment on TIKA-3429 at 6/1/21, 4:02 PM:
--------------------------------------------------------------
The 9 seconds has something to do with Fedora Linux. I have temporarily
uninstalled Fedora due to other cross-platform issues, with the intent to
install Manjaro Linux later. Depending on the results there, I will probably
open issues elsewhere.
This is death by 1000 cuts. Currently, on master, things run at about 3 seconds
(on Windows), which IMO is still too slow for a CLI app that's not doing
anything. About 1000ms is from Picocli, plus another 800ms from Tika; it adds
up. Though, as I mentioned, I have started lazy loading the Tika objects, so
I'm only dealing with about 2.5 seconds in my `ccushing/performance-9` branch.
I may try that lazy loading in a different way though, as I believe the proxies
are also adding some time compared to `ObjectFactory`.
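For illustration only, here is a minimal sketch of the kind of proxy-free lazy loading described above. The `Lazy` class is a hypothetical helper, not code from the `ccushing/performance-9` branch, from Tika, or from Spring; it simply defers an expensive construction until the first call to `get()` and caches the result.

```java
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical helper: computes a value on first use, then caches it. */
public final class Lazy<T> implements Supplier<T> {
    // The expensive construction; dropped after first use so it can be collected.
    private Supplier<? extends T> initializer;
    private T value;
    // Volatile flag publishes the cached value safely to other threads.
    private volatile boolean initialized;

    public Lazy(Supplier<? extends T> initializer) {
        this.initializer = Objects.requireNonNull(initializer);
    }

    @Override
    public T get() {
        if (!initialized) {
            synchronized (this) {
                if (!initialized) {
                    value = initializer.get(); // runs at most once
                    initializer = null;
                    initialized = true;
                }
            }
        }
        return value;
    }
}
```

Assuming the app constructs `org.apache.tika.Tika` directly, a field such as `new Lazy<>(() -> new Tika())` would pay the configuration-loading cost only on code paths that actually call `get()`, so a run that fails on missing arguments would never trigger it.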
was (Author: xenoterracide):
The 9 seconds has something to do with Fedora Linux. I have temporarily
uninstalled Fedora due to other cross-platform issues, with the intent to
install Manjaro Linux later. Depending on the results there, I will probably
open issues elsewhere.
This is death by 1000 cuts. Currently, on master, things run at about 3
seconds, which IMO is still too slow for a CLI app that's not doing anything.
About 1000ms is from Picocli, plus another 800ms from Tika; it adds up. Though,
as I mentioned, I have started lazy loading the Tika objects, so I'm only
dealing with about 2.5 seconds in my `ccushing/performance-9` branch. I may try
that lazy loading in a different way though, as I believe the proxies are also
adding some time compared to `ObjectFactory`.
> Performance problems partially caused by tika eagerly loading configuration
> ---------------------------------------------------------------------------
>
> Key: TIKA-3429
> URL: https://issues.apache.org/jira/browse/TIKA-3429
> Project: Tika
> Issue Type: New Feature
> Reporter: Caleb Cushing
> Priority: Major
>
> referencing
> https://github.com/spring-projects/spring-boot/issues/26709#issuecomment-851953515
> {quote}
> the tika configuration (eagerly loading a 7K lines XML file)
> {quote}
> Here's the text of that issue
> I'm not sure the problem is Spring Boot, but I'm having trouble finding it.
> The jar currently takes 3 seconds (9 if I leave out tiered) to run on my
> system, just to error out due to missing options and do nothing.
> https://github.com/xenoterracide/brix/tree/8e3d86bcf773e564cc24b51572b0bbd8bb60b73f
> {code}
> time java -Xverify:none -XX:TieredStopAtLevel=1 -jar modules/app/build/libs/app-0.1.0.jar
> # brix -> ccushing/copy-5-1
> Missing required parameters: '<language>', '<moduleType>', '<project>'
> Usage: <main class> [--repo=<repo>] [--workdir=<workdir>] <language>
>                     <moduleType> <project> [COMMAND]
>       <language>            The programming language you're generating code
>                               for. Directory under --dir
>       <moduleType>          The type of code you're generating e.g controller,
>                               also the name of the config file without the
>                               extension.
>       <project>             The name of the project you're generating code
>                               for.
>                             The name of the module to be created within the
>                               project.
>       --repo=<repo>         Repository path from the current working
>                               directory.
>                             Templates and configs are looked up relative to
>                               here. If the config isn't found here, then we
>                               will search ~/.config/brix
>       --workdir=<workdir>   The working directory you want your destination
>                               paths to be relative to. Defaults to current
>                               working directory
>                               Default:
> Commands:
>   run
> java -Xverify:none -XX:TieredStopAtLevel=1 -jar  3.15s user 0.26s system 142% cpu 2.386 total
> {code}
> Since it's a CLI app, lazy init isn't helpful. This is worded like a question
> (that really would not be suitable for Stack Overflow; I hate that SO is the
> support forum for things now. It's terrible because of the attitude of people
> that the objective is not to help people, and it's also bad at getting
> answers for harder problems. Spring should get a Discourse or something
> again), but I also know I had a Tika CLI app in the past that loaded in less
> than 1s without tiered, so I'm also concerned it's a Spring Boot bug. I'm
> going to connect a profiler later to see what I can find, but I'm not sure
> that will do it.
> {code}
> Fedora 33
> 5.11.16-200.fc33.x86_64
> 14:08:34 up 3 days, 2:04, 1 user, load average: 0.79, 1.10, 1.66
>               total        used        free      shared  buff/cache   available
> Mem:            15G         11G        1.0G        1.4G        3.0G        2.3G
> Swap:           12G        1.5G         10G
> {code}