On 05/11/2012 11:26, sebb wrote:
On 3 November 2012 19:23, Milamber<[email protected]> wrote:
Hello,
I am currently working on adding Apache Tika 1.2 [1] to JMeter to improve
functional testing.
With Tika, you can extract the text from various documents, such as MS Office
(Word, Excel, PowerPoint; 97-2003 and 2007-2010 (OpenXML)), OpenOffice (Writer,
Calc, Impress), HTML, gz, jar/zip files (list of contents), and some
"multimedia" files like mp3, mp4, flv, etc.
In JMeter, Tika can be used by the View Results Tree to view the text data
of these files, by the Regular Expression Extractor to capture text from these
files, and by the Response Assertion to assert on the data.
The drawback is that Apache Tika requires a big jar (25 MB) or a lot of jar
files (see below). With all the jars in the binary package, the new size (for
the tgz) is 45 MB (JMeter 2.8 tgz: 23 MB).
The question: do you agree to add Tika (and the new capability to "extract text
from a document") to JMeter, given the new binary size?
Secondary question: which is the better way? 1/ Add only tika-app.jar (which
includes all dependencies) [2], or 2/ Add several jar files (tika-core,
tika-parsers, etc. + dependencies) [3]
I'm concerned that using Tika would double the size of JMeter.
Although the extra features would be useful, I suspect that most test
cases won't need the extra functionality.
Would it be possible to make the Tika jars optional?
i.e. add the functionality, but if the jars are not present it is disabled.
Yes, that seems possible via a dynamic class check / loading.
If we accept that developers must download Tika, then it should be
easy enough to structure the add-on so that JMeter can fail gracefully
if the jars are missing.
But ideally developers would not need to download all the jars either.
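
To make the "fail gracefully" idea concrete, here is a minimal sketch of what such a check could look like. It is not taken from the JMeter code base: the class name OptionalTikaSupport and the message text are hypothetical. It assumes the Tika facade class org.apache.tika.Tika and its parseToString(InputStream) method, and calls them only through reflection so that nothing here needs the Tika jars at compile time.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.lang.reflect.Method;

// Hypothetical helper, for illustration only (not an existing JMeter class).
public class OptionalTikaSupport {

    // Tika facade class; looked up by name so this file compiles without Tika.
    static final String TIKA_FACADE_CLASS = "org.apache.tika.Tika";

    // Hypothetical message shown when the optional jar is absent.
    static final String MISSING_MESSAGE =
            "Tika is not available: add tika-app.jar to the classpath to enable document text extraction.";

    /** Returns true if the named class can be found on the classpath. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, OptionalTikaSupport.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Extracts text through Tika if present, otherwise returns MISSING_MESSAGE. */
    static String extractText(InputStream in) {
        if (!isClassAvailable(TIKA_FACADE_CLASS)) {
            return MISSING_MESSAGE;
        }
        try {
            Class<?> tikaClass = Class.forName(TIKA_FACADE_CLASS);
            Object tika = tikaClass.getDeclaredConstructor().newInstance();
            Method parse = tikaClass.getMethod("parseToString", InputStream.class);
            return (String) parse.invoke(tika, in);
        } catch (ReflectiveOperationException | LinkageError e) {
            // A dependency may be missing or the parse may have failed: report, do not crash.
            return "Tika extraction failed: " + e;
        }
    }

    public static void main(String[] args) {
        byte[] sample = "<html><body>Hello</body></html>".getBytes();
        System.out.println(extractText(new ByteArrayInputStream(sample)));
    }
}
```

With this shape, neither the build nor the runtime has a hard dependency on Tika, at the cost of losing compile-time checking of the Tika calls.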
Currently, to compile the "tika" elements, we need only these jars:
tika-core.jar
tika-parsers.jar
For the binary release, we need to add these jars (full list):
apache-mime4j-core.jar
apache-mime4j-dom.jar
asm.jar
aspectjrt.jar
boilerpipe.jar
commons-compress.jar
dom4j.jar
fontbox.jar
geronimo-stax-api_1.0_spec.jar
gson.jar
isoparser.jar
jempbox.jar
juniversalchardet.jar
log4j.jar
metadata-extractor.jar
netcdf.jar
pdfbox.jar
poi-ooxml-schemas.jar
poi-ooxml.jar
poi-scratchpad.jar
poi.jar
rome.jar
slf4j-api.jar
slf4j-log4j12.jar
tagsoup.jar
tika-core.jar
tika-parsers.jar
tika-xmp.jar
vorbis-java-core.jar
vorbis-java-tika.jar
xmlbeans.jar
xmpcore.jar
xz.jar
Or only tika-app.jar (25 MB)
So, we can add the "tika" functionality with dynamic class loading, and
add some warning messages telling the user to download tika-app.jar if
they want the Tika behavior:
For View Results Tree, when "Document" is chosen in the combo list: a
message in Response data to indicate the missing tika-app.jar (with some
indication of where to download it).
For the Regular Expression Extractor and Response Assertion, if tika-app.jar
is missing: a warning dialog showing the message when the radio button
"Response as a Document" is selected.
And in all cases, a warning message in jmeter.log.
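
As a rough illustration of the warning behavior described above, here is a small sketch. All names are hypothetical, and it uses java.util.logging as a stand-in for JMeter's own logging. It returns the text that a GUI element could display and writes the warning to the log only the first time, which is an assumption of this sketch (to avoid flooding the log), not something decided in this thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Logger;

// Hypothetical helper, for illustration only (not an existing JMeter class).
public class MissingTikaNotifier {

    // Stand-in logger; in JMeter the message would end up in jmeter.log.
    private static final Logger LOG =
            Logger.getLogger(MissingTikaNotifier.class.getName());

    // Hypothetical wording; the download location is deliberately left out.
    static final String MESSAGE =
            "Document text extraction is disabled: tika-app.jar was not found on the classpath.";

    private final AtomicBoolean logged = new AtomicBoolean(false);

    /**
     * Returns the text to show in Response data or in a warning dialog.
     * The log entry is written only on the first call.
     */
    String notifyMissing() {
        if (logged.compareAndSet(false, true)) {
            LOG.warning(MESSAGE);
        }
        return MESSAGE;
    }

    /** True once the warning has been written to the log. */
    boolean hasLogged() {
        return logged.get();
    }

    public static void main(String[] args) {
        MissingTikaNotifier notifier = new MissingTikaNotifier();
        // A View Results Tree renderer or an assertion GUI could display this text.
        System.out.println(notifier.notifyMissing());
    }
}
```

The three elements (View Results Tree, Regular Expression Extractor, Response Assertion) could share one such notifier so the wording stays identical everywhere.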