On Sun, 21 Apr 2024, Michael Wechner wrote:
Thanks for the pointer to the Generative Tooling rules, which I was not
aware of so far.
At the bottom it says that the ASF does not tell developers what tools
to use, but I think it would be useful to have some concrete
examples, which
On Fri, 19 Apr 2024, Nicholas DiPiazza wrote:
Can I get an open source license for GitHub copilot?
I've not heard of anyone offering that. Some of the open and open-ish
models are quite good on coding tasks, though you'd need to hop to a
different interface to ask for help (unlike the
On Thu, 11 Apr 2024, Tim Allison wrote:
I just excluded joda-time because of this: CVE-2024-23080
https://nvd.nist.gov/vuln/detail/CVE-2024-23080
This is an NPE in joda-time version 2.12.5. That's two versions before the
current... is it actually still in there? And more importantly, an NPE is
On Mon, 8 Apr 2024, Tim Allison wrote:
Not sure we should jump on the bandwagon, but anything we can do to
support smart chunking would benefit us.
Could just be more integrations with parsers that turn out to be useful. I
haven’t had much joy with some. Here’s one that I haven’t evaluated
[
https://issues.apache.org/jira/browse/TIKA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830867#comment-17830867
]
Nick Burch commented on TIKA-4223:
--
A lot of the early file extension allocations were taken from
[
https://issues.apache.org/jira/browse/TIKA-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827017#comment-17827017
]
Nick Burch commented on TIKA-4210:
--
The attached file seems to be an RTF file. I'm not sure what a "
[
https://issues.apache.org/jira/browse/TIKA-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824965#comment-17824965
]
Nick Burch commented on TIKA-4208:
--
I would expect that the json output version would need a bit more
[
https://issues.apache.org/jira/browse/TIKA-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824874#comment-17824874
]
Nick Burch commented on TIKA-4208:
--
How much heap size do you have allocated?
The error suggests
[
https://issues.apache.org/jira/browse/TIKA-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816788#comment-17816788
]
Nick Burch commented on TIKA-3784:
--
From [https://datatracker.ietf.org/doc/rfc7292/] it looks l
[
https://issues.apache.org/jira/browse/TIKA-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787608#comment-17787608
]
Nick Burch commented on TIKA-4148:
--
For detection of the OLE2 based files, we don't need to find unique
[
https://issues.apache.org/jira/browse/TIKA-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Burch updated TIKA-4119:
-
Component/s: mime
> Return media type "text/javascript" instead of "application/javas
[
https://issues.apache.org/jira/browse/TIKA-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Burch updated TIKA-4119:
-
Labels: tika-3x (was: )
> Return media type "text/javascript" instead of "appl
[
https://issues.apache.org/jira/browse/TIKA-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759921#comment-17759921
]
Nick Burch commented on TIKA-4119:
--
I wonder if this is a big enough change around Detection that we
[
https://issues.apache.org/jira/browse/TIKA-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750344#comment-17750344
]
Nick Burch commented on TIKA-4062:
--
Between holidays and the length of time needed for regression runs
[
https://issues.apache.org/jira/browse/TIKA-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748454#comment-17748454
]
Nick Burch commented on TIKA-4064:
--
Depends if anyone else on the PMC has the time to be release manager
[
https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748452#comment-17748452
]
Nick Burch commented on TIKA-3948:
--
[~solomax] I think the first task is to identify any other areas
[
https://issues.apache.org/jira/browse/TIKA-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17741578#comment-17741578
]
Nick Burch commented on TIKA-4098:
--
The more bytes beyond the start we check for the PDF marker, the more
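The effect of the search window can be shown with a small sketch. This is not Tika's detector; the function name and the window sizes are assumptions chosen for illustration. It only checks whether the %PDF- marker starts within the first N bytes, so a larger window tolerates leading junk but also matches plain text that merely mentions the marker.

```python
PDF_MARKER = b"%PDF-"

def has_pdf_marker(data: bytes, window: int) -> bool:
    """Return True if the PDF marker starts within the first `window` bytes."""
    # The marker may start at any offset in [0, window), so the slice must
    # extend far enough to hold a marker that starts at offset window - 1.
    return data.find(PDF_MARKER, 0, window + len(PDF_MARKER) - 1) != -1

clean_pdf = b"%PDF-1.7\n..."
junk_then_pdf = b"\x00" * 100 + b"%PDF-1.4\n..."
plain_text = b"A file that starts with %PDF-1.4 is usually a PDF."

for window in (1, 1024):
    print(window,
          has_pdf_marker(clean_pdf, window),
          has_pdf_marker(junk_then_pdf, window),
          has_pdf_marker(plain_text, window))
```

With a window of 1 only the clean file matches; with a window of 1024 the file with leading junk matches, and so does the plain text note.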
[
https://issues.apache.org/jira/browse/TIKA-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730728#comment-17730728
]
Nick Burch commented on TIKA-4060:
--
I'm a muppet... had forgotten to escape the hex characters
[
https://issues.apache.org/jira/browse/TIKA-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Burch resolved TIKA-4060.
--
Fix Version/s: 2.8.1
Resolution: Fixed
> Add magic to audio/aac in tika-mimetypes.
[
https://issues.apache.org/jira/browse/TIKA-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730649#comment-17730649
]
Nick Burch commented on TIKA-4060:
--
0x494443 is the string ID3, which I think ought to be at the start
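Hex magic values are easy to misread, so a quick decode is a useful check. The helper below is a hypothetical illustration, not part of Tika. Note that the ASCII string ID3 is the bytes 49 44 33, whereas the value as quoted above, 49 44 43, decodes to IDC, so one of the two may contain a typo.

```python
def magic_to_text(hex_magic: str) -> str:
    """Decode a hex magic value such as '0x494433' into ASCII text."""
    # Accept the value with or without a leading "0x".
    digits = hex_magic[2:] if hex_magic.lower().startswith("0x") else hex_magic
    return bytes.fromhex(digits).decode("ascii")

print(magic_to_text("0x494433"))  # bytes 49 44 33
print(magic_to_text("0x494443"))  # bytes 49 44 43
```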
[
https://issues.apache.org/jira/browse/TIKA-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730304#comment-17730304
]
Nick Burch commented on TIKA-4060:
--
I have created some small test AAC files using ffmpeg, and then had
[
https://issues.apache.org/jira/browse/TIKA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728992#comment-17728992
]
Nick Burch commented on TIKA-4051:
--
Last time I asked the MPXJ project they weren't interested
[
https://issues.apache.org/jira/browse/TIKA-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725561#comment-17725561
]
Nick Burch commented on TIKA-3999:
--
Oh, this brings back memories... good memories :)
Unless we can
[
https://issues.apache.org/jira/browse/TIKA-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724302#comment-17724302
]
Nick Burch commented on TIKA-4045:
--
I guess this could also apply for other row-based formats like SQLite
[
https://issues.apache.org/jira/browse/TIKA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17718674#comment-17718674
]
Nick Burch commented on TIKA-4025:
--
Would a video metadata specification's frame count be a better home
On Mon, 13 Mar 2023, Nicholas DiPiazza wrote:
can we require that the request form for creating a jira account
contains the first issue they would like to create?
You'd need to ask on users@infra about that, it's an ASF wide thing (to
avoid a huge spam problem) and not something our project
[
https://issues.apache.org/jira/browse/TIKA-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693140#comment-17693140
]
Nick Burch commented on TIKA-3981:
--
Is this happening for all executables on your machine, or just some
[
https://issues.apache.org/jira/browse/TIKA-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689199#comment-17689199
]
Nick Burch commented on TIKA-3973:
--
If you only care about container-aware detection for Ogg based
[
https://issues.apache.org/jira/browse/TIKA-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689176#comment-17689176
]
Nick Burch commented on TIKA-3973:
--
For all container formats you want {{tika-parsers}} or {{tika-parsers
[
https://issues.apache.org/jira/browse/TIKA-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689161#comment-17689161
]
Nick Burch edited comment on TIKA-3973 at 2/15/23 2:38 PM:
---
For container-based
[
https://issues.apache.org/jira/browse/TIKA-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689161#comment-17689161
]
Nick Burch commented on TIKA-3973:
--
For container-based detection (such as the Ogg container format), you
[
https://issues.apache.org/jira/browse/TIKA-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17682352#comment-17682352
]
Nick Burch commented on TIKA-3960:
--
If possible, please include a small test file and update
{{tika
[
https://issues.apache.org/jira/browse/TIKA-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677364#comment-17677364
]
Nick Burch commented on TIKA-3703:
--
I guess we could include a data package metadata file to better
[
https://issues.apache.org/jira/browse/TIKA-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677326#comment-17677326
]
Nick Burch commented on TIKA-3703:
--
A zip file gives you compression, and most clients won't accidentally
[
https://issues.apache.org/jira/browse/TIKA-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17675914#comment-17675914
]
Nick Burch commented on TIKA-3955:
--
The Tika App is intended as a "batteries included" stan
[
https://issues.apache.org/jira/browse/TIKA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656060#comment-17656060
]
Nick Burch commented on TIKA-3952:
--
Is the PDF a scan? Are you doing OCR?
> Content misma
[
https://issues.apache.org/jira/browse/TIKA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656049#comment-17656049
]
Nick Burch commented on TIKA-3952:
--
Can you try following the steps in
[https://cwiki.apache.org
[
https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17627638#comment-17627638
]
Nick Burch commented on TIKA-2536:
--
We can only depend on versions in maven central, we can't depend
[
https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620633#comment-17620633
]
Nick Burch commented on TIKA-3890:
--
DOCX files are compressed XML. Text compresses very well. Already
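How well markup compresses can be seen with a rough sketch. It uses zlib (Deflate) directly rather than a real DOCX package, and the sample markup is made up for illustration, so the numbers only indicate the general effect.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means better compression)."""
    return len(zlib.compress(data)) / len(data)

# Made-up, highly repetitive markup standing in for a document's XML part.
sample_xml = b"<w:p><w:r><w:t>Hello world</w:t></w:r></w:p>" * 1000
# Random bytes of the same length, for contrast.
random_bytes = os.urandom(len(sample_xml))

print(f"xml:    {compression_ratio(sample_xml):.4f}")
print(f"random: {compression_ratio(random_bytes):.4f}")
```

The repetitive markup shrinks to a small fraction of its size while the random bytes barely shrink at all, which is why a small DOCX on disk can expand to a much larger amount of XML in memory.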
[
https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620610#comment-17620610
]
Nick Burch commented on TIKA-3890:
--
The only way to be sure of how many pages are in a Word document
On Thu, 6 Oct 2022, Tim Allison wrote:
Happy to chat. Please put them in touch.
Excellent, thanks Tim!
Other than your past talks, have we got any info (eg on the wiki?) about
how to run the regression corpus?
I've been really impressed with what the POI team has done migrating
from ant
On Wed, 5 Oct 2022, Nicholas DiPiazza wrote:
Are they offering the Gradle Build Cache stuff free for apache projects?
There's an announcement at ApacheCon in about an hour... I think the Infra
team are still working out the details on how it'll all work.
However, there's an additional offer
On Wed, 5 Oct 2022, Oleg Tikhonov wrote:
Honestly, I am trying to port our project to Gradle, but it is not going well.
It is a good idea. If some folks can help, we can do it together.
Apparently Gradle Enterprise works with both Gradle and Maven! So we don't
even have to change our build -
Hi All
At ApacheCon this week, Bob and myself ended up chatting with the folks
from Gradle, who are keen to help ASF projects, and are discussing with
the Infra team.
The easier bit - they think they might be able to help speed up our maven
build, especially the running of tests. Anyone
On Sat, 24 Sep 2022, Tim Allison wrote:
Electron and which framework?
I'd say there's two choice mechanisms.
One is to pick whatever most excites you / is likely to look best on your
next funding application, and say that since you're doing most of the
initial work you can choose!
The
On Sat, 24 Sep 2022, Tim Allison wrote:
Given that this is greenfields, should I start w javafx or stick w swing
or is there another framework I should try?
Give the Tika Server an optional snazzy web UI, then wrap it as an
electron app for people who want a native program to start? (plus
On Thu, 15 Sep 2022, Sindhu Mahadevappa wrote:
We have been looking for the latest Tika 2.4.1 jar file, looks like it
is not available anywhere.
You can get the Tika App and Tika Server jars for 2.4.1 from
https://tika.apache.org/download.html
For the core and parser jars, manually
[
https://issues.apache.org/jira/browse/TIKA-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603483#comment-17603483
]
Nick Burch commented on TIKA-3850:
--
The kind of statistical language model used in Tika struggles
[
https://issues.apache.org/jira/browse/TIKA-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603038#comment-17603038
]
Nick Burch commented on TIKA-3308:
--
Our HTML mime type has both root-XML tags for well-formed documents
On Fri, 9 Sep 2022, Sindhu Mahadevappa wrote:
We are using tika-parsers 1.23
Tika 1.23 was released in December 2019! You should really use something
much more recent
for comparing uploaded file mime type from file name as well as from
file content for security purpose.
Apache Tika's
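The comparison described in the question can be sketched as follows. This does not use Tika; the two lookup tables and all function names are hypothetical and deliberately tiny. A mismatch is a useful hint, but a matching signature does not show that a file is safe.

```python
import os

# Tiny illustrative tables; a real detector knows far more types.
EXTENSION_TYPES = {".pdf": "application/pdf", ".png": "image/png"}
MAGIC_TYPES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
]
UNKNOWN = "application/octet-stream"

def type_from_name(filename: str) -> str:
    """Guess a media type from the file extension only."""
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_TYPES.get(extension, UNKNOWN)

def type_from_content(data: bytes) -> str:
    """Guess a media type from the leading bytes only."""
    for magic, media_type in MAGIC_TYPES:
        if data.startswith(magic):
            return media_type
    return UNKNOWN

def name_matches_content(filename: str, data: bytes) -> bool:
    """True when both guesses agree."""
    return type_from_name(filename) == type_from_content(data)

print(name_matches_content("report.pdf", b"%PDF-1.7\n"))
print(name_matches_content("photo.png", b"%PDF-1.7\n"))
```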
[
https://issues.apache.org/jira/browse/TIKA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575814#comment-17575814
]
Nick Burch commented on TIKA-3832:
--
Any chance you could try with Apache PDFBox directly? They've got
[
https://issues.apache.org/jira/browse/TIKA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Burch resolved TIKA-3830.
--
Resolution: Duplicate
> Kaspersky identified a file as riskw
[
https://issues.apache.org/jira/browse/TIKA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574656#comment-17574656
]
Nick Burch commented on TIKA-3829:
--
Can you share a file that triggers this bug?
The method in question
[
https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566991#comment-17566991
]
Nick Burch commented on TIKA-3814:
--
I have a feeling that the Text content handler might rely
[
https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Burch updated TIKA-3814:
-
Priority: Trivial (was: Blocker)
> Extracted text from HTML file does not exclude newline chars f
[
https://issues.apache.org/jira/browse/TIKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17562599#comment-17562599
]
Nick Burch commented on TIKA-3811:
--
Maybe [~tallison] has an idea on the config part, he's been working
[
https://issues.apache.org/jira/browse/TIKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17562537#comment-17562537
]
Nick Burch commented on TIKA-3811:
--
You should not be using Apache Tika's detection for anything security
[
https://issues.apache.org/jira/browse/TIKA-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Burch resolved TIKA-3810.
--
Fix Version/s: 2.4.2
Resolution: Fixed
> Vtt file (encoding UTF-8 with BOM) seen as text/pl
[
https://issues.apache.org/jira/browse/TIKA-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17562532#comment-17562532
]
Nick Burch commented on TIKA-3810:
--
Looks like we had detection magic for the UTF16 variant BOMs
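As a rough sketch of BOM-aware detection: the check below skips a leading byte order mark, if any, and then looks for the WEBVTT signature in the matching encoding. It is an illustration in Python, not Tika's mime magic definition, and the function name is an assumption.

```python
import codecs

# (BOM bytes, encoding of the text that follows it)
BOM_ENCODINGS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def looks_like_vtt(data: bytes) -> bool:
    """Return True if the data starts with WEBVTT, with or without a BOM."""
    for bom, encoding in BOM_ENCODINGS:
        if data.startswith(bom):
            # Compare against the signature encoded the same way as the file.
            return data[len(bom):].startswith("WEBVTT".encode(encoding))
    return data.startswith(b"WEBVTT")

print(looks_like_vtt(codecs.BOM_UTF8 + b"WEBVTT\n\n00:00.000 --> 00:01.000\nHi"))
print(looks_like_vtt(b"WEBVTT\n"))
print(looks_like_vtt(codecs.BOM_UTF8 + b"plain text"))
```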
[
https://issues.apache.org/jira/browse/TIKA-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17562484#comment-17562484
]
Nick Burch commented on TIKA-3809:
--
If the uncompressed XML is 250mb, then you're going to need a heap
[
https://issues.apache.org/jira/browse/TIKA-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557343#comment-17557343
]
Nick Burch commented on TIKA-3798:
--
With no file, no thread dump and no stack trace, it won't be easy
[
https://issues.apache.org/jira/browse/TIKA-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557319#comment-17557319
]
Nick Burch commented on TIKA-3798:
--
Do you have a sample file that shows the problem? A thread dump
[
https://issues.apache.org/jira/browse/TIKA-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552078#comment-17552078
]
Nick Burch commented on TIKA-3768:
--
If we can put something into a properly typed + structured metadata
[
https://issues.apache.org/jira/browse/TIKA-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550223#comment-17550223
]
Nick Burch commented on TIKA-3784:
--
We don't currently have any Mime Magic for PKCS12 files
Based
[
https://issues.apache.org/jira/browse/TIKA-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550216#comment-17550216
]
Nick Burch commented on TIKA-3768:
--
I wouldn't expect to find those in the textual content after parsing
[
https://issues.apache.org/jira/browse/TIKA-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539993#comment-17539993
]
Nick Burch commented on TIKA-3771:
--
The PNG magic is priority 50, which is also what our EML min-match 2
[
https://issues.apache.org/jira/browse/TIKA-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539594#comment-17539594
]
Nick Burch commented on TIKA-3710:
--
As a "normal" html file wouldn't start with thes
[
https://issues.apache.org/jira/browse/TIKA-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539582#comment-17539582
]
Nick Burch commented on TIKA-3710:
--
I was thinking we'd do (open)h1(close) or (open)h1(space) to cover
[
https://issues.apache.org/jira/browse/TIKA-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538896#comment-17538896
]
Nick Burch commented on TIKA-3710:
--
The h1 isn't quite as unique as we might like, and maybe not as good
[
https://issues.apache.org/jira/browse/TIKA-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529977#comment-17529977
]
Nick Burch commented on TIKA-3571:
--
Some formats support the concept of pages and we can pass that along
[
https://issues.apache.org/jira/browse/TIKA-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529918#comment-17529918
]
Nick Burch commented on TIKA-3742:
--
Sure! Potentially easiest is if you create your own fork of Tika
[
https://issues.apache.org/jira/browse/TIKA-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529417#comment-17529417
]
Nick Burch commented on TIKA-3742:
--
I believe {{readNBytes}} only came in with Java 9, and the particular
[
https://issues.apache.org/jira/browse/TIKA-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529101#comment-17529101
]
Nick Burch commented on TIKA-3742:
--
Assuming we just want type=17 text elements of a DGNv7 file (as per
[
https://issues.apache.org/jira/browse/TIKA-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529038#comment-17529038
]
Nick Burch commented on TIKA-3742:
--
In theory you shouldn't need any java code at all if you don't want
[
https://issues.apache.org/jira/browse/TIKA-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529029#comment-17529029
]
Nick Burch commented on TIKA-3742:
--
If it can just be run standalone and then {{ExternalParser
[
https://issues.apache.org/jira/browse/TIKA-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528157#comment-17528157
]
Nick Burch commented on TIKA-3731:
--
We already do a prefix for several other formats for custom metadata
[
https://issues.apache.org/jira/browse/TIKA-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527158#comment-17527158
]
Nick Burch commented on TIKA-3719:
--
Linux and Mac will need quotes around arguments containing spaces
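The quoting requirement can be illustrated with Python's shlex module, which follows POSIX shell rules. The jar name and the path below are hypothetical; only the argument containing a space ends up quoted.

```python
import shlex

# Hypothetical command line; the path deliberately contains a space.
args = ["java", "-jar", "tika-server.jar",
        "--config", "/opt/my configs/tika-config.xml"]

# Build a single string that is safe to paste into a POSIX shell.
command = shlex.join(args)
print(command)

# Splitting the quoted string again gives back the original arguments.
round_trip = shlex.split(command)
print(round_trip == args)

# Without quoting, the shell would see one extra argument.
unquoted = shlex.split(" ".join(args))
print(len(args), len(unquoted))
```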
[
https://issues.apache.org/jira/browse/TIKA-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526776#comment-17526776
]
Nick Burch commented on TIKA-3721:
--
We already have a few file types which we send to {{OfficeParser
[
https://issues.apache.org/jira/browse/TIKA-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526352#comment-17526352
]
Nick Burch commented on TIKA-3721:
--
The mime types mentioned at
[https://communities.bentley.com
[
https://issues.apache.org/jira/browse/TIKA-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526336#comment-17526336
]
Nick Burch commented on TIKA-3721:
--
We've had the OK from the author of the tika-dgn-detector
I'd
, so if it's a problem, feel free to change
or ignore.
Cheers
On Fri, 22 Apr 2022 at 11:57, Nick Burch wrote:
Hi Steven
Over on https://issues.apache.org/jira/browse/TIKA-3721, one of our users
alerted us to your tika-dgn-detector github project.
If possible, we'd like to fold the detector logic
Hi Steven
Over on https://issues.apache.org/jira/browse/TIKA-3721, one of our users
alerted us to your tika-dgn-detector github project.
If possible, we'd like to fold the detector logic and mime type
definitions into Tika itself. (Converting it to Java in the process and
putting the
[
https://issues.apache.org/jira/browse/TIKA-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526324#comment-17526324
]
Nick Burch commented on TIKA-3721:
--
That detector is written in Kotlin, but should be pretty easy to re
[
https://issues.apache.org/jira/browse/TIKA-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525747#comment-17525747
]
Nick Burch commented on TIKA-3719:
--
Those look like the steps needed. I'd suggest we create ours
[
https://issues.apache.org/jira/browse/TIKA-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525588#comment-17525588
]
Nick Burch commented on TIKA-3725:
--
Something like OAuth would be pretty different to basic auth, due
[
https://issues.apache.org/jira/browse/TIKA-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525578#comment-17525578
]
Nick Burch commented on TIKA-3719:
--
For testing it, I'd be tempted to create a self-signed certificate
[
https://issues.apache.org/jira/browse/TIKA-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17524718#comment-17524718
]
Nick Burch commented on TIKA-3721:
--
After a quick look, I can't spot any free tools or libraries
[
https://issues.apache.org/jira/browse/TIKA-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517818#comment-17517818
]
Nick Burch commented on TIKA-3571:
--
It has been quite a while since I last used jodconverter
[
https://issues.apache.org/jira/browse/TIKA-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516459#comment-17516459
]
Nick Burch commented on TIKA-3711:
--
I'd lean towards putting the file name as an attribute of the img tag
[
https://issues.apache.org/jira/browse/TIKA-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504378#comment-17504378
]
Nick Burch commented on TIKA-3696:
--
Shouldn't it be more like {{application/x-wacz}} since it isn't
[
https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504150#comment-17504150
]
Nick Burch commented on TIKA-3684:
--
Same as Tika 2.x - pass a {{--config}} flag when you start the server
[
https://issues.apache.org/jira/browse/TIKA-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Burch resolved TIKA-3694.
--
Fix Version/s: 2.3.1
Resolution: Fixed
> Tika Server endpoint to return more details on a m
[
https://issues.apache.org/jira/browse/TIKA-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502627#comment-17502627
]
Nick Burch commented on TIKA-3694:
--
I've added new HTML and JSON endpoints {{/mime-types/type/subtype
Nick Burch created TIKA-3694:
Summary: Tika Server endpoint to return more details on a mime type
Key: TIKA-3694
URL: https://issues.apache.org/jira/browse/TIKA-3694
Project: Tika
Issue Type
[
https://issues.apache.org/jira/browse/TIKA-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500804#comment-17500804
]
Nick Burch commented on TIKA-3686:
--
Detecting types of text-based files with magic is always going
[
https://issues.apache.org/jira/browse/TIKA-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489597#comment-17489597
]
Nick Burch commented on TIKA-3676:
--
As long as we provide sensible instructions on what to do, I'm happy
[
https://issues.apache.org/jira/browse/TIKA-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480955#comment-17480955
]
Nick Burch commented on TIKA-3656:
--
That POM is your problem, you aren't including any of the container
[
https://issues.apache.org/jira/browse/TIKA-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17479981#comment-17479981
]
Nick Burch commented on TIKA-3656:
--
How are you calling Tika? And do you have the office parsers on your
[
https://issues.apache.org/jira/browse/TIKA-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475269#comment-17475269
]
Nick Burch commented on TIKA-3646:
--
I think this is probably the same issue as TIKA-2935 - the same work
On Fri, 7 Jan 2022, Josh Burchard wrote:
I wrote to Tim about making a small update to
https://cwiki.apache.org/confluence/display/TIKA/TikaServerEndpointsCompared
and he suggested that I email this dev list to see if someone could grant
me editor access. Is that a possibility?
Can you sign up