[
https://issues.apache.org/jira/browse/TIKA-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza closed TIKA-4262.
---
Assignee: Nicholas DiPiazza
Resolution: Invalid
never mind - this was an issue in my
[
https://issues.apache.org/jira/browse/TIKA-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-4262:
Description:
tika configuration when saving a fetcher with a list of strings will look like
Nicholas DiPiazza created TIKA-4262:
---
Summary: In pipes XML config, List serializes incorrect
causing the parameters to be empty when read
Key: TIKA-4262
URL: https://issues.apache.org/jira/browse/TIKA-4262
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848960#comment-17848960
]
Nicholas DiPiazza commented on TIKA-4243:
-
Sure that sounds good. When we chat later
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845083#comment-17845083
]
Nicholas DiPiazza commented on TIKA-4252:
-
even better
> PipesClient#process - seems to lose the
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845080#comment-17845080
]
Nicholas DiPiazza commented on TIKA-4252:
-
Maybe
fetchInputMetadata
outputMetadata
>
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845071#comment-17845071
]
Nicholas DiPiazza commented on TIKA-4252:
-
sure I can do that.
> PipesClient#process - seems to
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845071#comment-17845071
]
Nicholas DiPiazza edited comment on TIKA-4252 at 5/9/24 5:08 PM:
-
sure I
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845061#comment-17845061
]
Nicholas DiPiazza edited comment on TIKA-4252 at 5/9/24 4:50 PM:
-
What I
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845061#comment-17845061
]
Nicholas DiPiazza commented on TIKA-4252:
-
What I need is to be able to send "Fetch Metadata" such
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza closed TIKA-4252.
---
Fix Version/s: 3.0.0
Resolution: Fixed
> PipesClient#process - seems to lose the Fetch
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845010#comment-17845010
]
Nicholas DiPiazza commented on TIKA-4252:
-
done
> PipesClient#process - seems to lose the Fetch
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-4252:
Description:
when calling:
PipesResult pipesResult = pipesClient.process(new
Nicholas DiPiazza created TIKA-4252:
---
Summary: PipesClient#process - seems to lose the Fetch input
metadata?
Key: TIKA-4252
URL: https://issues.apache.org/jira/browse/TIKA-4252
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842622#comment-17842622
]
Nicholas DiPiazza commented on TIKA-4243:
-
Kinda seems like it might belong in tika-config module
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842622#comment-17842622
]
Nicholas DiPiazza edited comment on TIKA-4243 at 5/1/24 12:34 PM:
--
Kinda
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842158#comment-17842158
]
Nicholas DiPiazza edited comment on TIKA-4243 at 4/29/24 8:56 PM:
--
this
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842158#comment-17842158
]
Nicholas DiPiazza commented on TIKA-4243:
-
this seems like a major feature thing so i would
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842157#comment-17842157
]
Nicholas DiPiazza commented on TIKA-4243:
-
[https://github.com/joelittlejohn/jsonschema2pojo
Nicholas DiPiazza created TIKA-4247:
---
Summary: HttpFetcher - add ability to send request headers
Key: TIKA-4247
URL: https://issues.apache.org/jira/browse/TIKA-4247
Project: Tika
Issue
Nicholas DiPiazza created TIKA-4243:
---
Summary: tika configuration overhaul
Key: TIKA-4243
URL: https://issues.apache.org/jira/browse/TIKA-4243
Project: Tika
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-4243:
Description:
In 3.0.0 when dealing with Tika, it would greatly help to have a Typed
Nicholas DiPiazza created TIKA-4237:
---
Summary: Add JWT authentication ability to the http fetcher
Key: TIKA-4237
URL: https://issues.apache.org/jira/browse/TIKA-4237
Project: Tika
Issue
Nicholas DiPiazza created TIKA-4229:
---
Summary: add microsoft graph fetcher
Key: TIKA-4229
URL: https://issues.apache.org/jira/browse/TIKA-4229
Project: Tika
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-4181:
Attachment: image-2024-02-06-07-54-50-116.png
> Grpc + Tika Pipes - pipe iterator and
[
https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-4181:
Description:
Add full tika-pipes support of grpc
* pipe iterator
* fetcher
* emitter
[
https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805762#comment-17805762
]
Nicholas DiPiazza edited comment on TIKA-4181 at 1/11/24 6:25 PM:
--
Tika
[
https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805762#comment-17805762
]
Nicholas DiPiazza commented on TIKA-4181:
-
Tika pipes could get a full fledged service that could
[
https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-4181:
Description:
Add full tika-pipes support of grpc
* pipe iterator
* fetcher
* emitter
Nicholas DiPiazza created TIKA-4181:
---
Summary: Grpc + Tika Pipes - pipe iterator and emitter
Key: TIKA-4181
URL: https://issues.apache.org/jira/browse/TIKA-4181
Project: Tika
Issue Type:
[
https://issues.apache.org/jira/browse/TIKA-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-3979:
Attachment: image-2023-02-25-12-01-40-311.png
> OneNoteParser - Improve performance for
[
https://issues.apache.org/jira/browse/TIKA-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693512#comment-17693512
]
Nicholas DiPiazza commented on TIKA-3979:
-
old and new appear to be the same binary equivalent
[
https://issues.apache.org/jira/browse/TIKA-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692989#comment-17692989
]
Nicholas DiPiazza commented on TIKA-3970:
-
So on Windows PC I log into
[
https://issues.apache.org/jira/browse/TIKA-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692984#comment-17692984
]
Nicholas DiPiazza commented on TIKA-3970:
-
> Should we reverse the iteration order of the pages? I
Nicholas DiPiazza created TIKA-3881:
---
Summary: fix testAttachingADebuggerOnTheForkedParserShouldWork
test - do not use hard coded port
Key: TIKA-3881
URL: https://issues.apache.org/jira/browse/TIKA-3881
[
https://issues.apache.org/jira/browse/TIKA-3879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza resolved TIKA-3879.
-
Resolution: Implemented
> add test containers test for s3 fetcher, emitter and pipe
Nicholas DiPiazza created TIKA-3879:
---
Summary: add test containers test for s3 fetcher, emitter and pipe
iterators
Key: TIKA-3879
URL: https://issues.apache.org/jira/browse/TIKA-3879
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601463#comment-17601463
]
Nicholas DiPiazza commented on TIKA-3835:
-
Yeah, quickly realizing in my case, because I have Solr
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578666#comment-17578666
]
Nicholas DiPiazza edited comment on TIKA-3835 at 8/11/22 8:53 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578666#comment-17578666
]
Nicholas DiPiazza edited comment on TIKA-3835 at 8/11/22 8:52 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578666#comment-17578666
]
Nicholas DiPiazza commented on TIKA-3835:
-
[~tallison] I was wondering the same thing. For now just
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-3835:
Description:
Tika pipes should have an optional configuration to archive parsed results.
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578591#comment-17578591
]
Nicholas DiPiazza commented on TIKA-3835:
-
i added a bunch more edits. done. ha sorry if that
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-3835:
Description:
Tika pipes should have an optional configuration to archive parsed results.
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578583#comment-17578583
]
Nicholas DiPiazza edited comment on TIKA-3835 at 8/11/22 5:37 PM:
--
Yes
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578583#comment-17578583
]
Nicholas DiPiazza commented on TIKA-3835:
-
Yes good point. I didn't point out some important
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-3835:
Description:
Tika pipes should have an optional configuration to archive parsed results.
[
https://issues.apache.org/jira/browse/TIKA-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-3835:
Summary: tika pipes parse cache - avoid re-parsing content that has not
changed (was:
Nicholas DiPiazza created TIKA-3835:
---
Summary: parse cache - avoid re-parsing content that has not
changed
Key: TIKA-3835
URL: https://issues.apache.org/jira/browse/TIKA-3835
Project: Tika
Nicholas DiPiazza created TIKA-3821:
---
Summary: Pulsar Tika Pipes Support
Key: TIKA-3821
URL: https://issues.apache.org/jira/browse/TIKA-3821
Project: Tika
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/TIKA-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-3821:
Description:
add pulsar support to tika pipes:
* pulsar pipe iterator
* pulsar emitter
Nicholas DiPiazza created TIKA-3820:
---
Summary: Kafka Tika Pipes Support
Key: TIKA-3820
URL: https://issues.apache.org/jira/browse/TIKA-3820
Project: Tika
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/TIKA-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526632#comment-17526632
]
Nicholas DiPiazza edited comment on TIKA-3725 at 4/22/22 7:03 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526632#comment-17526632
]
Nicholas DiPiazza commented on TIKA-3725:
-
[~tallison] in my case I have a bunch of other
[
https://issues.apache.org/jira/browse/TIKA-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526557#comment-17526557
]
Nicholas DiPiazza commented on TIKA-3725:
-
I am a couple weeks out of needing this too, and I'll
[
https://issues.apache.org/jira/browse/TIKA-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480514#comment-17480514
]
Nicholas DiPiazza commented on TIKA-3659:
-
I will need to add a `smbj` client for SMB2/3 and
[
https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17455997#comment-17455997
]
Nicholas DiPiazza commented on TIKA-3446:
-
[~tallison] Do I need to do anything to make sure this
[
https://issues.apache.org/jira/browse/TIKA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421848#comment-17421848
]
Nicholas DiPiazza edited comment on TIKA-3561 at 9/29/21, 1:06 AM:
---
Tika
[
https://issues.apache.org/jira/browse/TIKA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421848#comment-17421848
]
Nicholas DiPiazza edited comment on TIKA-3561 at 9/29/21, 1:01 AM:
---
Tika
[
https://issues.apache.org/jira/browse/TIKA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421848#comment-17421848
]
Nicholas DiPiazza commented on TIKA-3561:
-
Tika needs a lot of memory to parse a nested file like
[
https://issues.apache.org/jira/browse/TIKA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-3561:
Attachment: out.tar.gz
> Tika throwing java.lang.OutOfMemoryError
>
[
https://issues.apache.org/jira/browse/TIKA-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386791#comment-17386791
]
Nicholas DiPiazza edited comment on TIKA-3495 at 7/24/21, 11:56 PM:
[
https://issues.apache.org/jira/browse/TIKA-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386791#comment-17386791
]
Nicholas DiPiazza edited comment on TIKA-3495 at 7/24/21, 11:55 PM:
[
https://issues.apache.org/jira/browse/TIKA-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386791#comment-17386791
]
Nicholas DiPiazza commented on TIKA-3495:
-
[~tallison] i created a PR adding nested document use
Nicholas DiPiazza created TIKA-3455:
---
Summary: Create new tika pipes integration test that uses the rest
api
Key: TIKA-3455
URL: https://issues.apache.org/jira/browse/TIKA-3455
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza reassigned TIKA-3455:
---
Assignee: Nicholas DiPiazza
> Create new tika pipes integration test that uses the
[
https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370146#comment-17370146
]
Nicholas DiPiazza commented on TIKA-3446:
-
Talked to Microsoft open docs people and they informed
[
https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-3446:
Description:
While doing some parsing of OneNote documents, I was investigating a slew of
Nicholas DiPiazza created TIKA-3446:
---
Summary: OneNote - look into adding support for OneNote 365
documents
Key: TIKA-3446
URL: https://issues.apache.org/jira/browse/TIKA-3446
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17360174#comment-17360174
]
Nicholas DiPiazza commented on TIKA-3441:
-
No we have not seen this. We do not have a huge amount
[
https://issues.apache.org/jira/browse/TIKA-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302739#comment-17302739
]
Nicholas DiPiazza commented on TIKA-3324:
-
Wow you move fast. that's awesome. This will be super
[
https://issues.apache.org/jira/browse/TIKA-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302650#comment-17302650
]
Nicholas DiPiazza edited comment on TIKA-3324 at 3/16/21, 4:05 PM:
---
[
https://issues.apache.org/jira/browse/TIKA-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302650#comment-17302650
]
Nicholas DiPiazza commented on TIKA-3324:
-
[~tallison] can you attach your intellij project config
Nicholas DiPiazza created TIKA-3317:
---
Summary: Tika Pipes - add a solr fetch iterator
Key: TIKA-3317
URL: https://issues.apache.org/jira/browse/TIKA-3317
Project: Tika
Issue Type:
[
https://issues.apache.org/jira/browse/TIKA-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289960#comment-17289960
]
Nicholas DiPiazza commented on TIKA-3305:
-
ok thanks! just making sure.
> How do you handle PDFs
[
https://issues.apache.org/jira/browse/TIKA-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza closed TIKA-3305.
---
Resolution: Won't Fix
> How do you handle PDFs with custom encoding?
>
Nicholas DiPiazza created TIKA-3305:
---
Summary: How do you handle PDFs with custom encoding?
Key: TIKA-3305
URL: https://issues.apache.org/jira/browse/TIKA-3305
Project: Tika
Issue Type:
[
https://issues.apache.org/jira/browse/TIKA-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-3305:
Attachment: custom-encoding.pdf
> How do you handle PDFs with custom encoding?
>
[
https://issues.apache.org/jira/browse/TIKA-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas DiPiazza updated TIKA-3305:
Description:
How do you parse a PDF with custom encoding? When I parse it I get garbage
[
https://issues.apache.org/jira/browse/TIKA-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280295#comment-17280295
]
Nicholas DiPiazza commented on TIKA-3294:
-
[~tallison] definitely!
> Usage of "ECB" mode for
[
https://issues.apache.org/jira/browse/TIKA-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278307#comment-17278307
]
Nicholas DiPiazza commented on TIKA-3282:
-
That is correct. Sorry I'm late to the party. I emailed
[
https://issues.apache.org/jira/browse/TIKA-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17265253#comment-17265253
]
Nicholas DiPiazza commented on TIKA-3226:
-
Want me to add the http one?
> Add custom connector
[
https://issues.apache.org/jira/browse/TIKA-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17265252#comment-17265252
]
Nicholas DiPiazza commented on TIKA-3226:
-
[~tallison] so far so good!
> Add custom connector
[
https://issues.apache.org/jira/browse/TIKA-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261557#comment-17261557
]
Nicholas DiPiazza commented on TIKA-1735:
-
Here is the spec:
[
https://issues.apache.org/jira/browse/TIKA-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17259808#comment-17259808
]
Nicholas DiPiazza edited comment on TIKA-3258 at 1/6/21, 3:33 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17259808#comment-17259808
]
Nicholas DiPiazza edited comment on TIKA-3258 at 1/6/21, 3:32 PM:
--