bdas for 2 reasons:
>>> - it's useful only for java-clients;
>>> - it could introduce very nasty bugs leading to RCE-class vulnerabilities, so
>>> it's very controversial from a security PoV.
>>>
>> Sure. I was not actually suggesting to use them in Tika nati
wrote:
> On Thu, 28 Sep 2017, Giuseppe Totaro wrote:
>
>> if I am not wrong, currently you cannot configure a specific
>> ContentHandler
>> while using tika-server. I mean that you can configure your own parser [0]
>> but you cannot control which ContentHand
Hi folks,
if I am not wrong, currently you cannot configure a specific ContentHandler
while using tika-server. I mean that you can configure your own parser [0]
but you cannot control which ContentHandler the parser leverages to extract
text and metadata (e.g., you cannot use
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro resolved TIKA-2449.
---
Resolution: Fixed
Fix Version/s: 1.17
> Enabling extraction of standard references f
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro reassigned TIKA-2449:
-
Assignee: Giuseppe Totaro
> Enabling extraction of standard references from t
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
External issue URL: https://github.com/apache/tika/pull/204 (was:
https://github.com
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
Attachment: flowchart_standards_extraction_v02.png
> Enabling extraction of standard referen
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
Attachment: (was: flowchart_standards_extraction_v02.png)
> Enabling extraction of stand
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
Attachment: flowchart_standards_extraction_v02.png
> Enabling extraction of standard referen
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
Attachment: standards_extraction_v02.png
> Enabling extraction of standard references from t
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
Attachment: (was: standards_extraction_v02.png)
> Enabling extraction of standard referen
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
Description:
Apache Tika currently provides many _ContentHandler_ implementations that help to de-obfuscate
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
Description:
Apache Tika currently provides many _ContentHandler_ implementations that help to de-obfuscate
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
Description:
Apache Tika currently provides many _ContentHandler_ implementations that help to de-obfuscate
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
Description:
Apache Tika currently provides many _ContentHandler_ implementations that help to de-obfuscate
[
https://issues.apache.org/jira/browse/TIKA-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-2449:
--
Attachment: standards_extraction.patch
flowchart_standards_extraction.png
Giuseppe Totaro created TIKA-2449:
-
Summary: Enabling extraction of standard references from text
Key: TIKA-2449
URL: https://issues.apache.org/jira/browse/TIKA-2449
Project: Tika
Issue Type
[
https://issues.apache.org/jira/browse/TIKA-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904761#comment-14904761
]
Giuseppe Totaro commented on TIKA-1739:
---
Great suggestion [~gagravarr]. Thanks [~chrismattmann
[
https://issues.apache.org/jira/browse/TIKA-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903123#comment-14903123
]
Giuseppe Totaro commented on TIKA-1739:
---
Hi [~chrismattmann], Hi [~gagravarr],
I looked at the last
[
https://issues.apache.org/jira/browse/TIKA-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643066#comment-14643066
]
Giuseppe Totaro commented on TIKA-1691:
---
Hi [~gagravarr], Hi [~chrismattmann],
did
[
https://issues.apache.org/jira/browse/TIKA-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637930#comment-14637930
]
Giuseppe Totaro commented on TIKA-1691:
---
Hello [~gagravarr],
your feedback is very
[
https://issues.apache.org/jira/browse/TIKA-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1691:
--
Attachment: mapping_example.pdf
Apache Tika for enabling metadata interoperability
Giuseppe Totaro created TIKA-1691:
-
Summary: Apache Tika for enabling metadata interoperability
Key: TIKA-1691
URL: https://issues.apache.org/jira/browse/TIKA-1691
Project: Tika
Issue Type
[
https://issues.apache.org/jira/browse/TIKA-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro resolved TIKA-1654.
---
Resolution: Fixed
Reset cTAKES CAS into CTAKESParser
[
https://issues.apache.org/jira/browse/TIKA-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1654:
--
Fix Version/s: (was: 1.9)
1.10
Reset cTAKES CAS into CTAKESParser
Giuseppe Totaro created TIKA-1654:
-
Summary: Reset cTAKES CAS into CTAKESParser
Key: TIKA-1654
URL: https://issues.apache.org/jira/browse/TIKA-1654
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1654:
--
Fix Version/s: 1.9
Reset cTAKES CAS into CTAKESParser
[
https://issues.apache.org/jira/browse/TIKA-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1654:
--
Attachment: TIKA-1654.patch
Reset cTAKES CAS into CTAKESParser
Hi Chris,
I have tested tika 1.9-rc2. In particular, I checked the new work on
CTAKESParser.
Thank you for your great work.
My vote for this RC is +1.
Thanks,
Giuseppe
On Mon, Jun 8, 2015 at 8:58 AM, Konstantin Gribov gros...@gmail.com wrote:
Hi, Chris.
SHA1 hash and GPG signature are
[
https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1645:
--
Attachment: TIKA-1645.v02.patch
Extraction of biomedical information using CTAKESParser
[
https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572252#comment-14572252
]
Giuseppe Totaro commented on TIKA-1645:
---
Hi [~chrismattmann], thanks for your
Giuseppe Totaro created TIKA-1645:
-
Summary: Extraction of biomedical information using CTAKESParser
Key: TIKA-1645
URL: https://issues.apache.org/jira/browse/TIKA-1645
Project: Tika
Issue
[
https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro reassigned TIKA-1645:
-
Assignee: Giuseppe Totaro
Extraction of biomedical information using CTAKESParser
[
https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1645:
--
Attachment: TIKA-1645.patch
Extraction of biomedical information using CTAKESParser
[
https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1645:
--
Labels: patch (was: )
Extraction of biomedical information using CTAKESParser
[
https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1645:
--
Attachment: CTAKESConfig.properties
tika-config.xml
Extraction of biomedical
[
https://issues.apache.org/jira/browse/TIKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro reassigned TIKA-1642:
-
Assignee: Giuseppe Totaro
Integrate cTAKES into Tika
[
https://issues.apache.org/jira/browse/TIKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563993#comment-14563993
]
Giuseppe Totaro commented on TIKA-1642:
---
Hi [~selina], I believe that is a great idea
:27 PM, David Meikle dmei...@apache.org wrote:
Hello All,
Please welcome Giuseppe Totaro as he joins us as the latest Tika committer
and PMC Member.
He's recently been VOTEd in and now has his account all set up so is ready
to roll!
Giuseppe, please feel free to say a bit about yourself
[
https://issues.apache.org/jira/browse/TIKA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1580:
--
Attachment: TIKA-1580.v03.2.Mattmann.Totaro.03262015.patch
ISA-Tab parsers
[
https://issues.apache.org/jira/browse/TIKA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1580:
--
Attachment: TIKA-1580.v03.Mattmann.Totaro.03262015.patch
Hi all, I uploaded a new patch
({{TIKA
://reviews.apache.org/r/32291/diff/
Testing
---
Tested on sample ISA-Tab files downloaded from
http://www.isa-tools.org/format/examples/.
Thanks,
Giuseppe Totaro
[
https://issues.apache.org/jira/browse/TIKA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1580:
--
Attachment: TIKA-1580.v02.patch
ISA-Tab parsers
---
Key: TIKA
[
https://issues.apache.org/jira/browse/TIKA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376210#comment-14376210
]
Giuseppe Totaro commented on TIKA-1580:
---
Hi [~chrismattmann], I apologize about
[
https://issues.apache.org/jira/browse/TIKA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376210#comment-14376210
]
Giuseppe Totaro edited comment on TIKA-1580 at 3/23/15 5:10 PM
[
https://issues.apache.org/jira/browse/TIKA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370945#comment-14370945
]
Giuseppe Totaro commented on TIKA-1580:
---
The patch has been uploaded for review
[
https://issues.apache.org/jira/browse/TIKA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1580:
--
Summary: ISA-Tab parsers (was: ISA-Tab)
ISA-Tab parsers
---
Key
Giuseppe Totaro created TIKA-1580:
-
Summary: ISA-Tab
Key: TIKA-1580
URL: https://issues.apache.org/jira/browse/TIKA-1580
Project: Tika
Issue Type: New Feature
Components: parser
[
https://issues.apache.org/jira/browse/TIKA-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336144#comment-14336144
]
Giuseppe Totaro commented on TIKA-1483:
---
Hi [~chrismattmann], I don't know why
[
https://issues.apache.org/jira/browse/TIKA-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327027#comment-14327027
]
Giuseppe Totaro edited comment on TIKA-1541 at 2/19/15 5:57 AM
[
https://issues.apache.org/jira/browse/TIKA-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1541:
--
Attachment: TIKA-1541.v02.02182015.patch
Hi all,
I added more unit tests, especially
[
https://issues.apache.org/jira/browse/TIKA-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326875#comment-14326875
]
Giuseppe Totaro commented on TIKA-1483:
---
Thanks [~lfcnassif].
I agree with you about
[
https://issues.apache.org/jira/browse/TIKA-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313796#comment-14313796
]
Giuseppe Totaro commented on TIKA-1541:
---
[~chrismattmann], probably it depends
Giuseppe Totaro created TIKA-1541:
-
Summary: StringsParser: a simple strings-based parser for Tika
Key: TIKA-1541
URL: https://issues.apache.org/jira/browse/TIKA-1541
Project: Tika
Issue
[
https://issues.apache.org/jira/browse/TIKA-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1541:
--
Description:
I thought of writing an extremely simple implementation of {{StringsParser
[
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297612#comment-14297612
]
Giuseppe Totaro commented on TIKA-1423:
---
[~lewismc] your patch matches perfectly
[
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291207#comment-14291207
]
Giuseppe Totaro commented on TIKA-1423:
---
Hello [~vinegh], I noted in your parser
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892116#comment-13892116
]
Giuseppe Totaro commented on TIKA-1184:
---
Hello,
I've just run the tika-app-1.4.jar
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892116#comment-13892116
]
Giuseppe Totaro edited comment on TIKA-1184 at 2/5/14 1:52 PM
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1184:
--
Comment: was deleted
(was: Hello,
I've just run the tika-app-1.4.jar against files extracted
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892122#comment-13892122
]
Giuseppe Totaro commented on TIKA-1184:
---
Hello,
I've just run the tika-app-1.4.jar
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1184:
--
Attachment: ansi.sys
Infinite halt on parsing old files (e.g. mp3, ms-dos drivers
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1184:
--
Attachment: ansi.sys
Infinite halt on parsing old files (e.g. mp3, ms-dos drivers
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1184:
--
Comment: was deleted
(was: Hello,
I've just run the tika-app-1.4.jar against files extracted
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1184:
--
Attachment: ansi.sys
Hello,
I've just run the tika-app-1.4.jar against files extracted from
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1184:
--
Attachment: (was: ansi.sys)
Infinite halt on parsing old files (e.g. mp3, ms-dos drivers
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1184:
--
Attachment: (was: ansi.sys)
Infinite halt on parsing old files (e.g. mp3, ms-dos drivers
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1184:
--
Attachment: (was: ansi.sys)
Infinite halt on parsing old files (e.g. mp3, ms-dos drivers
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892124#comment-13892124
]
Giuseppe Totaro commented on TIKA-1184:
---
Hello,
I've just run the tika-app-1.4.jar
[
https://issues.apache.org/jira/browse/TIKA-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Giuseppe Totaro updated TIKA-1184:
--
Attachment: ansi.sys
Infinite halt on parsing old files (e.g. mp3, ms-dos drivers
[
https://issues.apache.org/jira/browse/TIKA-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619639#comment-13619639
]
Giuseppe Totaro commented on TIKA-1092:
---
Thanks Nick. I'll give you feedback as soon
[
https://issues.apache.org/jira/browse/TIKA-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602168#comment-13602168
]
Giuseppe Totaro commented on TIKA-1092:
---
Hi Nick,
thanks for your support.
I'll send
[
https://issues.apache.org/jira/browse/TIKA-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601195#comment-13601195
]
Giuseppe Totaro commented on TIKA-1092:
---
Hi Nick,
most files were created in 1992
[
https://issues.apache.org/jira/browse/TIKA-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600065#comment-13600065
]
Giuseppe Totaro commented on TIKA-1092:
---
Hi Nick,
I agree with your first
Giuseppe Totaro created TIKA-1092:
-
Summary: Parsing of old Word file causes a TikaException
Key: TIKA-1092
URL: https://issues.apache.org/jira/browse/TIKA-1092
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575242#comment-13575242
]
Giuseppe Totaro commented on TIKA-1081:
---
Thanks Chris :)