On Wed, 8 Oct 2014, Can Duruk wrote:
My question is about mapping the metadata keys coming from the parsers
to my own keys.
For my application, I am using Tika to extract the metadata for a bunch of
files. I am using the embedded HTTP server which I modified for my needs to
return instead
Hi all,
Here is my problem: I have extracted plain text from a series of doc(x)
documents, along with their titles via the dc:title metadata field, but I'm
not sure this is the right way to obtain a document's title. In many cases, a
title inside a document could be of the largest
Perhaps a re-mapping downstream ContentHandler
that takes in the Metadata object and will reformat
the meta name=.. section of the XHTML?
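A rough sketch of what such a re-mapping handler might look like, using only the JDK's SAX classes. The class name MetaNameRemapper and the key map passed to it are hypothetical, and the sketch assumes the metadata shows up as meta elements with name and content attributes in the XHTML event stream:

```java
import java.util.Map;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical decorator: rewrites the name attribute of <meta> elements
// and forwards every other SAX event to the downstream handler unchanged.
class MetaNameRemapper extends XMLFilterImpl {
    private final Map<String, String> keyMap;

    MetaNameRemapper(ContentHandler downstream, Map<String, String> keyMap) {
        setContentHandler(downstream);
        this.keyMap = keyMap;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Accept either the local name or the qualified name, since
        // localName can be empty when the source is not namespace-aware.
        if ("meta".equals(localName) || "meta".equals(qName)) {
            int i = atts.getIndex("name");
            if (i >= 0 && keyMap.containsKey(atts.getValue(i))) {
                // Copy before modifying; the caller may reuse its Attributes object.
                AttributesImpl copy = new AttributesImpl(atts);
                copy.setValue(i, keyMap.get(atts.getValue(i)));
                atts = copy;
            }
        }
        super.startElement(uri, localName, qName, atts);
    }
}
```

The handler would wrap whatever ContentHandler currently serializes the XHTML, so the parsers themselves stay untouched. Keys that are not in the map pass through as they are.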
Chris Mattmann
chris.mattm...@gmail.com
-Original Message-
From: Nick Burch apa...@gagravarr.org
Reply-To:
I'd suggest you do the mapping from Tika keys to your keys in the server.
All the parsers should return consistent keys, so the output side is the
best place to map.
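As an illustration of mapping on the output side, here is a minimal sketch in plain Java. The class MetadataKeyMapper, the entries in KEY_MAP, and the target key names are all made up for the example; in the server, the input map would presumably be filled from the parsed metadata just before the response is written:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: renames parser-produced metadata keys to application keys.
class MetadataKeyMapper {
    // Example mapping only; the source and target key names are assumptions.
    static final Map<String, String> KEY_MAP = Map.of(
            "dc:title", "title",
            "dc:creator", "author",
            "Content-Type", "mimeType");

    // Returns a new map with known keys renamed; unknown keys pass through unchanged.
    // If two source keys map to the same target, the later entry wins.
    static Map<String, List<String>> remap(Map<String, List<String>> parsed) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : parsed.entrySet()) {
            String key = KEY_MAP.getOrDefault(e.getKey(), e.getKey());
            out.put(key, e.getValue());
        }
        return out;
    }
}
```

Keeping the table in one place means the parsers never need to know about the application's naming scheme.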
That seems to be the now-obvious solution, thanks for the suggestion.
I agree with Nick’s recommendation on post-parsing key mapping, and I’d like to
put in a plug for the RecursiveParserWrapper, which may be of use to you.
I’ve been intending to add that to the app command line and to the server… How
are you handling embedded document metadata? Would the wrapper be