Yes. Since this approach has the potential to set a precedent for representing structured metadata going forward, I wanted to see what others thought before committing directly.

Regards,

Ray

On July 21, 2014 at 9:45:31 PM, Mattmann, Chris A (3980) ([email protected]) wrote:
> Are you able to contribute to tika ?
>
> Sent from my iPhone
>
> > On Jul 21, 2014, at 6:43 PM, "Ray Gauss" wrote:
> >
> > Hi all,
> >
> > This is a few months old but I've been looking at this recently and since
> > we're unlikely to move to a structured metadata store in the short term
> > I've come up with what I think is an interim solution [1] that essentially
> > allows nesting through XPath-like syntax:
> >
> > stream[0]/field1=someValue
> > stream[0]/field2=otherValue
> > stream[1]/field1=yetAnother
> > stream[1]/field2=andSoOn
> >
> > In this case the PBCore metadata standard was used so the terminology is
> > 'essenceTracks' rather than stream and the parser is an ExternalParser
> > configured for FFmpeg rather than pure Java.
> >
> > If that approach seems reasonable we could move things into the main code
> > base at some point.
> >
> > Regards,
> >
> > Ray
> >
> > [1] https://github.com/AlfrescoLabs/tika-ffmpeg
> >
> >> On March 28, 2014 at 7:00:31 AM, Nick Burch ([email protected]) wrote:
> >>> On Fri, 28 Mar 2014, Konstantin Gribov wrote:
> >>> I think you should have three info blocks: video streams, audio streams
> >>> and subtitles (if container supports their embedding). Sort naturally or
> >>> by vid/aid/sid if present.
> >>
> >> That's not something Tika supports though. We have a metadata object we
> >> can populate with some things, or we can trigger for embedded objects.
> >> The Metadata object doesn't support nesting
> >>
> >> Nick
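
For anyone skimming the thread, here is a rough sketch of how the XPath-like keys quoted above could sit in a flat key/value store. It is only an illustration: it uses a plain java.util.Map rather than Tika's Metadata class, and the class and method names (NestedKeys, key, group, sample) are made up for this example and are not taken from the tika-ffmpeg project.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not code from Tika or tika-ffmpeg.
final class NestedKeys {

    // Builds a flat key such as "stream[0]/field1".
    static String key(String group, int index, String field) {
        return group + "[" + index + "]/" + field;
    }

    // Collects the fields of one indexed group back out of the flat map,
    // stripping the "group[index]/" prefix from each key.
    static Map<String, String> group(Map<String, String> flat, String group, int index) {
        String prefix = group + "[" + index + "]/";
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : flat.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                out.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return out;
    }

    // The four example entries from the thread, stored as flat keys.
    static Map<String, String> sample() {
        Map<String, String> flat = new LinkedHashMap<>();
        flat.put(key("stream", 0, "field1"), "someValue");
        flat.put(key("stream", 0, "field2"), "otherValue");
        flat.put(key("stream", 1, "field1"), "yetAnother");
        flat.put(key("stream", 1, "field2"), "andSoOn");
        return flat;
    }
}
```

Calling NestedKeys.group(NestedKeys.sample(), "stream", 1) returns just the two fields of the second stream, which is the kind of grouping a consumer would need to do on top of a non-nesting metadata object.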
