Hi Pavan,

You can apply xdmp:document-filter to many binary formats, including mp3 and 
mp4. It extracts metadata such as file size and content MIME type, as well as, 
for instance, document properties from office documents and EXIF tags from 
images. It also attempts to extract the actual text, but that only works if 
the text is stored in the file in machine-readable form. Text contained 
inside images or video streams, for example, will not be captured. This 
includes images embedded in office docs, image-only PDFs, and also captions 
and subtitles on images and videos. You would need an OCR kind of solution 
for that.
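
As a minimal sketch of what that looks like in practice, assuming the file is 
already loaded as a binary document (the URI below is hypothetical), you could 
run something like this in Query Console to see which metadata the filter 
finds for an mp3:

```xquery
xquery version "1.0-ml";

declare namespace xhtml = "http://www.w3.org/1999/xhtml";

(: Hypothetical URI of an mp3 that is already loaded as a binary document. :)
let $uri := "/media/example.mp3"

(: xdmp:document-filter returns XHTML: extracted metadata ends up in
   <meta> elements, and any machine-readable text ends up in the body. :)
let $filtered := xdmp:document-filter(fn:doc($uri))

(: Return only the metadata elements to inspect what was extracted. :)
return $filtered//xhtml:meta
```

What you do with that output is up to you. One option is to store it next to 
the binary (as a separate document or as properties) so that searches can 
match on it; whether you do that through CPF or your own ingest code is a 
design choice, not something this sketch prescribes.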

Kind regards,
Geert

From: general-boun...@developer.marklogic.com on behalf of GUPTA Pavan 
<pavan.gu...@soprasteria.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, July 20, 2017 at 9:19 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

Hi Team,

I am trying to ingest .mp4 and .mp3 files and make them searchable. I 
understand that these files are treated as binary files.

I have also seen how to make binary files searchable, and I have done this for 
.doc, .ppt, .pdf, etc., but could not do it for .mp4 or .mp3.

Actually I want to make the files searchable.

Can you please tell me how to achieve this, and whether I need to enable or 
set up a content processing framework for it?

Thanks in advance!


Regards,
Pavan
