Thanks Julian and Thomas,

* Yes, I know https://oakutils.appspot.com/generate/index, but it didn't help me accomplish my task.
* I've already posted the same question on Stack Overflow: https://stackoverflow.com/questions/78973742/indexing-a-binary-and-searching-with-contains-cannot-find-results

I think, and hope, that a JUnit test or something similar exists in the Oak repository that covers a scenario like mine, which I think is fairly typical: binary data (text/plain or application/pdf) that is indexed, and a full-text search with contains that finds it.

Is anyone able to find it and give me a link, please?

Best regards,
Raffaele Gambelli

________________________________
From: Thomas Mueller <muel...@adobe.com.INVALID>
Sent: Thursday, September 12, 2024 9:19 AM
To: users@jackrabbit.apache.org <users@jackrabbit.apache.org>
Subject: Re: Indexing a binary and searching with contains, help request

Hi,

I'm not sure if you are aware of the following; it might help:

https://oakutils.appspot.com/generate/index
https://www.aemstuff.com/blogs/feb/aemindexcheatsheat.html
https://experienceleague.adobe.com/docs/experience-manager-65/assets/JCR_query_cheatsheet-v1.1.pdf

These were written for the Adobe AEM product, but I find them useful even outside of AEM.

And here is an example index definition:

    {
      "/oak:index/acmeAsset-1": {
        "compatVersion": 2,
        "type": "lucene",
        "tags": ["asset"],
        "async": ["async", "nrt"],
        "includedPaths": ["/content/dam"],
        "jcr:primaryType": "oak:QueryIndexDefinition",
        "evaluatePathRestrictions": true,
        "maxFieldLength": 100000,
        "aggregates": {
          "jcr:primaryType": "nt:unstructured",
          "dam:Asset": {
            "jcr:primaryType": "nt:unstructured",
            "include0": {
              "path": "jcr:content",
              "jcr:primaryType": "nt:unstructured"
            },
            "include1": {
              "path": "jcr:content/metadata",
              "jcr:primaryType": "nt:unstructured"
            },
            "include2": {
              "path": "jcr:content/metadata/*",
              "jcr:primaryType": "nt:unstructured"
            },
            "include3": {
              "path": "jcr:content/renditions",
              "jcr:primaryType": "nt:unstructured"
            },
            "include4": {
              "path": "jcr:content/renditions/original",
              "jcr:primaryType": "nt:unstructured"
            },
            "include5": {
              "path": "jcr:content/renditions/original/jcr:content",
              "jcr:primaryType": "nt:unstructured"
            },
            "include6": {
              "path": "jcr:content/comments",
              "jcr:primaryType": "nt:unstructured"
            },
            "include7": {
              "path": "jcr:content/comments/*",
              "jcr:primaryType": "nt:unstructured"
            },
            "include8": {
              "path": "jcr:content/data/master",
              "jcr:primaryType": "nt:unstructured"
            },
            "include9": {
              "path": "jcr:content/usages",
              "jcr:primaryType": "nt:unstructured"
            },
            "include10": {
              "path": "jcr:content/renditions/text.txt/jcr:content",
              "jcr:primaryType": "nt:unstructured"
            }
          }
        },
        "facets": {
          "jcr:primaryType": "nt:unstructured",
          "topChildren": "100",
          "secure": "insecure"
        },
        "indexRules": {
          "jcr:primaryType": "nt:unstructured",
          "dam:Asset": {
            "jcr:primaryType": "nt:unstructured",
            "properties": {
              "jcr:primaryType": "nt:unstructured",
              "jcrLastModified": {
                "ordered": true,
                "name": "jcr:content/jcr:lastModified",
                "propertyIndex": true,
                "jcr:primaryType": "nt:unstructured",
                "type": "Date"
              },
              "jcrTitle": {
                "useInSpellcheck": true,
                "useInSuggest": true,
                "nodeScopeIndex": true,
                "name": "jcr:content/jcr:title",
                "propertyIndex": true,
                "boost": 2.0,
                "jcr:primaryType": "nt:unstructured"
              },
              "jcrDescription": {
                "nodeScopeIndex": true,
                "useInSpellcheck": true,
                "name": "jcr:content/jcr:description",
                "propertyIndex": true,
                "jcr:primaryType": "nt:unstructured",
                "useInSuggest": true
              },
              "jcrCreated": {
                "ordered": true,
                "name": "jcr:created",
                "propertyIndex": true,
                "jcr:primaryType": "nt:unstructured",
                "type": "Date"
              },
              "nodeName": {
                "nodeScopeIndex": true,
                "name": ":nodeName",
                "jcr:primaryType": "nt:unstructured",
                "useInSuggest": true
              }
            }
          }
        }
      }
    }

I wonder if nowadays you would get more answers on stackoverflow.com? I'm not sure...

Regards,
Thomas
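
For the scenario raised in this thread (an nt:file node whose jcr:content child holds a text/plain or application/pdf binary), the AEM-oriented definition above could probably be cut down to something like the sketch below. It is untested and rests on several assumptions: the index name fileFulltext is hypothetical, the files are assumed to be plain nt:file nodes with the binary stored in jcr:content/jcr:data, and text extraction for PDF is assumed to be available at indexing time. The property names (aggregates, include0, indexRules, isRegexp, nodeScopeIndex) follow the Oak Lucene index documentation.

```json
{
  "/oak:index/fileFulltext": {
    "jcr:primaryType": "oak:QueryIndexDefinition",
    "type": "lucene",
    "compatVersion": 2,
    "async": ["async", "nrt"],
    "aggregates": {
      "jcr:primaryType": "nt:unstructured",
      "nt:file": {
        "jcr:primaryType": "nt:unstructured",
        "include0": {
          "jcr:primaryType": "nt:unstructured",
          "path": "jcr:content"
        }
      }
    },
    "indexRules": {
      "jcr:primaryType": "nt:unstructured",
      "nt:file": {
        "jcr:primaryType": "nt:unstructured",
        "properties": {
          "jcr:primaryType": "nt:unstructured",
          "allProps": {
            "jcr:primaryType": "nt:unstructured",
            "name": ".*",
            "isRegexp": true,
            "nodeScopeIndex": true
          }
        }
      }
    }
  }
}
```

With an index along these lines, a query such as SELECT * FROM [nt:file] AS f WHERE CONTAINS(f.*, 'someword') would be the one expected to match the file content. Because the index is asynchronous, results only show up once the async indexing cycle has processed the saved node, so a test that queries immediately after saving may see nothing.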