Hi all,

I was able to find results, but only when searching across all properties, i.e. with a query like this:

    SELECT * FROM [nt:unstructured] WHERE CONTAINS(*, 'text')

and indexing all properties with an index like this:

    /oak:index/assetType
      - jcr:primaryType = "oak:QueryIndexDefinition"
      - type = "lucene"
      - compatVersion = 2
      - async = "async"
      + indexRules
        - jcr:primaryType = "nt:unstructured"
        + nt:base
          + properties
            - jcr:primaryType = "nt:unstructured"
            + allProps
              - name = ".*"
              - isRegexp = true
              - nodeScopeIndex = true

This means that searching on indexed binary data works. What I would really like, though, is to query only a specific property and to index only that property. Why that does not work remains a mystery to me. Do you have an explanation for this different behaviour?

Thanks

Best regards,
Raffaele Gambelli
Senior Java Developer
E raffaele.gambe...@cegeka.com
[CEGEKA] Via Ettore Cristoni, 84 IT-40033 Bologna (IT), Italy
T +39 02 2544271
WWW.CEGEKA.COM <https://www.cegeka.com/>

________________________________
From: Raffaele Gambelli <raffaele.gambe...@cegeka.com.INVALID>
Sent: Thursday, September 12, 2024 10:20 AM
To: users@jackrabbit.apache.org <users@jackrabbit.apache.org>
Subject: Re: Indexing a binary and searching with contains, help request

Thanks Julian and Thomas,

* yes, I know https://oakutils.appspot.com/generate/index, but it did not help me accomplish my task.
* I have already posted the same question on Stack Overflow: https://stackoverflow.com/questions/78973742/indexing-a-binary-and-searching-with-contains-cannot-find-results

I think and hope that a JUnit test or something similar exists in the Oak repository that covers a scenario like mine, which I think is fairly typical: binary data (text/plain or application/pdf) is indexed, and a full-text search with CONTAINS finds it.

Is anyone able to find it and give me a link, please?
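For reference, the two query shapes discussed in this thread (searching any property versus one specific property) can be written out side by side. The following is only a minimal sketch: the class and method names are hypothetical, and `jcr:data` is just a placeholder for "the specific property", which the thread does not name. It builds the JCR-SQL2 strings and nothing more; whether the property-scoped form returns hits depends on the index definition, which this sketch does not address.

```java
// Hypothetical helper, not part of Oak or Jackrabbit: it only assembles
// JCR-SQL2 full-text query strings and does not talk to a repository.
final class FullTextQueries {

    private FullTextQueries() {
    }

    // Wraps a value in single quotes, doubling any embedded single quote.
    static String quote(String text) {
        return "'" + text.replace("'", "''") + "'";
    }

    // Node-scope search: CONTAINS(*, ...) matches text in any indexed property.
    static String anyProperty(String nodeType, String text) {
        return "SELECT * FROM [" + nodeType + "] WHERE CONTAINS(*, " + quote(text) + ")";
    }

    // Property-scope search: CONTAINS([property], ...) targets one property only.
    static String singleProperty(String nodeType, String property, String text) {
        return "SELECT * FROM [" + nodeType + "] WHERE CONTAINS([" + property + "], "
                + quote(text) + ")";
    }

    public static void main(String[] args) {
        // Prints: SELECT * FROM [nt:unstructured] WHERE CONTAINS(*, 'text')
        System.out.println(anyProperty("nt:unstructured", "text"));
        // "jcr:data" is an assumed property name, used here for illustration.
        // Prints: SELECT * FROM [nt:unstructured] WHERE CONTAINS([jcr:data], 'text')
        System.out.println(singleProperty("nt:unstructured", "jcr:data", "text"));
    }
}
```

The first form is the query that already returns results with the `allProps` rule above; the second is the property-scoped variant that the question is about.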
Best regards,
Raffaele Gambelli

________________________________
From: Thomas Mueller <muel...@adobe.com.INVALID>
Sent: Thursday, September 12, 2024 9:19 AM
To: users@jackrabbit.apache.org <users@jackrabbit.apache.org>
Subject: Re: Indexing a binary and searching with contains, help request

Hi,

I'm not sure if you are aware of the following; it might help:

https://oakutils.appspot.com/generate/index
https://www.aemstuff.com/blogs/feb/aemindexcheatsheat.html
https://experienceleague.adobe.com/docs/experience-manager-65/assets/JCR_query_cheatsheet-v1.1.pdf

These were written for the Adobe AEM product, but I find them useful even outside of AEM.
And here is an example index definition:

    {
      "/oak:index/acmeAsset-1": {
        "compatVersion": 2,
        "type": "lucene",
        "tags": ["asset"],
        "async": ["async", "nrt"],
        "includedPaths": ["/content/dam"],
        "jcr:primaryType": "oak:QueryIndexDefinition",
        "evaluatePathRestrictions": true,
        "maxFieldLength": 100000,
        "aggregates": {
          "jcr:primaryType": "nt:unstructured",
          "dam:Asset": {
            "jcr:primaryType": "nt:unstructured",
            "include0": { "path": "jcr:content", "jcr:primaryType": "nt:unstructured" },
            "include1": { "path": "jcr:content/metadata", "jcr:primaryType": "nt:unstructured" },
            "include2": { "path": "jcr:content/metadata/*", "jcr:primaryType": "nt:unstructured" },
            "include3": { "path": "jcr:content/renditions", "jcr:primaryType": "nt:unstructured" },
            "include4": { "path": "jcr:content/renditions/original", "jcr:primaryType": "nt:unstructured" },
            "include5": { "path": "jcr:content/renditions/original/jcr:content", "jcr:primaryType": "nt:unstructured" },
            "include6": { "path": "jcr:content/comments", "jcr:primaryType": "nt:unstructured" },
            "include7": { "path": "jcr:content/comments/*", "jcr:primaryType": "nt:unstructured" },
            "include8": { "path": "jcr:content/data/master", "jcr:primaryType": "nt:unstructured" },
            "include9": { "path": "jcr:content/usages", "jcr:primaryType": "nt:unstructured" },
            "include10": { "path": "jcr:content/renditions/text.txt/jcr:content", "jcr:primaryType": "nt:unstructured" }
          }
        },
        "facets": {
          "jcr:primaryType": "nt:unstructured",
          "topChildren": "100",
          "secure": "insecure"
        },
        "indexRules": {
          "jcr:primaryType": "nt:unstructured",
          "dam:Asset": {
            "jcr:primaryType": "nt:unstructured",
            "properties": {
              "jcr:primaryType": "nt:unstructured",
              "jcrLastModified": {
                "ordered": true,
                "name": "jcr:content/jcr:lastModified",
                "propertyIndex": true,
                "jcr:primaryType": "nt:unstructured",
                "type": "Date"
              },
              "jcrTitle": {
                "useInSpellcheck": true,
                "useInSuggest": true,
                "nodeScopeIndex": true,
                "name": "jcr:content/jcr:title",
                "propertyIndex": true,
                "boost": 2.0,
                "jcr:primaryType": "nt:unstructured"
              },
              "jcrDescription": {
                "nodeScopeIndex": true,
                "useInSpellcheck": true,
                "name": "jcr:content/jcr:description",
                "propertyIndex": true,
                "jcr:primaryType": "nt:unstructured",
                "useInSuggest": true
              },
              "jcrCreated": {
                "ordered": true,
                "name": "jcr:created",
                "propertyIndex": true,
                "jcr:primaryType": "nt:unstructured",
                "type": "Date"
              },
              "nodeName": {
                "nodeScopeIndex": true,
                "name": ":nodeName",
                "jcr:primaryType": "nt:unstructured",
                "useInSuggest": true
              }
            }
          }
        }
      }
    }

I wonder if nowadays you would get more answers on stackoverflow.com? I'm not sure...

Regards,
Thomas