Hi all,

I was able to get results, but only when searching across all properties, with 
a query like this:

SELECT * FROM [nt:unstructured] WHERE CONTAINS(*, 'text')

and an index that covers all properties, like this:

/oak:index/assetType
  - jcr:primaryType = "oak:QueryIndexDefinition"
  - type = "lucene"
  - compatVersion = 2
  - async = "async"
  + indexRules
    - jcr:primaryType = "nt:unstructured"
    + nt:base
      + properties
        - jcr:primaryType = "nt:unstructured"
        + allProps
          - name = ".*"
          - isRegexp = true
          - nodeScopeIndex = true

This means that full-text search on the indexed binary data works. What I would 
really like, though, is to query only one specific property and to index only 
that property. Why that does not work remains a mystery.
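
To make the goal concrete, here is a minimal sketch of a property-scoped setup. 
The index name "binaryText" and the property name "myBinary" are hypothetical, 
chosen only for illustration. "analyzed" is the flag that the Oak Lucene 
documentation describes for properties used in a property-level CONTAINS. 
Whether text extracted from a binary value ends up in that property-level 
field, rather than only in the node-level full-text field, is the open question 
here, so treat this as an illustration of the intent and not as a working 
solution.

```
/oak:index/binaryText              (hypothetical index name)
  - jcr:primaryType = "oak:QueryIndexDefinition"
  - type = "lucene"
  - compatVersion = 2
  - async = "async"
  + indexRules
    - jcr:primaryType = "nt:unstructured"
    + nt:base
      + properties
        - jcr:primaryType = "nt:unstructured"
        + myBinary
          - name = "myBinary"      (hypothetical property name)
          - analyzed = true

SELECT * FROM [nt:unstructured] WHERE CONTAINS([myBinary], 'text')
```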

Do you have some explanation for this different behaviour?

Thanks


Cordiali saluti / Best regards,

Raffaele Gambelli
Senior Java Developer
E  raffaele.gambe...@cegeka.com

CEGEKA, Via Ettore Cristoni, 84
IT-40033 Bologna (IT), Italy
T +39 02 2544271
https://www.cegeka.com/



________________________________
From: Raffaele Gambelli <raffaele.gambe...@cegeka.com.INVALID>
Sent: Thursday, September 12, 2024 10:20 AM
To: users@jackrabbit.apache.org <users@jackrabbit.apache.org>
Subject: Re: Indexing a binary and searching with contains, help request

Thanks Julian and Thomas,


  * Yes, I know https://oakutils.appspot.com/generate/index, but it did not 
    help me accomplish my task.
  * I have already posted the same question on Stack Overflow: 
    https://stackoverflow.com/questions/78973742/indexing-a-binary-and-searching-with-contains-cannot-find-results

I think, and hope, that a JUnit test or something similar exists in the Oak 
repository that covers a scenario like mine, which I think is fairly typical:

binary data (text/plain or application/pdf) that is indexed, and a full-text 
search with CONTAINS that finds it

Is anyone able to find it and send me a link, please?

Cordiali saluti / Best regards,

Raffaele Gambelli



________________________________
From: Thomas Mueller <muel...@adobe.com.INVALID>
Sent: Thursday, September 12, 2024 9:19 AM
To: users@jackrabbit.apache.org <users@jackrabbit.apache.org>
Subject: Re: Indexing a binary and searching with contains, help request

Hi,

I'm not sure if you are aware of the following, it might help:
https://oakutils.appspot.com/generate/index
https://www.aemstuff.com/blogs/feb/aemindexcheatsheat.html
https://experienceleague.adobe.com/docs/experience-manager-65/assets/JCR_query_cheatsheet-v1.1.pdf

These were written for the Adobe AEM product, but I find them useful even 
outside of AEM.

And here is an example index definition:

{
  "/oak:index/acmeAsset-1": {
    "compatVersion": 2,
    "type": "lucene",
    "tags": ["asset"],
    "async": ["async", "nrt"],
    "includedPaths": ["/content/dam"],
    "jcr:primaryType": "oak:QueryIndexDefinition",
    "evaluatePathRestrictions": true,
    "maxFieldLength": 100000,
    "aggregates": {
      "jcr:primaryType": "nt:unstructured",
      "dam:Asset": {
        "jcr:primaryType": "nt:unstructured",
        "include0": {
          "path": "jcr:content",
          "jcr:primaryType": "nt:unstructured"
        },
        "include1": {
          "path": "jcr:content/metadata",
          "jcr:primaryType": "nt:unstructured"
        },
        "include2": {
          "path": "jcr:content/metadata/*",
          "jcr:primaryType": "nt:unstructured"
        },
        "include3": {
          "path": "jcr:content/renditions",
          "jcr:primaryType": "nt:unstructured"
        },
        "include4": {
          "path": "jcr:content/renditions/original",
          "jcr:primaryType": "nt:unstructured"
        },
        "include5": {
          "path": "jcr:content/renditions/original/jcr:content",
          "jcr:primaryType": "nt:unstructured"
        },
        "include6": {
          "path": "jcr:content/comments",
          "jcr:primaryType": "nt:unstructured"
        },
        "include7": {
          "path": "jcr:content/comments/*",
          "jcr:primaryType": "nt:unstructured"
        },
        "include8": {
          "path": "jcr:content/data/master",
          "jcr:primaryType": "nt:unstructured"
        },
        "include9": {
          "path": "jcr:content/usages",
          "jcr:primaryType": "nt:unstructured"
        },
        "include10": {
          "path": "jcr:content/renditions/text.txt/jcr:content",
          "jcr:primaryType": "nt:unstructured"
        }
      }
    },
    "facets": {
      "jcr:primaryType": "nt:unstructured",
      "topChildren": "100",
      "secure": "insecure"
    },
    "indexRules": {
      "jcr:primaryType": "nt:unstructured",
      "dam:Asset": {
        "jcr:primaryType": "nt:unstructured",
        "properties": {
          "jcr:primaryType": "nt:unstructured",
          "jcrLastModified": {
            "ordered": true,
            "name": "jcr:content/jcr:lastModified",
            "propertyIndex": true,
            "jcr:primaryType": "nt:unstructured",
            "type": "Date"
          },
          "jcrTitle": {
            "useInSpellcheck": true,
            "useInSuggest": true,
            "nodeScopeIndex": true,
            "name": "jcr:content/jcr:title",
            "propertyIndex": true,
            "boost": 2.0,
            "jcr:primaryType": "nt:unstructured"
          },
          "jcrDescription": {
            "nodeScopeIndex": true,
            "useInSpellcheck": true,
            "name": "jcr:content/jcr:description",
            "propertyIndex": true,
            "jcr:primaryType": "nt:unstructured",
            "useInSuggest": true
          },
          "jcrCreated": {
            "ordered": true,
            "name": "jcr:created",
            "propertyIndex": true,
            "jcr:primaryType": "nt:unstructured",
            "type": "Date"
          },
          "nodeName": {
            "nodeScopeIndex": true,
            "name": ":nodeName",
            "jcr:primaryType": "nt:unstructured",
            "useInSuggest": true
          }
        }
      }
    }
  }
}


I wonder if nowadays, you would get more answers on stackoverflow.com? I'm not 
sure...

Regards,
Thomas
