Yep, your query works. Even better (I know you're not yet familiar with
all the features of SPARQL): there is the concept of inline data, e.g. via the
VALUES clause. Because ?type is bound up front instead of being filtered
afterwards, this can avoid a scan on the data:

SELECT ?g ?label ?type (COUNT(*) AS ?count) {
     VALUES ?type { madsrdf:Topic madsrdf:PersonalName }
     ?g ?p ?o .
     ?g rdf:type madsrdf:Authority ; madsrdf:authoritativeLabel ?label .
     FILTER (!isBlank(?g)) .
     ?g rdf:type ?type .
}
# COUNT(*) requires the other projected variables to be group keys
GROUP BY ?g ?label ?type

But now you're missing the full-text search. Did you just omit it from your
query for brevity?
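
In case it helps, here is a minimal sketch of how the text search could be
combined with the VALUES clause. It assumes the text: prefix is bound to the
jena-text namespace, that madsrdf:authoritativeLabel is mapped to a Lucene
field as in your config, and it uses "Man" from your example purely as a
placeholder search string:

```sparql
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX madsrdf: <http://www.loc.gov/mads/rdf/v1#>
PREFIX text:    <http://jena.apache.org/text#>

SELECT ?g ?label ?type {
     # restrict the result to the two types of interest
     VALUES ?type { madsrdf:Topic madsrdf:PersonalName }
     # full-text lookup on the field mapped to madsrdf:authoritativeLabel
     # ("Man" is only a placeholder search string)
     ?g text:query (madsrdf:authoritativeLabel "Man") .
     ?g rdf:type ?type ;
        madsrdf:authoritativeLabel ?label .
}
```

The text:query pattern narrows ?g to the index hits, so the type and label
patterns only have to be checked for those resources.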

On 07.01.21 11:00, 李惠玲 wrote:
> Hi Lorenz,
>
> Thank you for the reply.
> At first we thought we needed to adjust the config file to achieve what we
> want, so we adjusted it a few times. Using "madsrdf:elementList" was one of
> those attempts (it seemed this could index all the elements underneath); of
> course, this didn't work.
> When we read your replies, Andy mentioned "In a single jena-text Lucene
> index, all the values of some predicate are indexed in the same Lucene field.
> Predicates in RDF are globally defined relationships.", and you mentioned
> "it's possible via SPARQL". We thought maybe we had been thinking in the
> wrong direction; one of the reasons is probably that we're not that familiar
> with SPARQL query syntax.
>
> So we looked further into it, found out that there is a "FILTER" syntax, and
> tried the following query:
>
> SELECT ?g ?label ?type (COUNT(*) as ?count) {
>      ?g ?p ?o .
>      ?g rdf:type madsrdf:Authority ; madsrdf:authoritativeLabel ?label .
>      FILTER (!isBlank(?g)) .
>      ?g rdf:type ?type .
>      FILTER(?type = madsrdf:Topic || ?type = madsrdf:PersonalName) .
> }
>
> and changed the config file back to:
>
> text:map (
>         [   
>             text:field "authoritativeLabel" ; 
>             text:predicate madsrdf:authoritativeLabel ;
>         ]
>         [   
>             text:field "variantLabel" ; 
>             text:predicate madsrdf:variantLabel ;
>         ]
>         [   
>             text:field "citation-note" ; 
>             text:predicate madsrdf:citation-note ;
>         ]
>         [   
>             text:field "citation-source" ; 
>             text:predicate madsrdf:citation-source ;
>         ]
>     ) .
>
> After these changes, we now get query results like this (it seems to work
> for now):
>
> In search bar: Man
>
> #     Label                       Concept
> --------------------------------------------------
> 1     Mann, Klaus, 1906-1949      PersonalName
> 2     Man                         Topic
>
>
> Perhaps I still don't get the point of what you and Andy tried to explain 
> (sorry about this), but what you've said did give us some inspiration, and 
> that is greatly appreciated.
>
> Regards,
> Huiling Lee
> -----Original Message-----
> From: Lorenz Buehmann <[email protected]> 
> Sent: Wednesday, January 6, 2021 7:51 PM
> To: [email protected]
> Subject: Re: How to index different types of RDF file in one data set
>
>
> On 06.01.21 12:17, 李惠玲 wrote:
>> What we are trying to do is this: after searching for a string, the
>> results should list triples of both content types, if the string matches
>> their literals.
>>
>> Thank you for your replies (and the hint). We were probably thinking about
>> querying the RDF type the wrong way; yes, we should try via SPARQL, not the
>> config file.
> What does this mean? How do you access your data right now if not via SPARQL? 
> I mean, you did put it into a triple store, didn't you?
>
> Something like
>
> select * where {
> ?s a madsrdf:PersonalName ;
>     text:query "some_search_string_here"
> }
>
>
> Also, as Andy pointed out, your index creation seems odd. You add an index on 
> the madsrdf:elementList predicate, but according to your sample data this 
> doesn't link to string literals at all. It should be 
> madsrdf:authoritativeLabel in your config file.
>
>> So, we'll keep on fighting!
>>
>> Thanks again,
>> Huiling Lee
>> -----Original Message-----
>> From: Lorenz Buehmann <[email protected]>
>> Sent: Wednesday, January 6, 2021 4:23 PM
>> To: [email protected]
>> Subject: Re: How to index different types of RDF file in one data set
>>
>> In addition to what Andy said:
>>
>> Even if you don't introduce separate subproperties for each type, why 
>> shouldn't you be able to distinguish both in a query? I mean, there are RDF 
>> types for both, so just append another triple pattern. I doubt it matters if 
>> the literals of both types are in the same index.
>>
>> I mean, the well-known property rdfs:label is also used for any type and 
>> still people are able to distinguish by type.
>>
>> So, yes it's possible via SPARQL - if this wasn't clear.
>>
>> On 05.01.21 21:57, Andy Seaborne wrote:
>>> Hi there,
>>>
>>> I'm not sure what you wish to do - could you sketch a query you want 
>>> to ask of the data?
>>>
>>> In a single jena-text Lucene index, all the values of some predicate 
>>> are indexed in the same Lucene field. Predicates in RDF are globally 
>>> defined relationships.
>>>
>>> If you want to treat madsrdf:authoritativeLabel in one RDF graph as 
>>> "PersonalName" and the same predicate madsrdf:authoritativeLabel as 
>>> "Topic", then it looks like you really have a subproperty hierarchy.
>>> Maybe that would help.
>>>
>>>     Andy
>>>
>>>>           [
>>>>               text:field "topic" ;
>>>>               text:predicate madsrdf:elementList ;
>>>>           ]
>>> madsrdf:elementList is a list so presumably isn't indexed
>>>
>>>
>>> On 05/01/2021 10:48, 李惠玲 wrote:
>>>> Dear Sirs,
>>>>
>>>> Our project uses a Jena Fuseki server (3.18.0, SNAPSHOT version) with
>>>> Lucene (7.7.x) as the full-text search engine.
>>>>
>>>> Right now, there are two types of RDF files in our triple store: one
>>>> is “PersonalName”, the other is “Topic”. When we separate them into
>>>> different datasets with two config files, they are indexed
>>>> successfully, but separately.
>>>>
>>>> But when we tried to index them together, since they have the same tag
>>>> “madsrdf:authoritativeLabel”, we couldn’t find instructions on how to
>>>> distinguish which is “Topic” and which is “PersonalName”.
>>>>
>>>> We hope you can share some experience or suggestions: how should we set
>>>> up the config file to distinguish the different types of RDF file
>>>> correctly?
>>>>
>>>> Here are two RDF examples:
>>>>
>>>> Topic:
>>>> ---------------------------------------------------------------------
>>>>
>>>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>>>>     <madsrdf:Topic xmlns:madsrdf="http://www.loc.gov/mads/rdf/v1#"
>>>>                    rdf:about="http://ld.ncl.edu.tw/subject/981038693688004786">
>>>>        <rdf:type rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>>>        <madsrdf:authoritativeLabel xml:lang="en">公設辯護</madsrdf:authoritativeLabel>
>>>                                              ^^^^
>>>>        <madsrdf:elementList rdf:parseType="Collection">
>>>>           <madsrdf:TopicElement>
>>>>              <madsrdf:elementValue xml:lang="en">公設辯護</madsrdf:elementValue>
>>>>           </madsrdf:TopicElement>
>>>>        </madsrdf:elementList>
>>>>        <madsrdf:hasVariant>
>>>>           <madsrdf:Topic>
>>>>              <rdf:type rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>>>              <madsrdf:variantLabel>辯護人</madsrdf:variantLabel>
>>>>              <madsrdf:elementList rdf:parseType="Collection">
>>>>                 <madsrdf:TopicElement>
>>>>                    <madsrdf:elementValue xml:lang="en">辯護人</madsrdf:elementValue>
>>>>                 </madsrdf:TopicElement>
>>>>              </madsrdf:elementList>
>>>>           </madsrdf:Topic>
>>>>        </madsrdf:hasVariant>
>>>>        <identifiers:lccn xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/"/>
>>>>        <identifiers:id xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/">(ChTaNC)sh0001412</identifiers:id>
>>>>        <madsrdf:adminMetadata>
>>>>           <ri:RecordInfo xmlns:ri="http://id.loc.gov/ontologies/RecordInfo#">
>>>>              <ri:recordChangeDate rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-12-30T00:00:00</ri:recordChangeDate>
>>>>              <ri:recordStatus rdf:datatype="http://www.w3.org/2001/XMLSchema#string">new</ri:recordStatus>
>>>>              <ri:recordContentSource rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>>>>              <ri:languageOfCataloging rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>>>>           </ri:RecordInfo>
>>>>        </madsrdf:adminMetadata>
>>>>     </madsrdf:Topic>
>>>> </rdf:RDF>
>>>> ---------------------------------------------------------------------
>>>>
>>>>
>>>> PersonalName:
>>>> ---------------------------------------------------------------------
>>>>
>>>> <rdf:RDF>
>>>> <madsrdf:PersonalName rdf:about="http://ld.ncl.edu.tw/authority/981038686683804786981038686683804786">
>>>> <rdf:type rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>>> <madsrdf:authoritativeLabel xml:lang="en">蘇, 慧婕</madsrdf:authoritativeLabel>
>>>> <madsrdf:elementList rdf:parseType="Collection">
>>>> <madsrdf:FullNameElement>
>>>> <madsrdf:elementValue xml:lang="en">蘇, 慧婕</madsrdf:elementValue>
>>>> </madsrdf:FullNameElement>
>>>> </madsrdf:elementList>
>>>> <madsrdf:hasVariant>
>>>> <madsrdf:PersonalName>
>>>> <rdf:type rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>>> <madsrdf:variantLabel>Su, Huijie</madsrdf:variantLabel>
>>>> <madsrdf:elementList rdf:parseType="Collection">
>>>> <madsrdf:FullNameElement>
>>>> <madsrdf:elementValue xml:lang="en">Su, Huijie</madsrdf:elementValue>
>>>> </madsrdf:FullNameElement>
>>>> </madsrdf:elementList>
>>>> </madsrdf:PersonalName>
>>>> </madsrdf:hasVariant>
>>>> <madsrdf:hasVariant>
>>>> <madsrdf:PersonalName>
>>>> <rdf:type rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>>> <madsrdf:variantLabel>Su, Hui-Chieh</madsrdf:variantLabel>
>>>> <madsrdf:elementList rdf:parseType="Collection">
>>>> <madsrdf:FullNameElement>
>>>> <madsrdf:elementValue xml:lang="en">Su, Hui-Chieh</madsrdf:elementValue>
>>>> </madsrdf:FullNameElement>
>>>> </madsrdf:elementList>
>>>> </madsrdf:PersonalName>
>>>> </madsrdf:hasVariant>
>>>> <madsrdf:hasSource>
>>>> <madsrdf:Source>
>>>> <madsrdf:citation-source>論國會議員產生方式之規範及其憲法界限, 2003:</madsrdf:citation-source>
>>>> <madsrdf:citation-note xml:lang="en">書名頁(國立臺灣大學法律學硏究所碩士)</madsrdf:citation-note>
>>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>>> </madsrdf:Source>
>>>> </madsrdf:hasSource>
>>>> <madsrdf:hasSource>
>>>> <madsrdf:Source>
>>>> <madsrdf:citation-source>國立臺灣大學法律學系網頁, 檢索日期: 2020/11/25</madsrdf:citation-source>
>>>> <madsrdf:citation-note xml:lang="en">(女; Hui-chieh Su)</madsrdf:citation-note>
>>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>>> </madsrdf:Source>
>>>> </madsrdf:hasSource>
>>>> <madsrdf:hasSource>
>>>> <madsrdf:Source>
>>>> <madsrdf:citation-source>NTU Scholar(臺大學術典藏)網頁, 檢索日期: 2020/11/25</madsrdf:citation-source>
>>>> <madsrdf:citation-note xml:lang="en">(HUI-CHIEH SU)</madsrdf:citation-note>
>>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>>> </madsrdf:Source>
>>>> </madsrdf:hasSource>
>>>> <madsrdf:editorialNote>
>>>> 臺大教師權威紀錄, 英文權威名稱係以NTU Scholar(臺大學術典藏)網頁著錄(Su, Hui-Chieh)
>>>> </madsrdf:editorialNote>
>>>> <madsrdf:note>女; 研究領域: 國家學, 憲法理論, 基本權理論, 言論自由, 轉型正義</madsrdf:note>
>>>> <identifiers:lccn/>
>>>> <identifiers:id>(TW-TaNTU)981038686683804786</identifiers:id>
>>>> <madsrdf:adminMetadata>
>>>> <ri:RecordInfo>
>>>> <ri:recordChangeDate rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-11-25T00:00:00</ri:recordChangeDate>
>>>> <ri:recordStatus rdf:datatype="http://www.w3.org/2001/XMLSchema#string">new</ri:recordStatus>
>>>> <ri:recordContentSource rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>>>> <ri:languageOfCataloging rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>>>> </ri:RecordInfo>
>>>> </madsrdf:adminMetadata>
>>>> </madsrdf:PersonalName>
>>>> </rdf:RDF>
>>>> ---------------------------------------------------------------------
>>>>
>>>>
>>>> One of the config files looks like:
>>>> ---------------------------------------------------------------------
>>>>
>>>> <#entMap> a text:EntityMap ;
>>>>      text:defaultField     "authoritativeLabel" ;
>>>>      text:entityField      "uri" ;
>>>>      text:uidField         "uid" ;
>>>>      text:langField        "lang" ;
>>>>      text:graphField       "graph" ;
>>>>      text:map (
>>>>          [
>>>>              text:field "authoritativeLabel" ;
>>>>              text:predicate madsrdf:authoritativeLabel ;
>>>>          ]
>>>>          [
>>>>              text:field "variantLabel" ;
>>>>              text:predicate madsrdf:variantLabel ;
>>>>          ]
>>>>          [
>>>>              text:field "citation-note" ;
>>>>              text:predicate madsrdf:citation-note ;
>>>>          ]
>>>>          [
>>>>              text:field "citation-source" ;
>>>>              text:predicate madsrdf:citation-source ;
>>>>          ]
>>>>          [
>>>>              text:field "topic" ;
>>>>              text:predicate madsrdf:elementList ;
>>>>          ]
>>>>      ) .
>>>>
>>>> ---------------------------------------------------------------------
>>>>
>>>>
>>>> Thank you for reading this post.
>>>>
>>>> Best Regards,
>>>> Huiling Lee
>>>>
