Yep, your query works. Even better - I know you're not yet familiar with
all the features of SPARQL - there is the concept of inline data, e.g. via
the VALUES clause. This can avoid a scan over the data:
SELECT ?g ?label ?type (COUNT(*) AS ?count) {
  VALUES ?type { madsrdf:Topic madsrdf:PersonalName }
  ?g ?p ?o .
  ?g rdf:type madsrdf:Authority ;
     madsrdf:authoritativeLabel ?label .
  FILTER (!isBlank(?g)) .
  ?g rdf:type ?type .
}
But now you're missing the full-text search - did you just omit it from your
query for brevity?
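
In case it helps, here is a minimal sketch of how the full-text lookup could be
combined with VALUES. It assumes the jena-text index from your config (the
field mapped to madsrdf:authoritativeLabel), uses "Man" only as a placeholder
search string, and leaves out the ?g ?p ?o pattern and the COUNT to keep it
short. I have not run it against your data:

```sparql
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX madsrdf: <http://www.loc.gov/mads/rdf/v1#>
PREFIX text:    <http://jena.apache.org/text#>

SELECT ?g ?label ?type
WHERE {
  # Inline data: restrict ?type to the two classes of interest
  VALUES ?type { madsrdf:Topic madsrdf:PersonalName }

  # Full-text lookup against the indexed predicate ("Man" is a placeholder)
  ?g text:query (madsrdf:authoritativeLabel "Man") .

  # Join the hits back to the graph to get type and label
  ?g rdf:type madsrdf:Authority ;
     rdf:type ?type ;
     madsrdf:authoritativeLabel ?label .

  FILTER (!isBlank(?g))
}
```

The text:query pattern narrows the candidates through the Lucene index first,
and the ordinary triple patterns then pick out the type, so both kinds of
entity can share one index field.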
On 07.01.21 11:00, 李惠玲 wrote:
> Hi Lorenz,
>
> Thank you for the reply.
> At first we thought we needed to adjust the config file to achieve what we
> want, so we adjusted it a few times. Using "madsrdf:elementList" was one of
> those attempts (it seemed this could index all elements underneath); of
> course, this didn't work.
> When we read your replies, Andy mentioned "In a single jena-text Lucene
> index, all the values of some predicate are indexed in the same Lucene field.
> Predicates in RDF are globally defined relationships.", and you mentioned
> "it's possible via SPARQL". We thought maybe we had been thinking in the
> wrong direction; one of the reasons is probably that we're not that familiar
> with SPARQL query syntax.
>
> So we looked further into it, found out there is a "FILTER" syntax, and
> tried the following query:
>
> SELECT ?g ?label ?type (COUNT(*) as ?count) {
> ?g ?p ?o .
> ?g rdf:type madsrdf:Authority ; madsrdf:authoritativeLabel ?label .
> FILTER (!isBlank(?g)) .
> ?g rdf:type ?type .
> FILTER(?type = madsrdf:Topic || ?type = madsrdf:PersonalName) .
> }
>
> and changed the config file back to:
>
> text:map (
> [
> text:field "authoritativeLabel" ;
> text:predicate madsrdf:authoritativeLabel ;
> ]
> [
> text:field "variantLabel" ;
> text:predicate madsrdf:variantLabel ;
> ]
> [
> text:field "citation-note" ;
> text:predicate madsrdf:citation-note ;
> ]
> [
> text:field "citation-source" ;
> text:predicate madsrdf:citation-source ;
> ]
> ) .
>
> After this, we now get query results like the following (it seems to work
> for now):
>
> In search bar: Man
>
> #   Label                     Concept
> ------------------------------------------------
> 1   Mann, Klaus, 1906-1949    PersonalName
> 2   Man                       Topic
>
>
> Perhaps I still don't get the point of what you and Andy tried to explain
> (sorry about this), but what you've said did give us some inspiration, and
> that is greatly appreciated.
>
> Regards,
> Huiling Lee
> -----Original Message-----
> From: Lorenz Buehmann <[email protected]>
> Sent: Wednesday, January 6, 2021 7:51 PM
> To: [email protected]
> Subject: Re: How to index different types of RDF file in one data set
>
>
> On 06.01.21 12:17, 李惠玲 wrote:
>> What we are trying to do is this: after querying a string, the results
>> should list triples of both content types if the string matches the literals.
>>
>> Thank you for your replies (and the hint). We were probably thinking about
>> querying by RDF type in the wrong way; yes, we should try via SPARQL, not
>> the config file.
> What does this mean? How do you access your data right now if not via SPARQL?
> I mean, did you put it into a triple store or not?
>
> Something like
>
> select * where {
> ?s a madsrdf:PersonalName ;
> text:query "some_search_string_here"
> }
>
>
> Also, as Andy pointed out, your index creation seems odd. You add an index on
> the madsrdf:elementList predicate, but according to your sample data this
> doesn't link to string literals at all. It should be
> madsrdf:authoritativeLabel in your config file.
>
>> So, we'll keep on fighting!
>>
>> Thanks again,
>> Huiling Lee
>> -----Original Message-----
>> From: Lorenz Buehmann <[email protected]>
>> Sent: Wednesday, January 6, 2021 4:23 PM
>> To: [email protected]
>> Subject: Re: How to index different types of RDF file in one data set
>>
>> In addition to what Andy said:
>>
>> Even if you don't introduce separate subproperties for each type, why
>> shouldn't you be able to distinguish both in a query? I mean, there are RDF
>> types for both, so just append another triple pattern. I doubt it matters if
>> the literals of both types are in the same index.
>>
>> I mean, the well-known property rdfs:label is also used for any type and
>> still people are able to distinguish by type.
>>
>> So, yes it's possible via SPARQL - if this wasn't clear.
>>
>> On 05.01.21 21:57, Andy Seaborne wrote:
>>> Hi there,
>>>
>>> I'm not sure what you wish to do - could you sketch a query you want
>>> to ask of the data?
>>>
>>> In a single jena-text Lucene index, all the values of some predicate
>>> are indexed in the same Lucene field. Predicates in RDF are globally
>>> defined relationships.
>>>
>>> If you want to treat madsrdf:authoritativeLabel in one RDF graph as
>>> "PersonalName" and the same predicate madsrdf:authoritativeLabel as
>>> "Topic", then it looks like you really have a subproperty hierarchy.
>>> Maybe that would help.
>>>
>>> Andy
>>>
>>>> [
>>>> text:field "topic" ;
>>>> text:predicate madsrdf:elementList ;
>>>> ]
>>> madsrdf:elementList is a list so presumably isn't indexed
>>>
>>>
>>> On 05/01/2021 10:48, 李惠玲 wrote:
>>>> Dear Sirs,
>>>>
>>>> Our project uses a Jena Fuseki server (3.18.0, SNAPSHOT version) with
>>>> Lucene (7.7.x) as the full-text search engine.
>>>>
>>>> Right now there are two types of RDF files in our triple store: one
>>>> is “PersonalName”, the other is “Topic”. When we separate them into
>>>> different data sets with two config files, they can be indexed
>>>> successfully, but “separately”.
>>>>
>>>> But when we tried to index them together, since they share the same tag
>>>> “madsrdf:authoritativeLabel”, we couldn’t find instructions on how to
>>>> distinguish which is “Topic” and which is “PersonalName”.
>>>>
>>>> We hope you can share some experience or suggestions: how should the
>>>> config file be set up to distinguish the different types of RDF file
>>>> correctly?
>>>>
>>>> Here are two RDF examples:
>>>>
>>>> Topic:
>>>> --------------------------------------------------------------------
>>>>
>>>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>>>> <madsrdf:Topic xmlns:madsrdf="http://www.loc.gov/mads/rdf/v1#"
>>>> rdf:about="http://ld.ncl.edu.tw/subject/981038693688004786">
>>>> <rdf:type
>>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>>> <madsrdf:authoritativeLabel
>>>> xml:lang="en">公設辯護</madsrdf:authoritativeLabel>
>>> ^^^^
>>>> <madsrdf:elementList rdf:parseType="Collection">
>>>> <madsrdf:TopicElement>
>>>> <madsrdf:elementValue
>>>> xml:lang="en">公設辯護</madsrdf:elementValue>
>>>> </madsrdf:TopicElement>
>>>> </madsrdf:elementList>
>>>> <madsrdf:hasVariant>
>>>> <madsrdf:Topic>
>>>> <rdf:type
>>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>>> <madsrdf:variantLabel>辯護人</madsrdf:variantLabel>
>>>> <madsrdf:elementList rdf:parseType="Collection">
>>>> <madsrdf:TopicElement>
>>>> <madsrdf:elementValue
>>>> xml:lang="en">辯護人</madsrdf:elementValue>
>>>> </madsrdf:TopicElement>
>>>> </madsrdf:elementList>
>>>> </madsrdf:Topic>
>>>> </madsrdf:hasVariant>
>>>> <identifiers:lccn
>>>> xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/"/>
>>>> <identifiers:id
>>>> xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/">(ChTaNC)sh0001412</identifiers:id>
>>>> <madsrdf:adminMetadata>
>>>> <ri:RecordInfo
>>>> xmlns:ri="http://id.loc.gov/ontologies/RecordInfo#">
>>>> <ri:recordChangeDate
>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-12-30T00:00:00</ri:recordChangeDate>
>>>> <ri:recordStatus
>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">new</ri:recordStatus>
>>>> <ri:recordContentSource
>>>> rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>>>> <ri:languageOfCataloging
>>>> rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>>>> </ri:RecordInfo>
>>>> </madsrdf:adminMetadata>
>>>> </madsrdf:Topic>
>>>> </rdf:RDF>
>>>> --------------------------------------------------------------------
>>>>
>>>>
>>>> PersonalName:
>>>> --------------------------------------------------------------------
>>>>
>>>> <rdf:RDF>
>>>> <madsrdf:PersonalName
>>>> rdf:about="http://ld.ncl.edu.tw/authority/981038686683804786981038686683804786">
>>>> <rdf:type
>>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>>> <madsrdf:authoritativeLabel xml:lang="en">蘇, 慧婕</madsrdf:authoritativeLabel>
>>>> <madsrdf:elementList rdf:parseType="Collection">
>>>> <madsrdf:FullNameElement>
>>>> <madsrdf:elementValue xml:lang="en">蘇, 慧婕</madsrdf:elementValue>
>>>> </madsrdf:FullNameElement>
>>>> </madsrdf:elementList>
>>>> <madsrdf:hasVariant>
>>>> <madsrdf:PersonalName>
>>>> <rdf:type rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>>> <madsrdf:variantLabel>Su, Huijie</madsrdf:variantLabel>
>>>> <madsrdf:elementList rdf:parseType="Collection">
>>>> <madsrdf:FullNameElement>
>>>> <madsrdf:elementValue xml:lang="en">Su, Huijie</madsrdf:elementValue>
>>>> </madsrdf:FullNameElement>
>>>> </madsrdf:elementList>
>>>> </madsrdf:PersonalName>
>>>> </madsrdf:hasVariant>
>>>> <madsrdf:hasVariant>
>>>> <madsrdf:PersonalName>
>>>> <rdf:type rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>>> <madsrdf:variantLabel>Su, Hui-Chieh</madsrdf:variantLabel>
>>>> <madsrdf:elementList rdf:parseType="Collection">
>>>> <madsrdf:FullNameElement>
>>>> <madsrdf:elementValue xml:lang="en">Su, Hui-Chieh</madsrdf:elementValue>
>>>> </madsrdf:FullNameElement>
>>>> </madsrdf:elementList>
>>>> </madsrdf:PersonalName>
>>>> </madsrdf:hasVariant>
>>>> <madsrdf:hasSource>
>>>> <madsrdf:Source>
>>>> <madsrdf:citation-source>論國會議員產生方式之規範及其憲法界限,
>>>> 2003:</madsrdf:citation-source>
>>>> <madsrdf:citation-note
>>>> xml:lang="en">書名頁(國立臺灣大學法律學硏究所碩士)</madsrdf:citation-note>
>>>>
>>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>>> </madsrdf:Source>
>>>> </madsrdf:hasSource>
>>>> <madsrdf:hasSource>
>>>> <madsrdf:Source>
>>>> <madsrdf:citation-source>國立臺灣大學法律學系網頁, 檢索日期:
>>>> 2020/11/25</madsrdf:citation-source>
>>>> <madsrdf:citation-note xml:lang="en">(女; Hui-chieh
>>>> Su)</madsrdf:citation-note>
>>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>>> </madsrdf:Source>
>>>> </madsrdf:hasSource>
>>>> <madsrdf:hasSource>
>>>> <madsrdf:Source>
>>>> <madsrdf:citation-source>NTU Scholar(臺大學術典藏)網頁, 檢索日期:
>>>> 2020/11/25</madsrdf:citation-source>
>>>> <madsrdf:citation-note xml:lang="en">(HUI-CHIEH
>>>> SU)</madsrdf:citation-note>
>>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>>> </madsrdf:Source>
>>>> </madsrdf:hasSource>
>>>> <madsrdf:editorialNote>
>>>> 臺大教師權威紀錄, 英文權威名稱係以NTU Scholar(臺大學術典藏)網頁著錄(Su,
>>>> Hui-Chieh)
>>>> </madsrdf:editorialNote>
>>>> <madsrdf:note>女; 研究領域: 國家學, 憲法理論, 基本權理論, 言論自由,
>>>> 轉型正義</madsrdf:note>
>>>> <identifiers:lccn/>
>>>> <identifiers:id>(TW-TaNTU)981038686683804786</identifiers:id>
>>>> <madsrdf:adminMetadata>
>>>> <ri:RecordInfo>
>>>> <ri:recordChangeDate
>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-11-25T00:00:00</ri:recordChangeDate>
>>>> <ri:recordStatus
>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">new</ri:recordStatus>
>>>> <ri:recordContentSource
>>>> rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>>>> <ri:languageOfCataloging
>>>> rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>>>> </ri:RecordInfo>
>>>> </madsrdf:adminMetadata>
>>>> </madsrdf:PersonalName>
>>>> </rdf:RDF>
>>>> --------------------------------------------------------------------
>>>>
>>>>
>>>> One of the config files looks like:
>>>> --------------------------------------------------------------------
>>>>
>>>> <#entMap> a text:EntityMap ;
>>>> text:defaultField "authoritativeLabel" ;
>>>> text:entityField "uri" ;
>>>> text:uidField "uid" ;
>>>> text:langField "lang" ;
>>>> text:graphField "graph" ;
>>>> text:map (
>>>> [
>>>> text:field "authoritativeLabel" ;
>>>> text:predicate madsrdf:authoritativeLabel ;
>>>> ]
>>>> [
>>>> text:field "variantLabel" ;
>>>> text:predicate madsrdf:variantLabel ;
>>>> ]
>>>> [
>>>> text:field "citation-note" ;
>>>> text:predicate madsrdf:citation-note ;
>>>> ]
>>>> [
>>>> text:field "citation-source" ;
>>>> text:predicate madsrdf:citation-source ;
>>>> ]
>>>> [
>>>> text:field "topic" ;
>>>> text:predicate madsrdf:elementList ;
>>>> ]
>>>> ) .
>>>>
>>>> --------------------------------------------------------------------
>>>>
>>>>
>>>> Thank you for reading this post.
>>>>
>>>> Best Regards,
>>>> Huiling Lee
>>>>