On 06.01.21 12:17, 李惠玲 wrote:
> What we are trying to do is this: after querying a string, the results should
> show triples of both content types in the list, if the literals match;
>
> Thank you for your replies (and hint). We were probably thinking about
> querying by RDF type in the wrong way; yes, we should try via SPARQL, not the
> config file.
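What does this mean? How do you access your data right now, if not via
SPARQL? I mean, did you put it into a triple store or not?
Something like this should restrict the text matches to one type:
PREFIX madsrdf: <http://www.loc.gov/mads/rdf/v1#>
PREFIX text: <http://jena.apache.org/text#>
SELECT * WHERE {
  ?s a madsrdf:PersonalName ;
     text:query "some_search_string_here" .
}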
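If you want hits of both types in one result list, a variant of the same
idea is to bind the type instead of fixing it. This is only a sketch: it
assumes madsrdf:authoritativeLabel is mapped in your entity map (it is in
the config you posted), and the variable names ?type, ?label and ?score
are mine.

```sparql
PREFIX madsrdf: <http://www.loc.gov/mads/rdf/v1#>
PREFIX text:    <http://jena.apache.org/text#>

SELECT ?s ?type ?label ?score
WHERE {
  # full-text lookup on the indexed predicate; ?label is the matching literal
  (?s ?score ?label) text:query (madsrdf:authoritativeLabel "some_search_string_here") .
  # keep only the two types of interest and report which one each hit has
  ?s a ?type .
  VALUES ?type { madsrdf:PersonalName madsrdf:Topic }
}
ORDER BY DESC(?score)
```

Each row then carries its type, so the client can tell a Topic from a
PersonalName even though both live in the same Lucene field.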
Also, as Andy pointed out, your index creation seems odd. You add an
index on the madsrdf:elementList predicate, but according to your sample
data this doesn't link to string literals at all. It should be
madsrdf:authoritativeLabel in your config file.
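For illustration, the map could look roughly like this with the
madsrdf:elementList entry dropped. I've only kept the fields from your
posted config; the prefix declarations are what I'd expect at the top of
the file, and the rest of the assembler setup is assumed unchanged.

```turtle
PREFIX text:    <http://jena.apache.org/text#>
PREFIX madsrdf: <http://www.loc.gov/mads/rdf/v1#>

<#entMap> a text:EntityMap ;
    text:defaultField "authoritativeLabel" ;
    text:entityField  "uri" ;
    text:uidField     "uid" ;
    text:langField    "lang" ;
    text:graphField   "graph" ;
    text:map (
        # every predicate listed here should point at string literals
        [ text:field "authoritativeLabel" ; text:predicate madsrdf:authoritativeLabel ]
        [ text:field "variantLabel"       ; text:predicate madsrdf:variantLabel ]
        [ text:field "citation-note"      ; text:predicate madsrdf:citation-note ]
        [ text:field "citation-source"    ; text:predicate madsrdf:citation-source ]
        # the "topic" entry on madsrdf:elementList is left out: its object is an RDF list
    ) .
```

After changing the map, the existing Lucene index would have to be
rebuilt for the change to show up in query results.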
>
> So, we'll keep on fighting!
>
> Thanks again,
> Huiling Lee
> -----Original Message-----
> From: Lorenz Buehmann <[email protected]>
> Sent: Wednesday, January 6, 2021 4:23 PM
> To: [email protected]
> Subject: Re: How to index different types of RDF file in one data set
>
> In addition to what Andy said:
>
> Even if you don't introduce separate subproperties for each type, why
> shouldn't you be able to distinguish both in a query? I mean, there are RDF
> types for both, so just append another triple pattern. I doubt it matters if
> the literals of both types are in the same index.
>
> I mean, the well-known property rdfs:label is also used for any type and
> still people are able to distinguish by type.
>
> So, yes it's possible via SPARQL - if this wasn't clear.
>
> On 05.01.21 21:57, Andy Seaborne wrote:
>> Hi there,
>>
>> I'm not sure what you wish to do - could you sketch a query you want
>> to ask of the data?
>>
>> In a single jena-text Lucene index, all the values of some predicate
>> are indexed in the same Lucene field. Predicates in RDF are globally
>> defined relationships.
>>
>> If you want to treat madsrdf:authoritativeLabel in one RDF graph as
>> "PersonalName" and the same predicate madsrdf:authoritativeLabel as
>> "Topic", then it looks like you really have a subproperty hierarchy.
>> Maybe that would help.
>>
>> Andy
>>
>>> [
>>> text:field "topic" ;
>>> text:predicate madsrdf:elementList ;
>>> ]
>> madsrdf:elementList is a list so presumably isn't indexed
>>
>>
>> On 05/01/2021 10:48, 李惠玲 wrote:
>>> Dear Sirs,
>>>
>>> Our project uses a Jena Fuseki server (3.18.0, SNAPSHOT version)
>>> with Lucene (7.7.x) as the full-text search engine.
>>>
>>> Right now, there are two types of RDF files in our triple store: one
>>> is “PersonalName”, the other is “Topic”. When we separate them into
>>> different data sets with two config files, they can be indexed
>>> successfully, but “separately”.
>>>
>>> But when we tried to index them together, since they share the same tag
>>> “madsrdf:authoritativeLabel”, we couldn’t find instructions on how
>>> to distinguish which is “Topic” and which is “PersonalName”.
>>>
>>> We hope you can share some experience or suggestions on how to set up
>>> the config file to distinguish the different types of RDF file correctly.
>>>
>>> Here are two RDF examples:
>>>
>>> Topic:
>>> ---------------------------------------------------------------------
>>>
>>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>>> <madsrdf:Topic xmlns:madsrdf="http://www.loc.gov/mads/rdf/v1#"
>>>
>>> rdf:about="http://ld.ncl.edu.tw/subject/981038693688004786">
>>> <rdf:type
>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>> <madsrdf:authoritativeLabel
>>> xml:lang="en">公設辯護</madsrdf:authoritativeLabel>
>> ^^^^
>>> <madsrdf:elementList rdf:parseType="Collection">
>>> <madsrdf:TopicElement>
>>> <madsrdf:elementValue
>>> xml:lang="en">公設辯護</madsrdf:elementValue>
>>> </madsrdf:TopicElement>
>>> </madsrdf:elementList>
>>> <madsrdf:hasVariant>
>>> <madsrdf:Topic>
>>> <rdf:type
>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>> <madsrdf:variantLabel>辯護人</madsrdf:variantLabel>
>>> <madsrdf:elementList rdf:parseType="Collection">
>>> <madsrdf:TopicElement>
>>> <madsrdf:elementValue
>>> xml:lang="en">辯護人</madsrdf:elementValue>
>>> </madsrdf:TopicElement>
>>> </madsrdf:elementList>
>>> </madsrdf:Topic>
>>> </madsrdf:hasVariant>
>>> <identifiers:lccn
>>> xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/"/>
>>> <identifiers:id
>>> xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/">(ChTaNC)sh0001412</identifiers:id>
>>> <madsrdf:adminMetadata>
>>> <ri:RecordInfo
>>> xmlns:ri="http://id.loc.gov/ontologies/RecordInfo#">
>>> <ri:recordChangeDate
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-12-30T00:00:00</ri:recordChangeDate>
>>> <ri:recordStatus
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">new</ri:recordStatus>
>>> <ri:recordContentSource
>>> rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>>> <ri:languageOfCataloging
>>> rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>>> </ri:RecordInfo>
>>> </madsrdf:adminMetadata>
>>> </madsrdf:Topic>
>>> </rdf:RDF>
>>> ---------------------------------------------------------------------
>>>
>>>
>>> PersonalName:
>>> ---------------------------------------------------------------------
>>>
>>> <rdf:RDF>
>>> <madsrdf:PersonalName
>>> rdf:about="http://ld.ncl.edu.tw/authority/981038686683804786981038686683804786">
>>> <rdf:type
>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>> <madsrdf:authoritativeLabel xml:lang="en">蘇, 慧婕</madsrdf:authoritativeLabel>
>>> <madsrdf:elementList rdf:parseType="Collection">
>>> <madsrdf:FullNameElement> <madsrdf:elementValue xml:lang="en">蘇,
>>> 慧婕</madsrdf:elementValue>
>>> </madsrdf:FullNameElement>
>>> </madsrdf:elementList>
>>> <madsrdf:hasVariant>
>>> <madsrdf:PersonalName>
>>> <rdf:type rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>> <madsrdf:variantLabel>Su, Huijie</madsrdf:variantLabel>
>>> <madsrdf:elementList rdf:parseType="Collection">
>>> <madsrdf:FullNameElement> <madsrdf:elementValue xml:lang="en">Su,
>>> Huijie</madsrdf:elementValue> </madsrdf:FullNameElement>
>>> </madsrdf:elementList> </madsrdf:PersonalName> </madsrdf:hasVariant>
>>> <madsrdf:hasVariant> <madsrdf:PersonalName> <rdf:type
>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>> <madsrdf:variantLabel>Su, Hui-Chieh</madsrdf:variantLabel>
>>> <madsrdf:elementList rdf:parseType="Collection">
>>> <madsrdf:FullNameElement> <madsrdf:elementValue xml:lang="en">Su,
>>> Hui-Chieh</madsrdf:elementValue> </madsrdf:FullNameElement>
>>> </madsrdf:elementList> </madsrdf:PersonalName> </madsrdf:hasVariant>
>>> <madsrdf:hasSource> <madsrdf:Source>
>>> <madsrdf:citation-source>論國會議員產生方式之規範及其憲法界限,
>>> 2003:</madsrdf:citation-source>
>>> <madsrdf:citation-note
>>> xml:lang="en">書名頁(國立臺灣大學法律學硏究所碩士)</madsrdf:citation-note>
>>>
>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>> </madsrdf:Source>
>>> </madsrdf:hasSource>
>>> <madsrdf:hasSource>
>>> <madsrdf:Source>
>>> <madsrdf:citation-source>國立臺灣大學法律學系網頁, 檢索日期:
>>> 2020/11/25</madsrdf:citation-source>
>>> <madsrdf:citation-note xml:lang="en">(女; Hui-chieh
>>> Su)</madsrdf:citation-note>
>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>> </madsrdf:Source>
>>> </madsrdf:hasSource>
>>> <madsrdf:hasSource>
>>> <madsrdf:Source>
>>> <madsrdf:citation-source>NTU Scholar(臺大學術典藏)網頁, 檢索日期:
>>> 2020/11/25</madsrdf:citation-source>
>>> <madsrdf:citation-note xml:lang="en">(HUI-CHIEH
>>> SU)</madsrdf:citation-note>
>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>> </madsrdf:Source>
>>> </madsrdf:hasSource>
>>> <madsrdf:editorialNote>
>>> 臺大教師權威紀錄, 英文權威名稱係以NTU Scholar(臺大學術典藏)網頁著錄(Su,
>>> Hui-Chieh)
>>> </madsrdf:editorialNote>
>>> <madsrdf:note>女; 研究領域: 國家學, 憲法理論, 基本權理論, 言論自由,
>>> 轉型正義</madsrdf:note>
>>> <identifiers:lccn/>
>>> <identifiers:id>(TW-TaNTU)981038686683804786</identifiers:id>
>>> <madsrdf:adminMetadata>
>>> <ri:RecordInfo>
>>> <ri:recordChangeDate
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-11-25T00:00:00</ri:recordChangeDate>
>>> <ri:recordStatus
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">new</ri:recordStatus>
>>> <ri:recordContentSource
>>> rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>>> <ri:languageOfCataloging
>>> rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>>> </ri:RecordInfo>
>>> </madsrdf:adminMetadata>
>>> </madsrdf:PersonalName>
>>> </rdf:RDF>
>>> ---------------------------------------------------------------------
>>>
>>>
>>> One of the config files looks like:
>>> ---------------------------------------------------------------------
>>>
>>> <#entMap> a text:EntityMap ;
>>> text:defaultField "authoritativeLabel" ;
>>> text:entityField "uri" ;
>>> text:uidField "uid" ;
>>> text:langField "lang" ;
>>> text:graphField "graph" ;
>>> text:map (
>>> [
>>> text:field "authoritativeLabel" ;
>>> text:predicate madsrdf:authoritativeLabel ;
>>> ]
>>> [
>>> text:field "variantLabel" ;
>>> text:predicate madsrdf:variantLabel ;
>>> ]
>>> [
>>> text:field "citation-note" ;
>>> text:predicate madsrdf:citation-note ;
>>> ]
>>> [
>>> text:field "citation-source" ;
>>> text:predicate madsrdf:citation-source ;
>>> ]
>>> [
>>> text:field "topic" ;
>>> text:predicate madsrdf:elementList ;
>>> ]
>>> ) .
>>>
>>> ---------------------------------------------------------------------
>>>
>>>
>>> Thank you for reading this post.
>>>
>>> Best Regards,
>>> Huiling Lee
>>>