Re: pattern matching and extraction function on strings in syntaxARQ

Lorenz B. Wed, 24 Apr 2019 00:59:01 -0700

> BIND (REPLACE(STR(?s),"[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+","$0") AS ?email)


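This replaces the matched email address with the email address itself, so
the result is the same string as before.

You need to replace everything else with the email address. REPLACE is not
an "extract" function, so the pattern has to match the whole string and keep
only the captured group. You can try: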

BIND
(REPLACE(STR(?url),"[a-zA-Z0-9/:._-]+/([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+)/[a-zA-Z0-9/._-]+","$1")
AS ?email)

Note: I assume that the email address is delimited by a "/" character on
both sides.
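
To make the point easy to check, here is a minimal, self-contained sketch of
the same idea. The sample IRI and the address user@example.org are made up
for illustration, since the real address is redacted in the data quoted
below. The anchored "^.*/" and "/.*$" are a simplification I am assuming is
acceptable for your URLs. REPLACE returns its input unchanged when the
pattern does not match, so the sketch also adds a REGEX filter to keep such
rows out of the result:

```sparql
SELECT ?email
WHERE {
  # Hypothetical sample IRI; substitute your own triple pattern here.
  VALUES ?url { <http://example.org/imgtag/rdf/user@example.org/12345_o> }

  # Keep only URLs that contain a /address/ path segment, because
  # REPLACE hands back the original string when nothing matches.
  FILTER (REGEX(STR(?url), "/[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+/"))

  # The pattern covers the whole string, so only "$1" is left over.
  BIND (REPLACE(STR(?url),
                "^.*/([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+)/.*$",
                "$1") AS ?email)
}
```

For the sample IRI this should bind ?email to "user@example.org". If your
addresses can also appear at the very end of the URL, the trailing "/" in
both patterns would have to be made optional.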


> very good Richard, thank you. I was working along these lines with the 
> following
>
> BIND (REPLACE(STR(?url),"[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+","$0") AS ?email)
>
> where ?url contains the match but binds the entire string again to ?email
>
> eg data:
>
> url = 
> http://www.imagesnippets.com/imgtag/rdf/[email protected]/1598550_10204479279247862_1280347905880818932_o
>
> query
>
> SELECT ?email
> WHERE  {
> ?s ?p ?o
> BIND (REPLACE(STR(?s),"[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+","$0") AS ?email)
> }
>
> On Tue, Apr 23, 2019 at 6:00 PM Richard Cyganiak <[email protected]> wrote:
>> Hi Marco,
>>
>>> On 23 Apr 2019, at 15:53, Marco Neumann <[email protected]> wrote:
>>>
>>> I think I'm familiar with functions on strings in SPARQL but as far as
>>> I can see there is nothing similar to a grep like pattern matching and
>>> extraction on strings for SPARQL. Or is there one?
>> The replace function does pattern matching and allows extraction of matched 
>> sub-patterns:
>> https://www.w3.org/TR/sparql11-query/#func-replace 
>> https://www.w3.org/TR/xpath-functions/#func-replace 
>>
>>     replace(input, pattern, replacement)
>>
>> The special “variables” $1, $2, $3, and so on can be used in the replacement 
>> string. They refer to parts of the input that were matched by the first, 
>> second, third, and so on pair of parentheses in the regex pattern. For 
>> example:
>>
>>     replace("23 April 2019", "^([0-9][0-9]).*$", "$1")
>>
>> would return "23" because the pattern matches the whole input, which is then 
>> replaced by the part matched by the first (and only) pair of parentheses.
>>
>> Also useful might be Jena’s own apf:strSplit property function:
>> https://jena.apache.org/documentation/query/library-propfunc.html 
>>
>> It can split a literal into multiple literals based on a regular expression.
>>
>> Taken together, these two functions can do a wide range of pattern matching 
>> and extraction tasks.
>>
>> Hope that helps,
>> Richard
>
>
-- 
Lorenz Bühmann
AKSW group, University of Leipzig
Group: http://aksw.org - semantic web research center
