Hi to everybody

I am still stuck with the following problem, unfortunately blocking for me.

I get the error

Virtuoso 37000 Error SP031: SPARQL: Internal error: The length of generated
SQL text has exceeded 10000 lines of code

executing the SPARQL query

prefix pest:<http://example.com/resource/>
select * from <http://example.com/resource>
where {
  ?mainProdId    pest:productCode    "0120010" .
  ?prodSynonymId pest:productCode    ?productCode .
  ?prodSynonymId pest:hasProduct     ?mainProdId .
  ?prodSynonymId pest:languageCode   'EN' .
  ?prodSynonymId pest:scientificName ?otherProdScientificName.
  ?prodSynonymId pest:synonymName    ?otherProductName.
  ?prodSynonymId pest:codeNumber     ?otherProductCode .
}


on the virtual graph obtained with the following R2RML mapping

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix pest: <http://example.com/resource/> .
@prefix aowl: <http://bblfish.net/work/atom-owl/2006-06-06/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
<#TriplesMapV_SEM_PRODUCT_SYN_MOD1> a rr:TriplesMap;
    rr:logicalTable [
            rr:tableSchema "DB" ;
            rr:tableOwner "TEST" ;
            rr:tableName "V_SEM_PRODUCT_SYN_MOD1"
    ];
    rr:subjectMap [
        rr:termType rr:IRI  ;
        rr:template "http://example.com/resource/productSynonym-{ID}";;
        rr:class pest:ProductSynonym;
        rr:graph <http://example.com/resource>
    ];
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:hasProduct
        ] ;
        rr:objectMap [
            rr:termType rr:IRI ;
            rr:template "http://example.com/resource/product-{PRODUCT_CODE}";

        ];
    ];
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:productCode ] ;
        rr:objectMap [
            rr:column "PRODUCT_CODE"
        ];
    ] ;
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:codeNumber ] ;
        rr:objectMap [
            rr:column "CODE_NUMBER"
        ];
    ] ;
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:synonymId
        ] ;
        rr:objectMap [
            rr:column "SYNONYM_ID"
        ];
    ] ;
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:productSynonymId
        ] ;
        rr:objectMap [
            rr:column "ID"
        ];
    ] ;
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:synonymName
        ] ;
        rr:objectMap [
            rr:column "SYNONYM_NAME"
        ];
    ] ;
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:languageCode
        ] ;
        rr:objectMap [
            rr:column "LANGUAGE_CD"
        ];
    ] ;
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:scientificName
        ] ;
        rr:objectMap [
            rr:column "SCIENTIFIC_NAME"
        ];
    ] .
<#TriplesMapV_SEM_PRODUCTS1> a rr:TriplesMap;
    rr:logicalTable [
            rr:tableSchema "DB" ;
            rr:tableOwner "TEST" ;
            rr:tableName "V_SEM_PRODUCTS1"
    ];
    rr:subjectMap [
        rr:termType rr:IRI  ;
        rr:template "http://example.com/resource/product-{PRODUCT_CODE}";;
        rr:class pest:Product;
        rr:graph <http://example.com/resource>
    ];
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:productCode
        ] ;
        rr:objectMap [
            rr:column "PRODUCT_CODE"
        ];
    ] ;
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:hasParentProduct
        ] ;
        rr:objectMap [
            rr:termType rr:IRI ;
            rr:template "
http://example.com/resource/product-{PRODUCT_PARENT_CODE}";
        ];
    ];
    rr:predicateObjectMap [
        rr:predicateMap [
            rr:constant pest:productA3B
        ] ;
        rr:objectMap [
            rr:column "A3B"
        ];
    ] .

The mapping references the following 2 tables.

CREATE TABLE DB.TEST.V_SEM_PRODUCTS1   -- DICTIONARY
(
  PRODUCT_CODE VARCHAR(7) NOT NULL,
  TYPE_ID NUMERIC NOT NULL,
  PRODUCT_PARENT_CODE VARCHAR(7),   -- no FK constraint set, because it
would prevent the CSV bulk load.
  A3B NUMERIC NOT NULL,
  CONSTRAINT PK_V_SEM_PRODUCT1 PRIMARY KEY (PRODUCT_CODE)
);
CREATE TABLE DB.TEST.V_SEM_PRODUCT_SYN_MOD1   -- DICTIONARY
(
  ID INTEGER IDENTITY NOT NULL,
  SYNONYM_ID NUMERIC NOT NULL,
  LANGUAGE_CD VARCHAR(2) NOT NULL,
  PRODUCT_CODE VARCHAR(7) NOT NULL,
  CODE_NUMBER VARCHAR(20) NOT NULL,
  SYNONYM_NAME NVARCHAR(2000) NOT NULL,
  SCIENTIFIC_NAME NVARCHAR(2000),
  CONSTRAINT PK_V_SEM_PRODUCT_SYN_MOD1
    PRIMARY KEY (ID) -- PRODUCT_CODE, SYNONYM_NAME not needeed to compose
the PK
);
ALTER TABLE DB.TEST.V_SEM_PRODUCTS1
  ADD CONSTRAINT UK_V_SEM_PRODUCTS1 UNIQUE (PRODUCT_CODE);
ALTER TABLE  DB.TEST.V_SEM_PRODUCT_SYN_MOD1   -- DICTIONARY
  ADD CONSTRAINT FK_V_SEM_PRODUCT_SYN_MOD1 FOREIGN KEY (PRODUCT_CODE)
  REFERENCES DB.TEST.V_SEM_PRODUCTS1(PRODUCT_CODE);


@Ivan you suggested that the error might be caused "*by many RDF Views that
deal with same predicates*". If the predicates are the predicate mapping
that means having many of them pointing to the same table/column. That does
not seem to me the case.

You alternatively suggested that *"the trouble may happen if there is only
one RDF View, but its top-level quad map is not exclusive"*. I suppose that
the mentioned quad map corresponds to the triplesmap contained in the R2RML
mapping, but I do not understand what a not-exclusive quad map would be.

The RDB I am using in the example above is elementary and the corresponding
mapping is quite simple. Yet there is something that makes the SPARQL
engine not behaving in the expected way.



Any suggestion or help from Ivan or anybody else would be greatly
appreciated!

Thanks!

Cheers

Beppe




2017-05-11 16:53 GMT+01:00 Ivan Mikhailov <imikhai...@openlinksw.com>:

> Hello Beppe,
>
> The query seems to be so straightforward that the only reason I can
> imagine is that there are many RDF Views that deals with same
> predicates, so every triple pattern become union of subqueries, one per
> view, and the join of unions become a nightmare when expanded to union
> of joins. Say, given two copies of one RDF View, as a result of
> configuration mistake, the query first becomes join of 10 unions, 2
> cases per union, and then union of 1024 joins, 10 members per join.
>
> Altermatively, the trouble may happen if there is only one RDF View, but
> its top-level quad map is not exclusive. In that case every triple
> pattern is a union of data from virtual graph and data from physical
> triples, the overall result is again a union of 1024 joins.
>
> The optimizer might recognize the "exponential explosion" of number of
> variants and get something reasonable like union of only 32 joins with 4
> union and 6 plain subqueries per join, but there's no warranty that it's
> able to find the right threshold to stop.
>
> To provide better hints, I need to get the texts of all "create quad
> storage" / "alter quad storage" / "create IRI class" etc. statements
> ever used to configure the storage. If they're not preserved, I need the
> exact output of sparql statement
>
> define input:storage "" define output:format "TTL"
> construct { ?s ?p ?o } from virtrdf: where { ?s ?p ?o }
>
> The output is probably 200Kb or more so it's practical to send directly
> to me, not via mail list. I'm also available at Skype as iv_an_ru.
>
> Best Regards,
>
> Ivan Mikhailov
> OpenLink Software
> http://virtuoso.openlinksw.com
>
>
> On Wed, 2017-05-03 at 18:15 +0100, Beppe Mazzola wrote:
> > Hi to everybody
> >
> > I am trying to query the content of a virtual graph, obtained in
> > Virtuoso Open Source via the R2RML mapping.
> >
> > Executing (relatively) simple SPARQL statements I get run-time errors
> > and I do not understand why.
> >
> >
> > Example 1)
> > prefix pest:<http://example.com/dataset/pesticides/resource/>
> > select ?groupCode ?groupName ?productCode ?productName
> > from <http://example.com/dataset/pesticides/resource>
> > where
> > {
> >   ?treeNutsNameId pest:productCode       "0120000" .
> >   ?treeNutsNameId pest:productCode       ?groupCode .
> >   ?treeNutsNameId pest:productName       ?groupName .
> >   ?treeNutsNameId pest:languageCode      "EN" .
> >   ?treeNutsNameId pest:hasProduct        ?treeNutsId .
> >   ?nutId          pest:hasParentProduct  ?treeNutsId .
> >   ?nutId          pest:productCode       ?productCode.
> >   ?nutNameId      pest:productCode       ?productCode.
> >   ?nutNameId      pest:languageCode      "EN" .
> >   ?nutNameId      pest:productName       ?productName .
> > }
> >
> >
> >
> > For who has the patience to read here is the description of the
> > query.
> >
> > It is about a hierarchy of products in which some products can be
> > child (having the attribute hasParentProduct not null) of another
> > product that represents a group,. There is as well a productName class
> > related to a product (referring to this with the attribute hasProduct)
> > that provides the product name in many languages (attributes
> > languageCode,productName).
> >
> > treeNutsId is the IRI of the product group
> > nutId is the IRI of a product belonging to the group
> > treeNutsNameId is the IRI of the product name of the group
> > nutNameId is the IRI of the product name of a group member
> >
> >
> > So the meaning of the whole statement is "give me the list of the
> > products belonging to the group 0120000"
> >
> >
> >
> > Executing the query I get the following error.
> > Virtuoso 37000 Error SP031: SPARQL compiler: Internal error: The SPARQL
> optimizer has failed to process the query with reasonable quality. The
> resulting SQL query is abnormally long. Please paraphrase the SPARQL query.
> > I have to say that the query has worked fine for a while. Then I have
> > modified the DB to host UTF-8 text and loaded UTF-8 text and the error
> > has come up.
> >
> >
> >
> > Example 2
> >
> >
> > prefix
> > pest:<http://ec.europa.eu/semantic_webgate/dataset/pesticides/resource/>
> > select ?productCode ?otherProductCode ?otherProductName
> ?otherProdScientificName
> > from
> > <http://ec.europa.eu/semantic_webgate/dataset/pesticides/resource>
> > where
> > {
> >   ?prodSynonymId pest:scientificName ?otherProdScientificName.
> >   ?prodSynonymId pest:synonymName    ?otherProductName.
> >   ?prodSynonymId pest:codeNumber     ?otherProductCode .
> >   ?mainProdId    pest:productCode    "120010" .
> >   ?prodSynonymId pest:productCode    ?productCode .
> >   ?prodSynonymId pest:hasProduct     ?mainProdId .
> >   ?prodSynonymId pest:languageCode   'EN' .
> > }
> >
> >
> >
> > I spare you any explanation. Just a note: the attributes
> > scientificName and synonymName
> > have the corresponding DB column declared as NVARCHAR(2000) and
> > contain Unicode
> >
> >
> > I get the following error.
> > Virtuoso 37000 Error SP031: SPARQL: Internal error: The length of
> generated SQL text has exceeded 10000 lines of code
> > If I comment out the line with  scientificName the error disappears.
> >
> > I am not a SPARQL expert, even less a Virtuoso expert, so I do not
> > understand if and where I do something wrong.
> >
> >
> > Any chance to "squeeze" the generated SQL statement?
> >
> >
> > Thanks a lot in advance for any hint or remark!
> >
> >
> > Cheers
> >
> >
> > Beppe
>
>
>
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users

Reply via email to