Re: Inefficient Query

Nikolaos Karalis Wed, 10 May 2017 05:46:33 -0700

Dear Andy,
Thank you for replying to my email. I forked the jena repository and added
my changes (https://github.com/nkaralis/jena).
I created three files in the directory layout1:
        FormatterSimpleHive.java, which has the functions needed to create
the tables triples and prefixes
        StoreSimpleHive.java, which creates a layout1/hive store
        TupleLoaderSimpleHive.java, which overrides the function load() in
order to load multiple rows at once. This is a temporary solution.
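
To illustrate the idea of loading multiple rows at once, here is a minimal,
self-contained sketch that builds a single multi-row INSERT statement. It is
not the code from the fork and does not use the jena-sdb TupleLoader API: the
class MultiRowInsertSketch, its method names, and the three-column row shape
are assumptions made for illustration, and the backslash escaping is assumed
to match Hive string literals.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT part of jena-sdb: builds one multi-row INSERT
// for a table such as "triples" instead of issuing one INSERT per row.
final class MultiRowInsertSketch {

    // Assumption: string literals are escaped with backslashes (Hive style).
    static String quote(String value) {
        String escaped = value.replace("\\", "\\\\").replace("'", "\\'");
        return "'" + escaped + "'";
    }

    // Each String[] is one row; every element becomes a quoted literal.
    static String buildInsert(String table, List<String[]> rows) {
        if (rows.isEmpty()) {
            throw new IllegalArgumentException("no rows to load");
        }
        StringBuilder sql = new StringBuilder("INSERT INTO TABLE ")
                .append(table)
                .append(" VALUES ");
        for (int i = 0; i < rows.size(); i++) {
            if (i > 0) {
                sql.append(", ");
            }
            String[] row = rows.get(i);
            sql.append('(');
            for (int j = 0; j < row.length; j++) {
                if (j > 0) {
                    sql.append(", ");
                }
                sql.append(quote(row[j]));
            }
            sql.append(')');
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"s1", "p1", "o1"},
                new String[] {"s2", "p2", "o2"});
        System.out.println(buildInsert("triples", rows));
    }
}
```

Sending one statement per batch rather than per row reduces the number of
round trips to the server, which is the motivation for overriding load().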


I also made some changes to the following files:
        /store/StoreFactory.java
        /store/DatabaseType.java
        /util/StoreUtils.java
        /sql/JDBC.java
        /compiler/SDBCompile.java
in order to support the hive database.

This is the link to the project with the user-defined spatial operations:
https://github.com/nkaralis/jenaspatial
I also wanted to ask whether binary operators used in the FILTER clause of
a query, such as equal (=), not equal (!=), etc., could be pushed down to
the underlying database (instead of fetching the data from the data store
and then evaluating the filter condition).
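
To make the question concrete, here is a minimal sketch of what pushing a
binary comparison down could look like: the operator is mapped to a SQL
condition when possible, and an empty result means the filter would still
have to be evaluated after fetching. The class FilterPushdownSketch and its
methods are hypothetical and do not describe how SDB compiles filters. Note
that SPARQL "=" compares values, so a plain string comparison in SQL is only
equivalent for terms whose lexical form determines equality (for example
IRIs); the sketch assumes that case.

```java
import java.util.Optional;

// Hypothetical illustration, NOT the SDB compiler: maps a simple
// "column op constant" filter to a SQL condition where possible.
final class FilterPushdownSketch {

    // Assumption: string literals are escaped with backslashes (Hive style).
    static String quote(String value) {
        String escaped = value.replace("\\", "\\\\").replace("'", "\\'");
        return "'" + escaped + "'";
    }

    // Returns the SQL condition, or empty if the operator is not supported
    // and the filter must be evaluated on the fetched data instead.
    static Optional<String> toSqlCondition(String column, String sparqlOp, String constant) {
        String sqlOp;
        switch (sparqlOp) {
            case "=":
                sqlOp = "=";
                break;
            case "!=":
                sqlOp = "<>";
                break;
            default:
                return Optional.empty();
        }
        return Optional.of(column + " " + sqlOp + " " + quote(constant));
    }

    public static void main(String[] args) {
        // Supported operator: the condition can be appended to a WHERE clause.
        System.out.println(toSqlCondition("t.o", "!=", "http://example.org/a"));
        // Unsupported operator: falls back to evaluation after fetching.
        System.out.println(toSqlCondition("t.o", "regex", "abc"));
    }
}
```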

Best regards,
Nikolaos Karalis

> Hi Nikolaos,
>
> The query pattern generator isn't very sophisticated and is skewed towards
> execution where the data is "close" (i.e. there is a cache or local
> database).
>
> Normally, SDB would send a single SQL query for the two triple patterns
> and have the SQL database engine worry about how best to do this.
>
> But in the log it seems that this isn't happening:
>
> either the query is going through some additional layers, which means the
> SDB execution engine isn't getting the whole pattern, or the Hive
> adapter only works on a per Graph.find basis.
>
> Do you have a link to your extended jena-sdb?
>
>      Andy
>
>
> On 09/05/17 11:20, Nikolaos Karalis wrote:
>> Dear Jena developers,
>>
>> I have extended jena-sdb in order to support Hive Database and also
>> started implementing some user-defined GeoSPARQL functions using
>> jena-arq.
>> I ran the following query:
>>
>>      PREFIX geof: <http://example.org/function#>
>>      SELECT ?s1 ?s2
>>      WHERE {
>>          ?s1 <http://linkedgeodata.org/ontology/asWKT> ?o1 .
>>          ?s2 <http://geo.linkedopendata.gr/gag/ontology/asWKT> ?o2 .
>>          FILTER(geof:sfWithin(?o1, ?o2)) .
>>      }
>>
>> and observed that for each iteration of the ResultSet, for each result
>> for ?s1, ?s2 is computed from scratch. I've attached the logs of the
>> hiveserver2 as well.
>> Is there a way to make this query more efficient?
>>
>> Best regards,
>> Nikolaos Karalis
>>
>
