I use the following methods to create the connection and the model:
myStore = SDBFactory.connectStore(assemblerFile);
model = SDBFactory.connectDefaultModel(myStore);
myStore.getTableFormatter().create();
model.read(data1);
model.read(data2);
Nikolaos Karalis
On 10/5/2017 18:29, Andy Seaborne wrote:
For some reason, only
I'd expect the SQL query to be (slightly reformatted output of sdbprint)
SELECT                                   -- V_4 = ?o2
T_1.s AS V_1, T_1.o AS V_2,
T_2.s AS V_3, T_2.o AS V_4
FROM
-- ?s1 <http://linkedgeodata.org/ontology/asWKT> ?o1
Triples AS T_1
INNER JOIN
-- ?s2 <http://geo.linkedopendata.gr/gag/ontology/asWKT> ?o2
Triples AS T_2
ON ( T_1.p = '<http://linkedgeodata.org/ontology/asWKT>'
AND T_2.p = '<http://geo.linkedopendata.gr/gag/ontology/asWKT>'
)
Note the INNER JOIN.
but that can't be used if the query is coming through graph.find, where
only a single triple pattern at a time is presented for SQL
translation.
This is what it would look like
SELECT
  T_1.s AS V_2
FROM
  -- ?s1 <http://linkedgeodata.org/ontology/asWKT> ?o1
  Triples AS T_1
WHERE ( T_1.p = '<http://linkedgeodata.org/ontology/asWKT>'
)
which is what the log shows.
So is a graph put into a general dataset or is this via
DatasetGraphSDB (which triggers the more general processing)?
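To make the distinction above concrete, here is a minimal sketch of the two ways of wiring a dataset over an SDB store, assuming the standard jena-sdb API (the assembler file name and query string are placeholders):

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.sdb.SDBFactory;
import org.apache.jena.sdb.Store;

public class DatasetWiring {
    public static void main(String[] args) {
        Store store = SDBFactory.connectStore("sdb.ttl");

        // 1) SDB-backed dataset: execution goes through DatasetGraphSDB,
        //    so the whole basic graph pattern reaches the SDB SQL compiler
        //    and both triple patterns become one SQL query (the INNER JOIN
        //    form shown above).
        Dataset sdbDataset = SDBFactory.connectDataset(store);

        // 2) General dataset wrapping the store's model: ARQ executes the
        //    pattern itself and calls graph.find() once per triple pattern,
        //    so SDB only ever sees single-pattern SQL queries.
        Model model = SDBFactory.connectDefaultModel(store);
        Dataset generalDataset = DatasetFactory.create(model);

        String q = "SELECT * WHERE { ?s ?p ?o }";
        try (QueryExecution qe = QueryExecutionFactory.create(q, sdbDataset)) {
            qe.execSelect();  // single SQL query against the store
        }
    }
}
```

If the Hive adapter is being reached via option 2, each triple pattern arrives separately, which would explain the per-pattern SQL in the log.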
Andy
On 10/05/17 13:45, Nikolaos Karalis wrote:
Dear Andy,
thank you for replying to my email. I forked the jena repository and
added
my changes (https://github.com/nkaralis/jena).
I created three files in the directory layout1:
- FormatterSimpleHive.java, which has the necessary functions to create
  the tables triples and prefixes
- StoreSimpleHive.java, which creates a layout1/hive store
- TupleLoaderSimpleHive.java, which overrides the function load() in
  order to load multiple rows at once. This is a temporary solution.
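The multi-row loading idea can be sketched with plain JDBC; this is only an illustration of the batching technique, not the actual TupleLoaderSimpleHive code (which is in the linked fork), and the table and column names are placeholders:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class BatchLoader {
    // Hive handles row-at-a-time INSERTs poorly, so packing many rows
    // into a single multi-VALUES INSERT cuts round trips dramatically.
    static void loadRows(Connection conn, List<String[]> rows) throws Exception {
        StringBuilder sql = new StringBuilder("INSERT INTO Triples (s, p, o) VALUES ");
        for (int i = 0; i < rows.size(); i++) {
            sql.append(i == 0 ? "(?, ?, ?)" : ", (?, ?, ?)");
        }
        try (PreparedStatement ps = conn.prepareStatement(sql.toString())) {
            int idx = 1;
            for (String[] row : rows) {
                for (String col : row) {
                    ps.setString(idx++, col);
                }
            }
            ps.executeUpdate();  // one statement for the whole batch
        }
    }
}
```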
I also made some changes to the following files:
/store/StoreFactory.java
/store/DatabaseType.java
/util/StoreUtils.java
/sql/JDBC.java
/compiler/SDBCompile.java
in order to support the hive database.
This is the link to the project with the user-defined spatial
operations:
https://github.com/nkaralis/jenaspatial
I also wanted to ask: could binary operators usable in the FILTER clause
of a query, such as equals (=) and not-equals (!=), be pushed down to
the underlying database, instead of fetching the data from the data
store and then evaluating the filter condition?
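For illustration only (this is a hypothetical hand-translation, not something the SDB compiler is confirmed to emit): a simple binary FILTER over two pattern variables could in principle be folded into the SQL WHERE clause, reusing the table aliases from the sdbprint output above:

```sql
-- Hypothetical push-down of FILTER(?o1 != ?o2) into the generated SQL.
SELECT
  T_1.s AS V_1, T_2.s AS V_3
FROM
  Triples AS T_1
  INNER JOIN
  Triples AS T_2
  ON ( T_1.p = '<http://linkedgeodata.org/ontology/asWKT>'
   AND T_2.p = '<http://geo.linkedopendata.gr/gag/ontology/asWKT>'
     )
WHERE T_1.o <> T_2.o    -- the pushed FILTER condition
```

This only works when the whole pattern (and the filter) reaches the SQL compiler in one piece, not when execution goes pattern-by-pattern through graph.find.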
Best regards,
Nikolaos Karalis
Hi Nikolaos,
The query pattern generator isn't very sophisticated and is skewed
towards execution where the data is "close" (i.e. there is a cache or
local database).
Normally, SDB would send a single SQL query for the two triple patterns
and have the SQL database engine worry about how best to do this.
But the log suggests this isn't happening: either the query is going
through additional layers, so the SDB execution engine isn't getting
the whole pattern, or the Hive adapter works only on a per-Graph.find
basis.
Do you have a link to your extended jena-sdb?
Andy
On 09/05/17 11:20, Nikolaos Karalis wrote:
Dear Jena developers,
I have extended jena-sdb to support the Hive database and have also
started implementing some user-defined GeoSPARQL functions using
jena-arq.
I ran the following query:
PREFIX geof: <http://example.org/function#>

SELECT ?s1 ?s2
WHERE {
  ?s1 <http://linkedgeodata.org/ontology/asWKT> ?o1 .
  ?s2 <http://geo.linkedopendata.gr/gag/ontology/asWKT> ?o2 .
  FILTER(geof:sfWithin(?o1, ?o2))
}
and observed that, as the ResultSet is iterated, ?s2 is recomputed from
scratch for every result of ?s1. I've attached the logs of the
hiveserver2 as well.
Is there a way to make this query more efficient?
Best regards,
Nikolaos Karalis