[ 
https://issues.apache.org/jira/browse/MARMOTTA-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617317#comment-13617317
 ] 

Sebastian Schaffert commented on MARMOTTA-175:
----------------------------------------------

Hi Raffaele, 

thanks for the tests. I agree that left joins are usually very efficient, but 
unfortunately not in combination with a union subselect: in that case the 
database first has to create a temporary result table to perform the join on 
(at least this is what I read when googling a bit on the MySQL issue). Maybe it 
is possible to reformulate the delete statement so that the left join 
does not involve a subselect?
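
As a rough sketch of what such a reformulation might look like, the snippet 
below compares a left join against a union subselect with one left join per 
referencing table. It runs on an in-memory SQLite database, not on MySQL or 
Postgres, so it only illustrates that the two forms select the same rows and 
says nothing about performance. The `deleted` column, the `id` column and the 
`other_refs` table are assumptions made up for the example; only 
`reasoner_just_supp_triples` and `triple_id` appear in the error message.

```python
import sqlite3

# Hypothetical, heavily simplified schema (assumption, not the real Marmotta schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE triples (id INTEGER PRIMARY KEY, deleted INTEGER NOT NULL);
    CREATE TABLE reasoner_just_supp_triples (triple_id INTEGER);
    CREATE TABLE other_refs (triple_id INTEGER);  -- made-up second referencing table
    INSERT INTO triples VALUES (1, 1), (2, 1), (3, 1), (4, 0);
    INSERT INTO reasoner_just_supp_triples VALUES (1);
    INSERT INTO other_refs VALUES (2);
""")

# Variant A: left join against a UNION subselect (the form discussed above).
union_join = """
    SELECT t.id FROM triples t
    LEFT JOIN (SELECT triple_id FROM reasoner_just_supp_triples
               UNION
               SELECT triple_id FROM other_refs) r ON r.triple_id = t.id
    WHERE t.deleted = 1 AND r.triple_id IS NULL
    ORDER BY t.id
"""

# Variant B: one left join per referencing table, no subselect.
chained_joins = """
    SELECT DISTINCT t.id FROM triples t
    LEFT JOIN reasoner_just_supp_triples j ON j.triple_id = t.id
    LEFT JOIN other_refs o ON o.triple_id = t.id
    WHERE t.deleted = 1 AND j.triple_id IS NULL AND o.triple_id IS NULL
    ORDER BY t.id
"""

# Both variants should yield the ids of deleted triples nobody references.
ids_union = [row[0] for row in conn.execute(union_join)]
ids_chained = [row[0] for row in conn.execute(chained_joins)]
print(ids_union, ids_chained)
```

Whether variant B is actually faster would still have to be measured on the 
real databases with a realistic dataset.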

Your tests have interesting results. I would have expected NOT IN to be the 
slowest operation, because it would need to list all ids from the different 
tables and, for big results, even do a complete scan. OTOH maybe the databases 
can optimize this kind of operation. It would be interesting to see the results 
with a bigger dataset, because in this example the indexes easily fit in main 
memory. I'll try with our GeoNames import on Postgres (140 million triples).

For now I solved the issue as described, but this is definitely worth 
investigating more. So I'll reopen the issue to keep it in mind.
                
> Garbage collection on triple tables
> -----------------------------------
>
>                 Key: MARMOTTA-175
>                 URL: https://issues.apache.org/jira/browse/MARMOTTA-175
>             Project: Marmotta
>          Issue Type: Bug
>          Components: Triple Store
>    Affects Versions: 3.1-incubating
>         Environment: Centos 6.4 64b - JDK 1.6.0_38 - MySql 5.1.67 -  Tomcat 
> 7.0.37
>            Reporter: Raffaele Palmieri
>            Assignee: Sebastian Schaffert
>            Priority: Minor
>              Labels: garbage, mysql, triplestore
>
> During garbage collection of the triple tables, the log contains the following line:
> SQL error while executing garbage collection on triples table: You have an 
> error in your SQL syntax; check the manual that corresponds to your MySQL 
> server version for the right syntax to use near 'UNION (SELECT triple_id FROM 
> reasoner_just_supp_triples WHERE triple_id = triple' at line 1

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
