[ https://issues.apache.org/jira/browse/MARMOTTA-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617317#comment-13617317 ]
Sebastian Schaffert commented on MARMOTTA-175:
----------------------------------------------

Hi Raffaele,

thanks for the tests. I agree that left joins are usually very efficient. But unfortunately not in combination with a union subselect, because then the databases first have to create a temporary result table to perform the join on (at least this is what I read when googling a bit on the MySQL issue). Maybe it is possible to reformulate the delete statement in a way that the left join does not involve a subselect?

Your tests have interesting results. I would have expected NOT IN to be the slowest operation, because it would need to list all ids from the different tables and for big results even do a complete scan. OTOH maybe the databases can optimize this kind of operation. Would be interesting to see with a bigger dataset, because in this example the indexes will easily fit in main memory. I'll try with our GeoNames import on Postgres (140 million triples).

For now I solved the issue as described, but this is definitely worth investigating more. So I'll reopen the issue to keep it in mind.

> Garbage collection on triple tables
> -----------------------------------
>
>                 Key: MARMOTTA-175
>                 URL: https://issues.apache.org/jira/browse/MARMOTTA-175
>             Project: Marmotta
>          Issue Type: Bug
>          Components: Triple Store
>    Affects Versions: 3.1-incubating
>         Environment: Centos 6.4 64b - JDK 1.6.0_38 - MySql 5.1.67 - Tomcat 7.0.37
>            Reporter: Raffaele Palmieri
>            Assignee: Sebastian Schaffert
>            Priority: Minor
>              Labels: garbage, mysql, triplestore
>
> During garbage collection of triple tables in log there is the following line:
> SQL error while executing garbage collection on triples table: You have an
> error in your SQL syntax; check the manual that corresponds to your MySQL
> server version for the right syntax to use near 'UNION (SELECT triple_id FROM
> reasoner_just_supp_triples WHERE triple_id = triple' at line 1

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira