Caleb Meier created RYA-408:
-------------------------------

             Summary: PCJ Updater Does Not Support Queries with Direct Products
                 Key: RYA-408
                 URL: https://issues.apache.org/jira/browse/RYA-408
             Project: Rya
          Issue Type: Bug
          Components: clients
    Affects Versions: 3.2.12
            Reporter: Caleb Meier


A number of optimizations were made to the Rya PCJ Updater to support
sharding.  Among these optimizations was sharding the binding set results to
distribute the load among the tablet servers.  The changes that were made to
shard the rows prevent the JoinResultUpdater from creating joins that are the
result of direct products.  This is a direct consequence of how new rows are
written in the Fluo table.  For example, statement patterns used to be written
in the form "SP_123/BS_Val1:BS_Val2", but are now written as
"SP:HASH(BS_Val1):123/BS_Val1:BS_Val2", where HASH(BS_Val1) is the hash of the
first binding set value.  After sharding, it is impossible to do a targeted
range lookup on values corresponding to SP_123 without the first binding set
value (because the hash precedes the id).  So if the JoinResultUpdater
attempts to join a new StatementPattern result with the results of another
StatementPattern and there are no common variables (and therefore no first
binding value to hash), then the updater will not locate any results.
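
The effect of the two row layouts on range lookups can be sketched as follows.
This is only an illustration under stated assumptions: the class and method
names are hypothetical, a TreeMap stands in for the sorted Fluo table, and
String.hashCode() stands in for whatever hash the updater actually uses.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from the Rya code base.
class ShardedRowSketch {

    // Old layout: the node id leads the row, so "SP_123/" is a usable prefix.
    static String legacyRow(String nodeId, List<String> values) {
        return "SP_" + nodeId + "/" + String.join(":", values);
    }

    // New layout: the hash of the first binding value precedes the node id.
    static String shardedRow(String nodeId, List<String> values) {
        return "SP:" + hash(values.get(0)) + ":" + nodeId + "/"
                + String.join(":", values);
    }

    // Stand-in hash (assumption); the real updater may hash differently.
    static String hash(String value) {
        return Integer.toHexString(value.hashCode());
    }

    // Prefix scan over a sorted map, mimicking a range lookup on row keys.
    static SortedMap<String, String> scanPrefix(TreeMap<String, String> table,
                                                String prefix) {
        return table.subMap(prefix, prefix + Character.MAX_VALUE);
    }
}
```

With rows written by shardedRow, a scan for node 123 needs the prefix
"SP:" + hash(firstValue) + ":123/".  When the join has no common variable
there is no firstValue to hash, so no targeted prefix can be built, and only
the broad "SP:" prefix covers every row for the node.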

This issue can be resolved by issuing a more general scan on the "SP" prefix
and then filtering the results on the StatementPattern nodeId.  This is not a
very performant approach, but may be the only way to resolve the issue.  Given
the large amount of data that is already stored in the Fluo table, there is
some question about whether we should support direct products in Fluo queries
at all.  Another approach is to simply attempt to optimize queries to avoid
direct products when they are registered (this should be done anyway), and if
there is no arrangement that avoids direct products, then throw an exception.
Queries that have unavoidable direct products should not be allowed to be
registered in Fluo.
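
Both options can be sketched roughly as below.  This is a minimal
illustration, not the updater's actual code: the class and method names are
hypothetical, a TreeMap stands in for the Fluo table, and the row layout
"SP:<hash>:<nodeId>/<values>" is assumed to have no ':' or '/' inside the
hash or the node id.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from the Rya code base.
class DirectProductSketch {

    // A join is a direct product when its two sides share no variables.
    static boolean isDirectProduct(Set<String> leftVars, Set<String> rightVars) {
        return Collections.disjoint(leftVars, rightVars);
    }

    // Option 1: scan everything under "SP:" and keep rows for one node id.
    // Every statement pattern row is read, which is why this is slow.
    static List<String> scanAndFilter(TreeMap<String, String> table,
                                      String nodeId) {
        List<String> rows = new ArrayList<>();
        for (String row : table.subMap("SP:", "SP:" + Character.MAX_VALUE).keySet()) {
            int slash = row.indexOf('/');
            if (slash < 0) {
                continue; // not in the assumed layout
            }
            // Header is assumed to be SP:<hash>:<nodeId>
            String[] head = row.substring(0, slash).split(":");
            if (head.length == 3 && head[2].equals(nodeId)) {
                rows.add(row);
            }
        }
        return rows;
    }

    // Option 2: reject the join at registration time instead.
    static void checkJoin(Set<String> leftVars, Set<String> rightVars) {
        if (isDirectProduct(leftVars, rightVars)) {
            throw new IllegalArgumentException(
                    "Join has no common variables (direct product)");
        }
    }
}
```

A registration-time optimizer would call something like checkJoin only after
trying the other join orderings, so that the exception is raised just for
queries whose direct products are unavoidable.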



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
