Yes, you should be able to add the following:

--set arq:optIndexJoinStrategy=false

I'm not 100% sure that the short form will work; you may need to use the
fully expanded form:

--set http://jena.hpl.hp.com/ARQ#optIndexJoinStrategy=false

However, as noted in my email, this is new in 2.10.2-SNAPSHOT builds, so
unless you are using the latest SNAPSHOTs this will have no effect. In all
previous releases this particular optimization was always on.

Rob

On 7/25/13 1:56 PM, "Diogo FC Patrao" <[email protected]> wrote:

>Hello
>
>>> The better plan for the query you posted would be (1), simply because of
>>> the cost of accessing a remote service. But, if the first SERVICEd query
>>> would return just a few lines, maybe it would be better to run the same
>>> query a couple of times as in (2) than to get all results.
>>
>> I agree. I started out with (2) because ARQ by default did that. However,
>> soon after, that wasn't going to work out and so I explored a way to do
>> (1). Now doing (1) but I'm trying to get more out of it. I have to take a
>> closer look at Rob Vesse's suggestion:
>> ARQ.getContext().set(ARQ.optIndexJoinStrategy, false);
>
>Yes; it is a great feature that we can turn on and off certain
>optimizations!
>
>Rob, can we turn that on and off by the ARQ command line?
>
>>> As for optimizing the query, I would try separating each query into a
>>> UNION, one part with the OPTIONAL, the other without it. Getting the
>>> subproperties, depending on which triplestore you're querying, can be
>>> expensive too. If it's Fuseki+TDB and you have access to the server
>>> configuration, you could turn on RDFS inference. Also, the order of the
>>> triples can greatly influence the overall query performance - put the
>>> triples that return fewer results before the others.
>>>
>>> Good luck!
>>
>> I'm not sure I see how UNION can be used as per your suggestion such that
>> the results contain values for each field. Only one of the variables in
>> OPTIONAL is used towards the final output. Duplicating the earlier pattern
>> plus what was in OPTIONAL is probably not ideal. Did I misunderstand you?
>
>Yes, but that was an idea based solely on my experience with RDB. Writing
>
>SELECT * FROM A WHERE type_id in (1,2)
>
>can be slower than
>
>SELECT * FROM A WHERE type_id = 1
>UNION ALL
>SELECT * FROM A WHERE type_id = 2
>
>, believe it or not. I never really worked with OPTIONALs so I'm guessing
>it out of thin air. But I think it's worth a shot.
>
>> I'll test it with only RDFS inference.
>
>The SPARQL will look better too.
>
>cheers!
>
>dfcp
>
>> Based on my tests, the order of the statements is as good as it gets.
>>
>> Thanks for the suggestions.
>>
>> -Sarven
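
For anyone wanting to try the relational comparison Diogo describes (IN
versus UNION ALL), here is a minimal sketch using Python's built-in sqlite3
module. The table contents are made up for illustration, and SQLite is only
a stand-in for whatever RDB he had in mind. It checks that the two forms
return the same rows; it says nothing about which one is faster on a given
engine, nor about how ARQ treats UNION and OPTIONAL.

```python
import sqlite3


def fetch_both_forms():
    """Run the IN form and the UNION ALL form against the same toy table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table A with a type_id column, as in the quoted SQL.
    conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, type_id INTEGER)")
    conn.executemany(
        "INSERT INTO A (id, type_id) VALUES (?, ?)",
        [(1, 1), (2, 2), (3, 3), (4, 1), (5, 2)],
    )

    in_rows = conn.execute(
        "SELECT * FROM A WHERE type_id IN (1, 2)"
    ).fetchall()

    union_rows = conn.execute(
        "SELECT * FROM A WHERE type_id = 1 "
        "UNION ALL "
        "SELECT * FROM A WHERE type_id = 2"
    ).fetchall()

    conn.close()
    # Row order is not guaranteed without ORDER BY, so sort before comparing.
    return sorted(in_rows), sorted(union_rows)


in_rows, union_rows = fetch_both_forms()
print("same rows:", in_rows == union_rows)
```

To compare cost rather than results, the same two statements could be
prefixed with EXPLAIN QUERY PLAN in SQLite, or timed on the real database.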
