Andy Seaborne wrote:
> Be good to split out LARQ(++ for Solr etc etc) from ARQ.
Yep.
For others' benefit, these are just proofs of concept:
SARQ is the same as LARQ, but using Solr:
https://github.com/castagna/SARQ
EARQ is the same as LARQ, but using ElasticSearch:
https://github.com/castagna/EARQ
... also, I've tried to redesign the code so that it's easy
for people to plug in different indexes. Still not perfect,
but a first step.
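To make the "plug in different indexes" idea concrete, here is a minimal sketch of what such a contract might look like. It is not the actual LARQ, SARQ or EARQ API: the TextIndex interface, its method names and the InMemoryIndex backend are all hypothetical, and the in-memory map only stands in for Lucene, Solr or ElasticSearch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PluggableIndexSketch {

    /** Hypothetical contract that a Lucene-, Solr- or ElasticSearch-backed index could implement. */
    interface TextIndex {
        void index(String nodeId, String text);

        List<String> search(String term, int limit);
    }

    /** Toy in-memory backend, standing in for a real search engine. */
    static class InMemoryIndex implements TextIndex {
        private final Map<String, String> docs = new LinkedHashMap<>();

        @Override
        public void index(String nodeId, String text) {
            docs.put(nodeId, text.toLowerCase());
        }

        @Override
        public List<String> search(String term, int limit) {
            List<String> hits = new ArrayList<>();
            String needle = term.toLowerCase();
            for (Map.Entry<String, String> e : docs.entrySet()) {
                if (hits.size() >= limit) {
                    break; // never return more than the requested number of hits
                }
                if (e.getValue().contains(needle)) {
                    hits.add(e.getKey());
                }
            }
            return hits;
        }
    }

    public static void main(String[] args) {
        // Callers depend only on the interface, so the backend can be swapped.
        TextIndex index = new InMemoryIndex();
        index.index("urn:ex:a", "Free text search with Lucene");
        index.index("urn:ex:b", "SPARQL property functions");
        System.out.println(index.search("lucene", 10));
    }
}
```

The point of the design is only that query code talks to the interface, so a different search engine means a different implementing class and nothing else.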
> Do you want to organise this as a separate LARQ component in JIRA?
Yes, please.
I don't think I can create a new "component" on JIRA, but it's a good idea.
> Makes sense to me to have a separate tag if it's a separate code module.
+1
Paolo
On 13/12/10 15:15, Paolo Castagna (JIRA) wrote:
> LARQ as a separate module from ARQ
> ----------------------------------
>
> Key: JENA-9
> URL: https://issues.apache.org/jira/browse/JENA-9
> Project: Jena
> Issue Type: Test
> Components: ARQ
> Reporter: Paolo Castagna
> Priority: Minor
>
>
> LARQ can be extracted from ARQ as a separate module depending on ARQ.
>
> ARQ should not depend on LARQ (to avoid dependency cycles) and it
> could check if LARQ is available in the classpath and wire the property
> function in dynamically.
>
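The classpath check described above could be sketched roughly as follows. This is only an illustration of the technique (probe for a class by name, register it if it loads), not ARQ's actual wiring code: the REGISTRY map stands in for a property function registry, and the URI and class name used in main are hypothetical placeholders.

```java
import java.util.HashMap;
import java.util.Map;

public class OptionalModuleWiring {

    /** Stand-in for a property function registry: function URI -> implementing class. */
    static final Map<String, Class<?>> REGISTRY = new HashMap<>();

    /**
     * Registers className under uri only if the class can be loaded.
     * Returns true if it was wired in, false if the optional module is absent.
     */
    static boolean wireIfAvailable(String uri, String className) {
        try {
            // initialize=false: only probe for the class, do not run its static initialisers
            Class<?> impl = Class.forName(className, false,
                    OptionalModuleWiring.class.getClassLoader());
            REGISTRY.put(uri, impl);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Optional module (or one of its own dependencies) is not on the classpath
            return false;
        }
    }

    public static void main(String[] args) {
        // Both strings are hypothetical placeholders, not real LARQ names.
        boolean wired = wireIfAvailable("http://example.org/pf#textMatch",
                "org.example.larq.TextMatch");
        System.out.println("LARQ wired in: " + wired);
    }
}
```

Because the optional class is referenced only by name, the core module compiles and runs without it, which is what keeps the dependency one-way.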
> LARQ can have a different release cycle from ARQ and people who do
> not need free text search will not need to include Lucene in their
> classpath.
>
> A separate (experimental) module is available here:
> https://jena.svn.sourceforge.net/svnroot/jena/LARQ/trunk/
>
> List of things to do/decide includes:
>
> - Merge JENA-5 fix
> - Upgrade Lucene version to 2.9.3 and fix tests (if there are
> failures).
> - Remove code using deprecated Lucene APIs and upgrade to Lucene
> 3.0.x.
> - Decide how many results to return when the user does not specify
> it, 1000? More?
> - Should we use the index to suppress duplicates instead of
> in-memory data structures?
> - How do we implement removals/unindex?
> - We could use the Model to decide when there are no more
> triples with a specified literal and therefore it's ok to remove it from
> Lucene.
> - See how the new NRT capabilities of Lucene can be used from LARQ.
> - Review package names (currently c.h.h.j.sparql.larq and
> c.h.h.j.query.larq). Should we move to c.h.h.j.larq.*?
No rush here IMO, depending on any bigger repackaging.
Andy
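On the removals/unindex item in the list above, the "ask the Model whether any triple still uses the literal" idea might look something like the sketch below. It is a toy under stated assumptions: a HashSet of triples stands in for the Jena Model, a set of strings stands in for the Lucene index, and the Triple and UnindexSketch classes are hypothetical names, not LARQ code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class UnindexSketch {

    /** Minimal triple whose object is a literal string. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String literal;

        Triple(String subject, String predicate, String literal) {
            this.subject = subject;
            this.predicate = predicate;
            this.literal = literal;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Triple)) {
                return false;
            }
            Triple t = (Triple) o;
            return subject.equals(t.subject) && predicate.equals(t.predicate)
                    && literal.equals(t.literal);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subject, predicate, literal);
        }
    }

    private final Set<Triple> model = new HashSet<>();           // stand-in for the Model
    private final Set<String> indexedLiterals = new HashSet<>(); // stand-in for the Lucene index

    void add(Triple t) {
        if (model.add(t)) {
            indexedLiterals.add(t.literal);
        }
    }

    void remove(Triple t) {
        if (!model.remove(t)) {
            return; // triple was not in the model, nothing to unindex
        }
        // Ask the model whether any remaining triple still uses this literal.
        for (Triple other : model) {
            if (other.literal.equals(t.literal)) {
                return; // still referenced, keep it in the index
            }
        }
        indexedLiterals.remove(t.literal);
    }

    boolean isIndexed(String literal) {
        return indexedLiterals.contains(literal);
    }

    public static void main(String[] args) {
        UnindexSketch store = new UnindexSketch();
        Triple a = new Triple("urn:ex:a", "urn:ex:label", "hello");
        Triple b = new Triple("urn:ex:b", "urn:ex:label", "hello");
        store.add(a);
        store.add(b);
        store.remove(a);
        System.out.println(store.isIndexed("hello")); // b still uses the literal
        store.remove(b);
        System.out.println(store.isIndexed("hello")); // no triple uses it any more
    }
}
```

The linear scan is only for clarity; a real Model lookup by object would replace it, and the cost of that lookup on every removal is the trade-off to weigh.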