Github user osma commented on the pull request:
https://github.com/apache/jena/pull/51#issuecomment-92644591
If you want to use jena-text with Fuseki, you need to attach an assembler
description. Read the configuration section from Text searches with SPARQL.
I know, I've
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/52#issuecomment-97329735
1. I'm not familiar with assembler configuration. But if you want to give
some help ;-)
I'll try. I've done two jena-text patches in the past, and in both cases I
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/53#issuecomment-97699765
Hi Alexis! Thanks for the code!
This just screams for unit tests that show that it actually works in all
cases...
I'm a bit worried about the BORDER_DELIMITER
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/53#issuecomment-98649858
Indeed, both of them will be dropped (however, can be solved with multi
language proposal)
I'm curious, how would the multi language proposal help with this problem
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/52#issuecomment-97152034
Hi Alexis!
Hi Osma, it's now ok for merge.
Excellent!
For the other part, I will propose soon the triple deletion which clean
the related entry
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/52#issuecomment-97003751
This pull request now requires resolving merge conflicts. I think it
applied cleanly when it was initiated. Is this because of the Jena3 switch that
is ongoing? @afs
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/53#issuecomment-99066248
There was a discussion thread around this topic on the Jena dev mailing
list in late February, starting with this message:
http://mail-archives.apache.org/mod_mbox/jena-dev
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/53#issuecomment-99062926
Multilingual index manages dynamically one index per language. Hence for
the case where we have two same literals with different languages, they will be
not stored
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/53#issuecomment-99096043
[copying from the email I sent to the dev list, apparently the bot didn't
notice my message]
Thanks a lot Chris, this was indeed very useful! So you're already doing
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/52#issuecomment-101017747
Great tests!
I wonder if there isn't a better method to convert 3 letter ISO 639
language codes to the 2 letter equivalents. But since there is only a
relatively
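One possible approach to the ISO 639 conversion mentioned above, sketched with the JDK's own `java.util.Locale` tables instead of a hand-written list. The class name `Iso639Converter` and the method `toTwoLetter` are hypothetical and are not taken from the pull request. The JDK lists both old and new codes for a few languages (for example Hebrew), so the reverse map keeps whichever two-letter code it encounters first.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.MissingResourceException;

// Hypothetical helper, not part of jena-text: maps 3-letter ISO 639 codes
// to their 2-letter equivalents using the language tables bundled with the JDK.
public class Iso639Converter {
    private static final Map<String, String> ISO3_TO_ISO2 = new HashMap<>();

    static {
        // Locale.getISOLanguages() returns the 2-letter codes known to the JDK.
        for (String iso2 : Locale.getISOLanguages()) {
            try {
                String iso3 = Locale.forLanguageTag(iso2).getISO3Language();
                if (!iso3.isEmpty()) {
                    // Keep the first 2-letter code seen for each 3-letter code.
                    ISO3_TO_ISO2.putIfAbsent(iso3, iso2);
                }
            } catch (MissingResourceException e) {
                // No 3-letter code available for this language: skip it.
            }
        }
    }

    // Returns the 2-letter code for a known 3-letter code;
    // any other input is returned lower-cased and otherwise unchanged.
    public static String toTwoLetter(String code) {
        if (code == null) {
            return null;
        }
        String lower = code.toLowerCase(Locale.ROOT);
        return ISO3_TO_ISO2.getOrDefault(lower, lower);
    }
}
```

A call such as `Iso639Converter.toTwoLetter("eng")` would then yield the two-letter code, while a code that is already two letters long passes through unchanged.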
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/52#issuecomment-100212921
Configuring via the assembler is very good! And I see you have also added
some tests. Excellent!
For the tests, I think you could add a few more test cases
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/52#issuecomment-101336857
Oh, right. Hmm.
How about this: TextIndexLucene currently calls IndexWriter.addDocument()
and IndexWriter.updateDocument(). But there are versions of these methods
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/52#issuecomment-101232751
But how to change correctly the existent code to target Lucene taking
that extra language into account ?
You are currently already parsing the property function
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/64#issuecomment-102297487
Thanks! I think decoupling these would be a good thing for other purposes
too. For example, I have some plans to propose storing (optionally) the full
literal values
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/64#issuecomment-102144875
Hi, I see that you already changed your code - impressive work!
One more suggestion - I hope it won't come too late, since you've already
moved code from
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/64#issuecomment-101780700
This looks very promising. As I've said before, I think a single index is
preferable to multiple indexes. It should require less book-keeping overall.
I have to say I
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/64#issuecomment-102406631
TextIndexConfiguration should be about : analyzer, queryAnalyzer,
multilingual,.. and graphField, langField concern EntityDefinition no ?
Moreover, to avoid the same
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/64#issuecomment-102413569
I prefer TextIndexConfiguration, it will be easier to add future conf
parameters.
Ok.
About the lang:xx, I think that extra params should be generalized
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/51#issuecomment-92086474
Hi Alexis!
For the moment, it's the responsibility of the client code to commit or
rollback the index (finishIndexing() or abortIndexing() methods) in phase
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/64#issuecomment-103778663
with a minus before to exclude filled values
Ah, I see. Sorry for the confusion. I had different expectations and the
expression is a bit hard to read. I think
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/72#issuecomment-107835602
I have addressed Andy's point-wise comments in separate commits. The only
remaining issue is whether to make the API change for `search` methods.
After some research
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/53#issuecomment-109996262
Thanks for fixing, and sorry for causing the conflict with #72.
It's good that you've added unit tests, however I think there could be more
of them
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/53#issuecomment-110428103
Excellent! Looks good to me now, I'm in favor of merging this.
Only minor thing I noticed: there are now two places where the SHA-256
checksum is calculated. I
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/81
jena-text stored literals: initial functionality and tests for Lucene
This PR implements a feature where it's possible to store the original
literal values in the jena-text Lucene index and to access
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/81#issuecomment-115768545
@afs Sure, now there is one:
[JENA-978](https://issues.apache.org/jira/browse/JENA-978)
---
If your project is set up for it, you can reply to this email and have your
reply
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/72#issuecomment-106794036
I updated my branch to fix merge conflicts caused by recent jena-text
commits. Would this be ready for merging? @afs ?
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/53#issuecomment-106801397
@amiara514 What's the status of this? Obviously there are now some merge
conflicts that need to be resolved, but other than that, do you think it's
ready for merging? I
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/53#issuecomment-106868586
yes maybe the unique ID could be configurable. Hence the feature would be
a kind of: Enabling deletion mode. But insertions without this option enabled
are unremovable
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/72#issuecomment-107341239
Thanks a lot Andy for your review! I will address your points soon and push
a new version.
Thank you also for pointing out Iter.map(), I actually wanted something
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/72#issuecomment-107954622
I decided to not bother with CMS but attached a diff into JIRA, it is
available at
https://issues.apache.org/jira/secure/attachment/12736839/jena-text-score-doc.diff
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/64#issuecomment-105229534
Excellent!
I played a bit with this code. I noticed one potential issue: if I do a
language specific query with a text:query argument such as 'lang:en
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/72
Add (?uri ?score) to jena-text
This is an implementation of
[JENA-916](https://issues.apache.org/jira/browse/JENA-916), adding support for
using a 2-element list as the subject of a text:query triple
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/64#issuecomment-105236007
@afs: I reviewed your [EntityDefinition
additions](https://github.com/apache/jena/commit/66a1eda822d8f551fac06d6b0a2672decdc2)
and they look fine to me. But should
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/53#issuecomment-104231598
My impression is that this is on hold, Alex has been pushing for #64 to be
merged first. @amiara514 should confirm.
Content-wise, these are not very closely related
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/52#issuecomment-104210829
@rvesse please look at pull request #64 instead, this one is obsolete I
think, and should be closed
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/64#issuecomment-103646019
Excellent work! Looks almost ready for merging.
A couple of comments:
1. You say that ?s text:query (rdfs:label 'word' 'lang:none') will target
unlocalized
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/96
JENA-1058: ASCIIFoldingLowerCaseKeywordAnalyzer for jena-text
See https://issues.apache.org/jira/browse/JENA-1058
You can merge this pull request into a Git repository by running:
$ git pull https
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/96#issuecomment-152211292
The class names are getting ridiculously long, but I just followed the same
pattern as before.
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/96#issuecomment-153977494
I drop this in favor of JENA-1062 / PR #97
Github user osma closed the pull request at:
https://github.com/apache/jena/pull/96
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/97#issuecomment-153981033
Good questions @rvesse !
Right now (before this PR) one can either use a few generic,
non-language-specific Analyzers: StandardAnalyzer, SimpleAnalyzer
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/97
JENA-1062: configurable Lucene analyzer for jena-text
This is a configurable Analyzer implementation for jena-text / Lucene. It
is similar to what can be achieved in [Solr
configuration](https
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/96#issuecomment-152983824
@afs Yes there is, in the jena-text documentation, though calling it a list
is a bit of a stretch since there are only four currently (StandardAnalyzer,
SimpleAnalyzer
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/97#issuecomment-154092054
Thanks @ajs6f for your detailed notes! I fixed the things you pointed out
but I think especially the Guava Sets.newHashSet pattern could be applied in
many other places
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/97#issuecomment-154353023
@ajs6f :
>If you would like me to make a PR against jena-text looking for this kind
of thing (using Guava or Java 8 idioms to shorten things up) I'm happy to
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/97#issuecomment-154063083
@ajs6f You're right, I did that now.
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/97#issuecomment-154468394
Yes, of course I will also update the jena-text documentation.
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/95#issuecomment-149117326
Does this code notice when the underlying data is modified? E.g. if I do a
PUT between GET requests, is the cache invalidated? Not just after 5 minutes
but immediately
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/95#issuecomment-149196979
> Update operations must invalidate the cache. A simple way is to simply
invalidate the whole cache. It is very hard to determine whether an update
affects a cache en
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/95#issuecomment-149565303
@ajs6f @afs Thanks for the pointers, you're right that Last-Modified, ETag
etc is a different issue. Sorry for mixing things up.
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/97#issuecomment-157324689
Rebased on current master and squashed my commits into one, preparing to
merge to Apache git
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/111#issuecomment-164839149
All the (rather small) changes to PropertyFunctionBase seemed to be related
to each other, but not directly to the changes in TextQueryPF, so I just
reverted the whole file
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/111
Fix for JENA-1093: revert JENA-999 and add unit test ensuring that all
matching literals are returned by jena-text
See https://issues.apache.org/jira/browse/JENA-1093
You can merge this pull request
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/111#issuecomment-165021049
I squashed the two previous commits to avoid back-and-forth changes to
PropertyFunctionBase. So in effect the new commit now only reverts the changes
made to TextQueryPF
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/112#issuecomment-168650932
I used the Streams expression as suggested by @ajs6f and also amended the
unit test so that it will check for proper filtering. Unless anyone objects I'm
going to merge
Github user osma commented on a diff in the pull request:
https://github.com/apache/jena/pull/112#discussion_r48709186
--- Diff:
jena-text/src/main/java/org/apache/jena/query/text/TextQueryPF.java ---
@@ -210,44 +212,36 @@ private QueryIterator variableSubject(Binding
binding
Github user osma closed the pull request at:
https://github.com/apache/jena/pull/112
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/119#issuecomment-169590779
@ajs6f There are two layers of map-like things here. The outer one is the
actual cache (query parameters to result set), and LinkedHashMap is currently
used
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/119#issuecomment-169611251
Oh right, I didn't notice `getOrFill`.
So am I right that the consensus would be to
1. switch to one of the Jena Cache implementations instead of using
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/119
JENA-999: jena-text Lucene cache using multimaps
This set of commits implements a caching layer for Lucene queries. The
cache is stored in the Context so that it is persisted even when new
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/119#issuecomment-170214419
I have now implemented the changes. I went for a 10-slot Atlas cache, which
is a LRU cache in my understanding though the decision is left to the
CacheFactory. I think it's
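For readers unfamiliar with the pattern, a bounded LRU cache of the kind discussed in this thread can be sketched with a plain `java.util.LinkedHashMap`. This is an illustration only: the class name `LruCache` is hypothetical, and it is neither the Jena Atlas cache nor the code from the pull request. It relies on `LinkedHashMap`'s access-order mode and its `removeEldestEntry` hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of an LRU cache; not the Jena Atlas implementation.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    public LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least recently
        // accessed to most recently accessed entry.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the least
        // recently used entry once the capacity is exceeded.
        return size() > maxEntries;
    }
}
```

With `new LruCache<String, Object>(10)`, the eleventh distinct key inserted evicts whichever entry was read or written least recently. `LinkedHashMap` is not thread-safe, so a shared cache would need external synchronization.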
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/112
JENA-1093: return multiple literals from text query with bound subject
Now both variableSubject and boundSubject can return multiple bindings with
different literals.
I factored out
Github user osma commented on a diff in the pull request:
https://github.com/apache/jena/pull/112#discussion_r48288750
--- Diff:
jena-text/src/main/java/org/apache/jena/query/text/TextQueryPF.java ---
@@ -210,44 +212,36 @@ private QueryIterator variableSubject(Binding
binding
Github user osma commented on the pull request:
https://github.com/apache/jena/commit/421707f535f3f79f3f66db0565b63ff5391dc7e5#commitcomment-15123390
@afs Did you commit the TextQueryPF change (changing the use of
TextHitConverter to a lambda expression) on purpose? It's
Github user osma commented on the pull request:
https://github.com/apache/jena/commit/421707f535f3f79f3f66db0565b63ff5391dc7e5#commitcomment-15123700
@afs Thanks for clarifying!
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/101#issuecomment-158934853
I see, thanks for the clarification. But I think my question about what
kind of code we want is still valid - after all, this isn't my project, but
maintained collectively
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/101#issuecomment-158932591
Thanks @ajs6f - there's quite a lot of fixes in here!
In general, I feel positive about cleanups like this. And it's nice to have
a patch which actually *reduces
Github user osma closed the pull request at:
https://github.com/apache/jena/pull/119
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/119#issuecomment-170452765
I decided to merge as-is (after rebase), i.e. using `getOrFill` and the
lambda function. It's not too unclear either way. I'm just not yet used to this
style of Java
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/122#issuecomment-171934539
Hmm, there are several LICENSE files in the Jena source tree which mention
"code contributed by Plugged In Software". It appears that some of these have
old
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/122#issuecomment-171937845
Okay I copied the changes from apache-jena. It seems that the pathnames
were already fixed in the `apache-jena/LICENSE` file so they are now fixed in
`jena-sdb/dist/LICENSE
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/122#issuecomment-171940432
I will merge this soon to master unless someone opposes...
Github user osma commented on a diff in the pull request:
https://github.com/apache/jena/pull/122#discussion_r49832744
--- Diff: jena-sdb/assembly.xml ---
@@ -41,11 +41,11 @@
- dist/LICENSE
+ LICENSE
--- End diff --
I changed
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/122#issuecomment-171906648
It works for me. Also, jena-fuseki1 seems to do the same.
Github user osma commented on a diff in the pull request:
https://github.com/apache/jena/pull/122#discussion_r49832973
--- Diff: jena-sdb/assembly.xml ---
@@ -41,11 +41,11 @@
- dist/LICENSE
+ LICENSE
--- End diff --
Sorry, I'm
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/122#issuecomment-171908145
Ah, hm. I didn't realize that. I was thinking that it should simply be
possible to build this distribution, just like it is currently possible to
build the SDB jar
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/122
JENA-1118: bringing back the SDB distribution
See https://issues.apache.org/jira/browse/JENA-1118
You can merge this pull request into a Git repository by running:
$ git pull https://github.com
Github user osma commented on a diff in the pull request:
https://github.com/apache/jena/pull/122#discussion_r49833529
--- Diff: jena-sdb/assembly.xml ---
@@ -41,11 +41,11 @@
- dist/LICENSE
+ LICENSE
--- End diff --
Ah, now I
Github user osma commented on a diff in the pull request:
https://github.com/apache/jena/pull/122#discussion_r49833662
--- Diff: jena-sdb/assembly.xml ---
@@ -41,11 +41,11 @@
- dist/LICENSE
+ LICENSE
--- End diff --
Removed
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/122#issuecomment-171910727
Ah, okay. So these were actually produced for a while after the 1.3.6
release, but not put on the download site, just available via Maven Central.
Okay, I think getting back
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/131#issuecomment-206240991
Merged after rebasing.
Github user osma closed the pull request at:
https://github.com/apache/jena/pull/131
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/131
JENA-1134: support AnalyzingQueryParser in jena-text
This PR makes it possible to select either the standard Lucene QueryParser
or the AnalyzingQueryParser using jena-text configuration like
Github user osma commented on a diff in the pull request:
https://github.com/apache/jena/pull/131#discussion_r57891059
--- Diff:
jena-text/src/test/java/org/apache/jena/query/text/TestDatasetWithAnalyzingQueryParser.java
---
@@ -0,0 +1,64 @@
+/**
+ * Licensed
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/131#issuecomment-204270064
Any other comments? I will merge this some time next week unless someone
objects.
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/95#issuecomment-213367009
I got interested in how this would affect our Fuseki performance if it were
merged. This became quite an experiment and I'm reporting the results here.
First of all I
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/95#issuecomment-213564399
@ajs6f said:
> If Jena is going to do really useful caching, it's going to be caching
something other than bits, something specific to what Jena does, like
node-tup
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/95#issuecomment-213566543
@afs I tried to build your fuseki-cache branch to test it but `mvn package`
resulted in a lot of unit test failures. Maybe I'm doing something wrong
though. I'll try again
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/146#issuecomment-219629095
I think the code is now good for merging. Thanks a lot!
@afs Can I just merge the code to master, or is there some legal policy
that must be followed
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/146#issuecomment-219381471
Thanks for the PR!
The code looks good, as you say it's following the pattern established by
JENA-1134.
You are right about the unit tests - the inheritance
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/137
JENA-1172: restore support for blank nodes in jena-text
It seems that jena-text used to support blank nodes in the text index
(there is some infrastructure to support this, particularly
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/137#issuecomment-216165347
I made this into a PR rather than merging directly because I thought I may
have missed something obvious. But if it looks good to others, I will merge it.
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/131#issuecomment-217813686
@joelkuiper Please create a new JIRA ticket for ComplexPhraseQueryParser
support on issues.apache.org, as this issue/PR is already closed.
It shouldn't be very hard
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/95#issuecomment-212280436
FWIW, I'm in favour of storing the result set before serialization, instead
of caching the serialized results as is apparently currently done. It makes
technical sense
Github user osma commented on the pull request:
https://github.com/apache/jena/pull/95#issuecomment-211773507
I have briefly tested Fuseki2 with this PR applied as a backend for our
software, Skosmos. I didn't find any problems. It seems to work for the basic
use case - repeated
Github user osma commented on the issue:
https://github.com/apache/jena/pull/205
@ajs6f I'm not sure whether you can use ORDER BY in a case like this. It
would need a SPARQL function for calculating distances between coordinates, so
you could say, in effect, "order by distance
Github user osma commented on the issue:
https://github.com/apache/jena/pull/205
@ajs6f It takes [a bit
more](http://www.movable-type.co.uk/scripts/latlong.html) than just sqrt() to
calculate great-circle distances between coordinates even within the same
coordinate system, e.g
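To illustrate why a great-circle distance needs more than a square root, here is a minimal sketch of the haversine formula in Java. The class and method names are hypothetical, the Earth is assumed to be a sphere with a mean radius of 6371 km, and none of this is code from jena-spatial or a SPARQL function.

```java
// Hypothetical helper illustrating the haversine formula; not part of jena-spatial.
public class GreatCircle {
    // Assumed mean Earth radius in kilometres (spherical approximation).
    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance in km between two points given in decimal degrees.
    public static double haversineKm(double lat1, double lon1,
                                     double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        // a is the square of half the chord length between the two points.
        double a = sinLat * sinLat
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * sinLon * sinLon;
        // c is the angular distance in radians.
        double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
        return EARTH_RADIUS_KM * c;
    }
}
```

Ordering results by distance would require evaluating an expression like this for each candidate, which is more than a plain `sqrt()` over coordinate differences.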
Github user osma commented on the issue:
https://github.com/apache/jena/pull/205
Thanks @afs for the approval. I will wait a few more days and then merge
this, unless there are objections.
The only scenario I'm mildly concerned about is if someone performs a
spatial query
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/205
JENA-1277: don't use sorting in spatial queries, for much better performance
This PR proposes removing the `distSort` parameter from the Lucene spatial
query performed by jena-spatial. Dropping
Github user osma commented on the issue:
https://github.com/apache/jena/pull/227
@anujgandharv Ah, right, sorry, I didn't remember to check that case. Good
thing you got it working! I will proceed with merging #226 first and then let's
get on with merging this one.