[GitHub] jena issue #428: Update OSGi imports

2018-06-08 Thread acoburn
Github user acoburn commented on the issue:

https://github.com/apache/jena/pull/428
  
FYI, I have submitted a related PR to `jsonld-java`: 
https://github.com/jsonld-java/jsonld-java/pull/234

Once that change makes it into a `jsonld-java` release, it should be 
possible to remove the explicit dependency on Guava in the `features.xml` file.


---


[GitHub] jena pull request #428: Update OSGi imports

2018-06-08 Thread afs
Github user afs commented on a diff in the pull request:

https://github.com/apache/jena/pull/428#discussion_r194137833
  
--- Diff: 
apache-jena-osgi/jena-osgi-features/src/main/resources/features.xml ---
@@ -21,28 +21,24 @@


mvn:org.apache.jena/jena-osgi/${project.version}
jena_osgi_dependencies
-   
-   

 

-   
mvn:com.github.andrewoma.dexx/collection/${ver.dexxcollection}
-   
mvn:com.github.jsonld-java/jsonld-java/${ver.jsonldjava}
-   
mvn:com.fasterxml.jackson.core/jackson-core/${ver.jackson}
-   
mvn:com.fasterxml.jackson.core/jackson-databind/${ver.jackson}
-   
mvn:com.fasterxml.jackson.core/jackson-annotations/${ver.jackson}
-   
mvn:org.apache.httpcomponents/httpcore-osgi/${ver.httpcore-osgi}
-   
mvn:org.apache.httpcomponents/httpclient-osgi/${ver.httpclient-osgi}
-   
mvn:org.apache.commons/commons-csv/${ver.commonscsv}
-   
mvn:org.apache.commons/commons-lang3/${ver.commonslang3}
-   
mvn:commons-codec/commons-codec/${ver.commons-codec}
-   mvn:commons-io/commons-io/${ver.commonsio}
-   
mvn:org.apache.thrift/libthrift/${ver.libthrift}   
-   
-
-   
-   
mvn:org.apache.servicemix.bundles/org.apache.servicemix.bundles.xerces/2.11.0_1
-   
mvn:org.apache.servicemix.bundles/org.apache.servicemix.bundles.xmlresolver/1.2_5
+   mvn:com.github.andrewoma.dexx/collection/${ver.dexxcollection}
+   mvn:com.github.jsonld-java/jsonld-java/${ver.jsonldjava}
+   mvn:com.fasterxml.jackson.core/jackson-core/${ver.jackson}
+   mvn:com.fasterxml.jackson.core/jackson-databind/${ver.jackson}
+   mvn:com.fasterxml.jackson.core/jackson-annotations/${ver.jackson}
+   mvn:org.apache.httpcomponents/httpcore-osgi/${ver.httpcore-osgi}
+   mvn:org.apache.httpcomponents/httpclient-osgi/${ver.httpclient-osgi}
+   mvn:org.apache.commons/commons-compress/${ver.commons-compress}
+   mvn:org.apache.commons/commons-csv/${ver.commonscsv}
+   mvn:org.apache.commons/commons-lang3/${ver.commonslang3}
+   mvn:commons-codec/commons-codec/${ver.commons-codec}
+   mvn:commons-io/commons-io/${ver.commonsio}
+   mvn:org.apache.thrift/libthrift/${ver.libthrift}
+   
+   mvn:com.google.guava/guava/24.1-jre

--- End diff --

Ah - OK - that explains it. All looks good to go.


---


Re: Issue with both lang and datatype being set in AWS Neptune (JSON select results)

2018-06-08 Thread Andy Seaborne




On 08/06/18 18:09, Philip Coates wrote:
> Hello,
>
> I’ve run into a situation where it looks like AWS Neptune (possibly other
> stores too) is returning both xml:lang and the datatype for String literals.
>
> It turns out that Jena doesn’t like this in JSON SPARQL results (it's fine
> with XML, cf JENA-1077) - QueryEngineHTTP pulls the results through a JSON
> processor, org.apache.jena.riot.resultset.rw.ResultSetReaderJSON, and this
> throws an error internally when it tries to process the output, e.g.:
>
> {
>    "xml:lang" : "en" ,
>    "datatype" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString" ,
>    "type" : "literal" ,
>    "value" : "phototaxis"
> }
>
> This can easily be worked around by setting the content-type for the select
> results, but it seems like a bug that it doesn't work the same with every
> content type.

Do they do the same for xsd:string?

> I couldn't find many other references to this sort of issue in the archives
> or documentation, so I was proposing to raise a bug, and create a simple
> patch for this to allow it for the string and langString datatype in JSON.

Please go ahead with the ticket and PR.

It isn't supposed to include a datatype for either rdf:langString or
xsd:string, though the main text for that is in the main RDF doc, where it
says in general not to include them. Jena ought to be lax about it, and
certainly consistent.

Thanks in advance,
Andy

> Does anyone have any thoughts or caveats?
>
> *Philip Coates*
> *Director*
>
> email: philip.coa...@semanticintegration.co.uk
> tel: +44 (0)7711 818384
>
> Registered in England and Wales. Company number: 08688966.
> Registered office: First Floor, Telecom House, 125-135 Preston Road,
> Brighton, England, BN1 6AF




[GitHub] jena pull request #428: Update OSGi imports

2018-06-08 Thread acoburn
Github user acoburn commented on a diff in the pull request:

https://github.com/apache/jena/pull/428#discussion_r194126560
  
--- Diff: 
apache-jena-osgi/jena-osgi-features/src/main/resources/features.xml ---
@@ -21,28 +21,24 @@


mvn:org.apache.jena/jena-osgi/${project.version}
jena_osgi_dependencies
-   
-   

 

-   
mvn:com.github.andrewoma.dexx/collection/${ver.dexxcollection}
-   
mvn:com.github.jsonld-java/jsonld-java/${ver.jsonldjava}
-   
mvn:com.fasterxml.jackson.core/jackson-core/${ver.jackson}
-   
mvn:com.fasterxml.jackson.core/jackson-databind/${ver.jackson}
-   
mvn:com.fasterxml.jackson.core/jackson-annotations/${ver.jackson}
-   
mvn:org.apache.httpcomponents/httpcore-osgi/${ver.httpcore-osgi}
-   
mvn:org.apache.httpcomponents/httpclient-osgi/${ver.httpclient-osgi}
-   
mvn:org.apache.commons/commons-csv/${ver.commonscsv}
-   
mvn:org.apache.commons/commons-lang3/${ver.commonslang3}
-   
mvn:commons-codec/commons-codec/${ver.commons-codec}
-   mvn:commons-io/commons-io/${ver.commonsio}
-   
mvn:org.apache.thrift/libthrift/${ver.libthrift}   
-   
-
-   
-   
mvn:org.apache.servicemix.bundles/org.apache.servicemix.bundles.xerces/2.11.0_1
-   
mvn:org.apache.servicemix.bundles/org.apache.servicemix.bundles.xmlresolver/1.2_5
+   mvn:com.github.andrewoma.dexx/collection/${ver.dexxcollection}
+   mvn:com.github.jsonld-java/jsonld-java/${ver.jsonldjava}
+   mvn:com.fasterxml.jackson.core/jackson-core/${ver.jackson}
+   mvn:com.fasterxml.jackson.core/jackson-databind/${ver.jackson}
+   mvn:com.fasterxml.jackson.core/jackson-annotations/${ver.jackson}
+   mvn:org.apache.httpcomponents/httpcore-osgi/${ver.httpcore-osgi}
+   mvn:org.apache.httpcomponents/httpclient-osgi/${ver.httpclient-osgi}
+   mvn:org.apache.commons/commons-compress/${ver.commons-compress}
+   mvn:org.apache.commons/commons-csv/${ver.commonscsv}
+   mvn:org.apache.commons/commons-lang3/${ver.commonslang3}
+   mvn:commons-codec/commons-codec/${ver.commons-codec}
+   mvn:commons-io/commons-io/${ver.commonsio}
+   mvn:org.apache.thrift/libthrift/${ver.libthrift}
+   
+   mvn:com.google.guava/guava/24.1-jre

--- End diff --

You are correct that jsonld-java does shade guava, but their OSGi bundle 
still imports the `com.google.common` packages: this is from the jsonld-java 
OSGi metadata:

```
Import-Package
  ...
  com.google.common.cache{version=[24.1,25)}
  com.google.common.collect  {version=[24.1,25)}
```

And so in practice, adding guava as a dependency is still necessary. I 
think this is an error with the jsonld-java OSGi metadata, and I will bring 
this up with that project.

For example, when trying to load just the `jsonld-java` bundle in Karaf -- 
once I add the Jackson dependencies -- I get this error if guava is not also 
installed:

```
Unable to resolve com.github.jsonld-java [46](R 46.0): missing requirement 
[com.github.jsonld-java [46](R 46.0)] osgi.wiring.package; 
(&(osgi.wiring.package=com.google.common.cache)(version>=24.1.0)(!(version>=25.0.0)))
```

...which means that the guava packages need to be installed. Once guava is 
added, the jsonld-java bundle installs and starts just fine.
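
To make the version constraint concrete: the `[24.1,25)` interval in the `Import-Package` header and the `(version>=24.1.0)(!(version>=25.0.0))` filter in the Karaf error say the same thing, namely an inclusive lower bound and an exclusive upper bound. The following is only a rough sketch of that check, written in Python for brevity. It is not the OSGi resolver, the function names are made up for illustration, and version qualifiers are ignored.

```python
import re


def parse_version(text):
    """Parse a version such as '24.1' or '24.1.0' into a 3-tuple of ints.

    Missing minor/micro segments default to 0; any qualifier segment is
    ignored in this sketch.
    """
    parts = text.strip().split(".")
    numbers = [int(p) for p in parts[:3]]
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def in_range(version, version_range):
    """Return True if `version` lies inside an interval such as '[24.1,25)'.

    '[' and ']' mark inclusive bounds, '(' and ')' mark exclusive bounds.
    """
    match = re.fullmatch(
        r"([\[\(])\s*([^,]+),\s*([^\]\)]+)([\]\)])", version_range.strip()
    )
    if match is None:
        raise ValueError("not an interval: " + version_range)
    open_bracket, low, high, close_bracket = match.groups()
    v, lo, hi = parse_version(version), parse_version(low), parse_version(high)
    above = v >= lo if open_bracket == "[" else v > lo
    below = v <= hi if close_bracket == "]" else v < hi
    return above and below
```

With this, `in_range("24.1.0", "[24.1,25)")` holds while `in_range("25.0.0", "[24.1,25)")` does not, which is why a Guava bundle exporting the `com.google.common` packages at a 24.1.x version is needed to satisfy the import.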


---


[GitHub] jena issue #428: Update OSGi imports

2018-06-08 Thread afs
Github user afs commented on the issue:

https://github.com/apache/jena/pull/428
  
Thanks for trying out PR #429 - it is now merged, and a development build 
that includes it is available. All I can think of for the 
`com.google.errorprone.annotations` and `org.checkerframework.checker` packages 
is that the OSGi build is taking them from the repos, but that's nothing more 
than speculation.




---


[GitHub] jena pull request #428: Update OSGi imports

2018-06-08 Thread afs
Github user afs commented on a diff in the pull request:

https://github.com/apache/jena/pull/428#discussion_r194124193
  
--- Diff: 
apache-jena-osgi/jena-osgi-features/src/main/resources/features.xml ---
@@ -21,28 +21,24 @@


mvn:org.apache.jena/jena-osgi/${project.version}
jena_osgi_dependencies
-   
-   

 

-   
mvn:com.github.andrewoma.dexx/collection/${ver.dexxcollection}
-   
mvn:com.github.jsonld-java/jsonld-java/${ver.jsonldjava}
-   
mvn:com.fasterxml.jackson.core/jackson-core/${ver.jackson}
-   
mvn:com.fasterxml.jackson.core/jackson-databind/${ver.jackson}
-   
mvn:com.fasterxml.jackson.core/jackson-annotations/${ver.jackson}
-   
mvn:org.apache.httpcomponents/httpcore-osgi/${ver.httpcore-osgi}
-   
mvn:org.apache.httpcomponents/httpclient-osgi/${ver.httpclient-osgi}
-   
mvn:org.apache.commons/commons-csv/${ver.commonscsv}
-   
mvn:org.apache.commons/commons-lang3/${ver.commonslang3}
-   
mvn:commons-codec/commons-codec/${ver.commons-codec}
-   mvn:commons-io/commons-io/${ver.commonsio}
-   
mvn:org.apache.thrift/libthrift/${ver.libthrift}   
-   
-
-   
-   
mvn:org.apache.servicemix.bundles/org.apache.servicemix.bundles.xerces/2.11.0_1
-   
mvn:org.apache.servicemix.bundles/org.apache.servicemix.bundles.xmlresolver/1.2_5
+   mvn:com.github.andrewoma.dexx/collection/${ver.dexxcollection}
+   mvn:com.github.jsonld-java/jsonld-java/${ver.jsonldjava}
+   mvn:com.fasterxml.jackson.core/jackson-core/${ver.jackson}
+   mvn:com.fasterxml.jackson.core/jackson-databind/${ver.jackson}
+   mvn:com.fasterxml.jackson.core/jackson-annotations/${ver.jackson}
+   mvn:org.apache.httpcomponents/httpcore-osgi/${ver.httpcore-osgi}
+   mvn:org.apache.httpcomponents/httpclient-osgi/${ver.httpclient-osgi}
+   mvn:org.apache.commons/commons-compress/${ver.commons-compress}
+   mvn:org.apache.commons/commons-csv/${ver.commonscsv}
+   mvn:org.apache.commons/commons-lang3/${ver.commonslang3}
+   mvn:commons-codec/commons-codec/${ver.commons-codec}
+   mvn:commons-io/commons-io/${ver.commonsio}
+   mvn:org.apache.thrift/libthrift/${ver.libthrift}
+   
+   mvn:com.google.guava/guava/24.1-jre

--- End diff --

I don't think this is true any more. At version 0.12.0, they shade Guava 
and put it in their jar, so it does not need com.google.guava:guava. At least, 
that's the theory! If you have found that in practice it is needed, that has 
more weight.


---


Issue with both lang and datatype being set in AWS Neptune (JSON select results)

2018-06-08 Thread Philip Coates
Hello,

I’ve run into a situation where it looks like AWS Neptune (possibly other
stores too) is returning both xml:lang and the datatype for String literals.

It turns out that Jena doesn’t like this in JSON SPARQL results (it's fine
with XML, cf JENA-1077) - QueryEngineHTTP pulls the results through a JSON
processor, org.apache.jena.riot.resultset.rw.ResultSetReaderJSON, and this
throws an error internally when it tries to process the output, e.g.:

{
  "xml:lang" : "en" ,
  "datatype" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString" ,
  "type" : "literal" ,
  "value" : "phototaxis"
}

This can easily be worked around by setting the content-type for the select
results, but it seems like a bug that it doesn't work the same with every
content type.

I couldn't find many other references to this sort of issue in the archives
or documentation, so I was proposing to raise a bug, and create a simple
patch for this to allow it for the string and langString datatype in JSON.
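
To illustrate the kind of leniency proposed here, the following is a minimal sketch only. It is written in Python for brevity, it is not the code in ResultSetReaderJSON, and the function name and the returned tuple shape are assumptions made for the example. The idea is to accept a literal that carries both "xml:lang" and a redundant rdf:langString datatype, and to treat an explicit xsd:string the same as a plain string.

```python
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def read_literal(term):
    """Turn one JSON results literal into (lexical form, language, datatype).

    Hypothetical helper for illustration; a real reader would build an RDF
    node rather than a tuple.
    """
    if term.get("type") != "literal":
        raise ValueError("not a literal term")
    value = term["value"]
    lang = term.get("xml:lang")
    datatype = term.get("datatype")
    if lang is not None:
        # Lenient case: tolerate a redundant rdf:langString datatype,
        # but still reject a language tag paired with any other datatype.
        if datatype not in (None, RDF_LANGSTRING):
            raise ValueError("language tag combined with datatype " + datatype)
        return (value, lang, RDF_LANGSTRING)
    # No language tag: a missing datatype and an explicit xsd:string
    # both denote the same simple string.
    return (value, None, datatype or XSD_STRING)
```

Applied to the object shown above, this would yield ("phototaxis", "en", rdf:langString) instead of raising an error.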

Does anyone have any thoughts or caveats?

*Philip Coates*
*Director*

email: philip.coa...@semanticintegration.co.uk
tel: +44 (0)7711 818384

Registered in England and Wales. Company number: 08688966.
Registered office: First Floor, Telecom House, 125-135 Preston Road,
Brighton, England, BN1 6AF



[jira] [Resolved] (JENA-1558) Ensure that shading the Guava dependency does not transitively include Guava.

2018-06-08 Thread Andy Seaborne (JIRA)


 [ 
https://issues.apache.org/jira/browse/JENA-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne resolved JENA-1558.
-
   Resolution: Fixed
Fix Version/s: Jena 3.8.0

> Ensure that shading the Guava dependency does not transitively include Guava.
> -
>
> Key: JENA-1558
> URL: https://issues.apache.org/jira/browse/JENA-1558
> Project: Apache Jena
>  Issue Type: Task
>Reporter: Andy Seaborne
>Assignee: Andy Seaborne
>Priority: Major
> Fix For: Jena 3.8.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (JENA-1556) text:query multilingual enhancements

2018-06-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/JENA-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506119#comment-16506119
 ] 

ASF GitHub Bot commented on JENA-1556:
--

Github user xristy commented on the issue:

https://github.com/apache/jena/pull/430
  
Yes the changes are only in jena-text. I needed a 3.7.0 w/ the 
enhancements. I’ll try to set up with 3.8.0-Snapshot. 

Chris

> On Jun 8, 2018, at 09:58, Andy Seaborne  wrote:
> 
> Hi - this PR has a lot of changes for 3.7.0-SNAPSHOT to 3.7.0: all POMs, 
> TDB1, TDB2, the GroupBy changes and more. I'm a bit worried this will undo 
> unrelated changes; it's a lot of files to unpick.
> 
> Would it be possible to have the PR as changes to the current development 
> 3.8.0-SNAPSHOT?
> 
> (I'm guessing the real changes are in 
> jena-text:org.apache.jena.query.text, and nothing outside this area?)
> 
> —
> You are receiving this because you authored the thread.
> Reply to this email directly, view it on GitHub, or mute the thread.




[GitHub] jena issue #430: JENA-1556 text:query multilingual enhancements

2018-06-08 Thread xristy
Github user xristy commented on the issue:

https://github.com/apache/jena/pull/430
  
Yes the changes are only in jena-text. I needed a 3.7.0 w/ the 
enhancements. I’ll try to set up with 3.8.0-Snapshot. 

Chris

> On Jun 8, 2018, at 09:58, Andy Seaborne  wrote:
> 
> Hi - this PR has a lot of changes for 3.7.0-SNAPSHOT to 3.7.0: all POMs, 
> TDB1, TDB2, the GroupBy changes and more. I'm a bit worried this will undo 
> unrelated changes; it's a lot of files to unpick.
> 
> Would it be possible to have the PR as changes to the current development 
> 3.8.0-SNAPSHOT?
> 
> (I'm guessing the real changes are in 
> jena-text:org.apache.jena.query.text, and nothing outside this area?)
> 
> —
> You are receiving this because you authored the thread.
> Reply to this email directly, view it on GitHub, or mute the thread.



---


[jira] [Commented] (JENA-1556) text:query multilingual enhancements

2018-06-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/JENA-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506110#comment-16506110
 ] 

ASF GitHub Bot commented on JENA-1556:
--

Github user afs commented on the issue:

https://github.com/apache/jena/pull/430
  
Hi - this PR has a lot of changes for 3.7.0-SNAPSHOT to 3.7.0: all POMs, 
TDB1, TDB2, the GroupBy changes and more. I'm a bit worried this will undo 
unrelated changes; it's a lot of files to unpick.

Would it be possible to have the PR as changes to the current development 
3.8.0-SNAPSHOT?

(I'm guessing the real changes are in jena-text:org.apache.jena.query.text, 
and nothing outside this area?)


> text:query multilingual enhancements
> 
>
> Key: JENA-1556
> URL: https://issues.apache.org/jira/browse/JENA-1556
> Project: Apache Jena
>  Issue Type: New Feature
>  Components: Text
>Affects Versions: Jena 3.7.0
>Reporter: Code Ferret
>Assignee: Code Ferret
>Priority: Major
>  Labels: pull-request-available
>
> This issue proposes two related enhancements of Jena Text. These enhancements 
> have been implemented and a PR can be issued. 
> There are two multilingual search situations that we want to support:
>  # We want to be able to search in one encoding and retrieve results that may 
> have been entered in other encodings. For example, searching via Simplified 
> Chinese (Hans) and retrieving results that may have been entered in 
> Traditional Chinese (Hant) or Pinyin. This will simplify applications by 
> permitting encoding independent retrieval without additional layers of 
> transcoding and so on. It's all done under the covers in Lucene.
>  # We want to search with queries entered in a lossy, e.g., phonetic, 
> encoding and retrieve results entered with accurate encoding. For example, 
> searching via Pinyin without diacritics and retrieving all possible Hans and 
> Hant triples.
> The first situation arises when entering triples that include languages with 
> multiple encodings that for various reasons are not normalized to a single 
> encoding. In this situation we want to be able to retrieve appropriate result 
> sets without regard for the encodings used at the time that the triples were 
> inserted into the dataset.
> There are several such languages of interest in our application: Chinese, 
> Tibetan, Sanskrit, Japanese and Korean. There are various Romanizations and 
> ideographic variants.
> Encodings may not be normalized when inserting triples for a variety of reasons. 
> A principal one is that the {{rdf:langString}} object often must be entered 
> in the same encoding that it occurs in some physical text that is being 
> catalogued. Another is that metadata may be imported from sources that use 
> different encoding conventions and we want to preserve that form.
> The second situation arises as we want to provide simple support for phonetic 
> or other forms of lossy search at the time that triples are indexed directly 
> in the Lucene system.
> To handle the first situation we introduce a {{text}} assembler predicate, 
> {{text:searchFor}}, that specifies a list of language tags that provides a 
> list of language variants that should be searched whenever a query string of 
> a given encoding (language tag) is used. For example, the following 
> {{text:TextIndexLucene/text:defineAnalyzers}} fragment :
> {code:java}
> [ text:addLang "bo" ;
>   text:searchFor ( "bo" "bo-x-ewts" "bo-alalc97" ) ;
>   text:analyzer [
>     a text:GenericAnalyzer ;
>     text:class "io.bdrc.lucene.bo.TibetanAnalyzer" ;
>     text:params (
>       [ text:paramName "segmentInWords" ;
>         text:paramValue false ]
>       [ text:paramName "lemmatize" ;
>         text:paramValue true ]
>       [ text:paramName "filterChars" ;
>         text:paramValue false ]
>       [ text:paramName "inputMode" ;
>         text:paramValue "unicode" ]
>       [ text:paramName "stopFilename" ;
>         text:paramValue "" ]
>     )
>   ] ;
> ]
> {code}
> indicates that when using a search string such as "རྡོ་རྗེ་སྙིང་"@bo the 
> Lucene index should also be searched for matches tagged as {{bo-x-ewts}} and 
> {{bo-alalc97}}.
> This is made possible by a Tibetan {{Analyzer}} that tokenizes strings in all 
> three encodings into Tibetan Unicode. This is feasible since the 
> {{bo-x-ewts}} and {{bo-alalc97}} encodings are one-to-one with Unicode 
> Tibetan. Since all fields with these language tags will have a common set of 
> indexed terms, i.e., Tibetan Unicode, it suffices to arrange for the query 
> analyzer to have access to the language tag for the query string 

[GitHub] jena issue #430: JENA-1556 text:query multilingual enhancements

2018-06-08 Thread afs
Github user afs commented on the issue:

https://github.com/apache/jena/pull/430
  
Hi - this PR has a lot of changes for 3.7.0-SNAPSHOT to 3.7.0: all POMs, 
TDB1, TDB2, the GroupBy changes and more. I'm a bit worried this will undo 
unrelated changes; it's a lot of files to unpick.

Would it be possible to have the PR as changes to the current development 
3.8.0-SNAPSHOT?

(I'm guessing the real changes are in jena-text:org.apache.jena.query.text, 
and nothing outside this area?)


---


[jira] [Commented] (JENA-1558) Ensure that shading the Guava dependency does not transitively include Guava.

2018-06-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/JENA-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506073#comment-16506073
 ] 

ASF subversion and git services commented on JENA-1558:
---

Commit 2459b077714cc36b7dec780bb1907f2af0ba73a7 in jena's branch 
refs/heads/master from [~an...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=2459b07 ]

JENA-1558: Make Guava optional and add exclusions




[jira] [Commented] (JENA-1558) Ensure that shading the Guava dependency does not transitively include Guava.

2018-06-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/JENA-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506075#comment-16506075
 ] 

ASF GitHub Bot commented on JENA-1558:
--

Github user asfgit closed the pull request at:

https://github.com/apache/jena/pull/429




[GitHub] jena pull request #429: JENA-1558: Make Guava optional and add exclusions

2018-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/jena/pull/429


---


[jira] [Commented] (JENA-1558) Ensure that shading the Guava dependency does not transitively include Guava.

2018-06-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/JENA-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506074#comment-16506074
 ] 

ASF subversion and git services commented on JENA-1558:
---

Commit 30d093414ad8e7b9282e9cc890af5960ea3d41f8 in jena's branch 
refs/heads/master from [~an...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=30d0934 ]

JENA-1558: Merge commit 'refs/pull/429/head' of https://github.com/apache/jena

This closes #429.




[GitHub] jena issue #428: Update OSGi imports

2018-06-08 Thread acoburn
Github user acoburn commented on the issue:

https://github.com/apache/jena/pull/428
  
I built Jena with your change in #429 along with the changes from this PR, 
and for some reason the `jena-osgi` module still wants to import the 
`com.google.errorprone.annotations` and `org.checkerframework.checker` packages 
-- unless, of course, they are explicitly excluded, as in this PR. So, unless 
there is some other way to achieve this, I think it will be necessary to 
explicitly exclude those guava-related transitive dependencies.

In the second commit, I removed the `xerces` feature. I also added the 
`dependency="true"` flag to all of the dependencies. Background on the 
`dependency="true"` attribute is available in the [Karaf 
documentation](https://karaf.apache.org/manual/latest/provisioning).

I tested this in two ways. First, simply provisioning the `jena` feature in 
Karaf 4.2.0 works via `feature:install jena`. I also was able to provision and 
run a complete OSGi-based CXF application that makes use of Jena.

One further change that could be made to the `features.xml` file is to 
remove the `jena_osgi_dependencies` feature, and instead put the content of 
that directly into the `jena` feature definition. That would result in 
`jena-osgi-features` making only a single feature available. That would 
simplify the file a bit, but that change also isn't strictly necessary.


---


[jira] [Commented] (JENA-1556) text:query multilingual enhancements

2018-06-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/JENA-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506003#comment-16506003
 ] 

ASF GitHub Bot commented on JENA-1556:
--

GitHub user xristy opened a pull request:

https://github.com/apache/jena/pull/430

JENA-1556 text:query multilingual enhancements

implements proposed enhancements in JENA-1556

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/BuddhistDigitalResourceCenter/jena JENA-1556-MutilingualEnhancements

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/jena/pull/430.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #430


commit 8a99de5aa808b99fbfdb45eb56dfddadebd064d7
Author: Chris Tomlinson 
Date:   2018-06-08T13:06:09Z

Merged Search-with dev branch





[GitHub] jena pull request #430: JENA-1556 text:query multilingual enhancements

2018-06-08 Thread xristy
GitHub user xristy opened a pull request:

https://github.com/apache/jena/pull/430

JENA-1556 text:query multilingual enhancements

implements proposed enhancements in JENA-1556

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/BuddhistDigitalResourceCenter/jena JENA-1556-MutilingualEnhancements

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/jena/pull/430.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #430


commit 8a99de5aa808b99fbfdb45eb56dfddadebd064d7
Author: Chris Tomlinson 
Date:   2018-06-08T13:06:09Z

Merged Search-with dev branch




---