[jira] [Commented] (JENA-1908) TDB2: tdb2.tdbloader crashes

2020-06-14 Thread Jonas Sourlier (Jira)


[ 
https://issues.apache.org/jira/browse/JENA-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135264#comment-17135264
 ] 

Jonas Sourlier commented on JENA-1908:
--

[~WolfgangFahl] I can upload a copy of the database if that helps.

> TDB2: tdb2.tdbloader crashes
> 
>
> Key: JENA-1908
> URL: https://issues.apache.org/jira/browse/JENA-1908
> Project: Apache Jena
> Issue Type: Bug
> Components: TDB2
> Affects Versions: Jena 3.15.0
> Reporter: Wolfgang Fahl
> Priority: Major
> Attachments: tdb2--out.log
>
>
> [http://wiki.bitplan.com/index.php/Get_your_own_copy_of_WikiData#third_attempt_to_load_with_tdb2.tdbloader]
> describes a WikiData import attempt that ran some 14 days before it crashed.
>  
> {code:java}
> java.lang.IllegalArgumentException: null
>     at java.nio.Buffer.position(Buffer.java:244) ~[?:1.8.0_201]
>     at org.apache.jena.dboe.base.record.RecordFactory.lambda$static$0(RecordFactory.java:111) ~[jena-dboe-base-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.base.record.RecordFactory.buildFrom(RecordFactory.java:127) ~[jena-dboe-base-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.base.buffer.RecordBuffer._get(RecordBuffer.java:102) ~[jena-dboe-base-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.base.buffer.RecordBuffer.get(RecordBuffer.java:52) ~[jena-dboe-base-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.trans.bplustree.BPTreeRecords.getSplitKey(BPTreeRecords.java:195) ~[jena-dboe-trans-data-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.trans.bplustree.BPTreeNode.split(BPTreeNode.java:562) ~[jena-dboe-trans-data-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:509) ~[jena-dboe-trans-data-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:522) ~[jena-dboe-trans-data-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:522) ~[jena-dboe-trans-data-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:522) ~[jena-dboe-trans-data-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.trans.bplustree.BPTreeNode.insert(BPTreeNode.java:203) ~[jena-dboe-trans-data-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.trans.bplustree.BPlusTree.insertAndReturnOld(BPlusTree.java:278) ~[jena-dboe-trans-data-3.15.0.jar:3.15.0]
>     at org.apache.jena.dboe.trans.bplustree.BPlusTree.insert(BPlusTree.java:271) ~[jena-dboe-trans-data-3.15.0.jar:3.15.0]
>     at org.apache.jena.tdb2.store.tupletable.TupleIndexRecord.performAdd(TupleIndexRecord.java:94) ~[jena-tdb2-3.15.0.jar:3.15.0]
>     at org.apache.jena.tdb2.store.tupletable.TupleIndexBase.add(TupleIndexBase.java:66) ~[jena-tdb2-3.15.0.jar:3.15.0]
>     at org.apache.jena.tdb2.loader.main.Indexer.lambda$loadTuples$1(Indexer.java:133) ~[jena-tdb2-3.15.0.jar:3.15.0]
>     at org.apache.jena.tdb2.loader.main.Indexer.stageIndex(Indexer.java:115) ~[jena-tdb2-3.15.0.jar:3.15.0]
>     at org.apache.jena.tdb2.loader.main.Indexer.lambda$startBulk$0(Indexer.java:92) ~[jena-tdb2-3.15.0.jar:3.15.0]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
> 14:58:42 ERROR Indexer :: Interrupted
>  
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (JENA-1915) spatial:greatCircle appears to be returning wrong answers

2020-06-14 Thread Bryon Jacob (Jira)


[ 
https://issues.apache.org/jira/browse/JENA-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135249#comment-17135249
 ] 

Bryon Jacob commented on JENA-1915:
---

Great - glad the tests were useful!

I'd imagine the differences are because I calculated using Haversine and the 
implementation in jena is Vincenty - both are measures of distance, but 
different algorithms that produce slightly different results...
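
For readers comparing the two figures, here is a minimal Python sketch of the Haversine calculation that the hand-written SPARQL in the report performs. It assumes a spherical Earth with the 6371 km radius used in the query; the function name `haversine_km` is made up for this illustration and is not part of Jena. An ellipsoidal method such as Vincenty's would be expected to give slightly different values for the same points.

```python
import math

EARTH_RADIUS_KM = 6371.0  # same mean radius as the SPARQL query in the report


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees.

    Hypothetical helper for illustration only; assumes a spherical Earth.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    a = min(1.0, a)  # guard against floating-point overshoot near antipodes
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c


if __name__ == "__main__":
    # First and last coordinate pairs from the VALUES block of the report.
    print(haversine_km(41.2572, -95.9656, 41.2592, -95.9339))
    print(haversine_km(41.2572, -95.9656, -33.8646, 151.2099))
```

Because `a` lies in [0, 1], the result is always between 0 and half the Earth's circumference, which is the property the report relies on when it says a distance function can never return a negative number.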

> spatial:greatCircle appears to be returning wrong answers
> -
>
> Key: JENA-1915
> URL: https://issues.apache.org/jira/browse/JENA-1915
> Project: Apache Jena
> Issue Type: Bug
> Components: Spatial
> Affects Versions: Jena 3.13.1
> Reporter: Bryon Jacob
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The `spatial:greatCircle` function appears broken - it sometimes returns 
> negative numbers. A negative number can never be a correct result from a 
> distance function, but I've tried to produce something more useful than that 
> as a defect report and repro.
> I was trying to "port" some SPARQL queries in which I'd written the great 
> circle function by hand (Haversine formula) to use the spatial:greatCircle 
> function instead, which means I had a pretty good test case to see where the 
> issue arises. I simplified my query down to a repeatable test that 
> demonstrates the issue.
> Note that in my original query I was using a custom Jena function to compute 
> `radians`. Since that function won't be available to you in stock Jena, I've 
> precomputed the radians values and put them in the VALUES block next to the 
> lat/lon pairs. This should be runnable on any stock Jena instance with the 
> jena-geosparql functions loaded. For most of the distances computed, the two 
> results agree quite closely, but for the last two the Jena function returns a 
> negative number, whereas the hand-computed value is a correct distance 
> between those points:
> {code:java}
> PREFIX m: 
> PREFIX spatial: 
> # this is a namespace of custom functions, which won't be available in stock Jena - to reproduce
> # this, I've pre-computed the radian values and included them in a VALUES block below...
> # PREFIX f: 
> SELECT ?lat1 ?lon1 ?lat2 ?lon2 ?φ1 ?φ2 ?Δφ ?Δλ ?d1 ?d2
> WHERE {
>   VALUES (?lat1 ?lon1 ?lat2 ?lon2 ?φ1 ?φ2 ?Δφ ?Δλ) {
>     # in my original query, I was computing radians using a custom radians function, which won't be available
>     # in stock Jena, so to make this reproducible I precomputed the radians values needed for the Haversine
>     (41.2572  -95.9656  41.2592   -95.9339   0.7201  0.7201   0.       0.0006)
>     (41.2572  -95.9656  41.2482   -96.072    0.7201  0.7199   -0.0002  -0.0019)
>     (41.2572  -95.9656  41.5871   -93.626    0.7201  0.7258   0.0058   0.0408)
>     (41.2572  -95.9656  51.0472   -113.9998  0.7201  0.8909   0.1709   -0.3148)
>     (41.2572  -95.9656  40.7528   -73.9876   0.7201  0.7113   -0.0088  0.3836)
>     (41.2572  -95.9656  49.7237   13.3422    0.7201  0.8678   0.1478   1.9078)
>     (41.2572  -95.9656  -33.9065  18.4175    0.7201  -0.5918  -1.3119  1.9964)
>     (41.2572  -95.9656  -33.8646  151.2099   0.7201  -0.5910  -1.3111  4.3140)
>   }
> 
>   # calculate the "great circle" distance between the two (lat φ, long λ) points, in kilometers.
>   # these are the function calls in my original query, commented out and replaced with VALUES above
>   # BIND (f:radians(?lat1) AS ?φ1)
>   # BIND (f:radians(?lat2) AS ?φ2)
>   # BIND (f:radians(?lat2 - ?lat1) AS ?Δφ)
>   # BIND (f:radians(?lon2 - ?lon1) AS ?Δλ)
>   BIND (m:sin(?Δφ / 2) * m:sin(?Δφ / 2) + m:cos(?φ1) * m:cos(?φ2) * m:sin(?Δλ / 2) * m:sin(?Δλ / 2) AS ?a)
>   BIND (2 * m:atan2(m:sqrt(?a), m:sqrt(1 - ?a)) AS ?c)
>   BIND (6371 AS ?RadiusOfEarthInKm)
>   BIND (?RadiusOfEarthInKm * ?c AS ?d1)
>   # call the Jena function for comparison
>   BIND(spatial:greatCircle(?lat1, ?lon1, ?lat2, ?lon2, ) AS ?d2)
> }
> {code}
>  
>  
>  
>  





[jira] [Commented] (JENA-1915) spatial:greatCircle appears to be returning wrong answers

2020-06-14 Thread Greg Albiston (Jira)


[ 
https://issues.apache.org/jira/browse/JENA-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135217#comment-17135217
 ] 

Greg Albiston commented on JENA-1915:
-

Hi Bryon,

Thanks for reporting this and for the test cases. This has now been fixed on the 
`master` branch, and the test values you provided are now used as unit tests. 
There are still minor discrepancies (less than 300m), which are likely due 
to rounding differences.

Thanks again,

Greg
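
As a rough illustration of how much rounding alone can move a result, the sketch below evaluates the report's Haversine expression twice for the first VALUES row: once with radians computed at full precision and once with the four-decimal radians from the query. This is only an illustration under stated assumptions, not an analysis of the fix: the helper name `haversine_from_radians` is made up, and the Δφ entry of that row, which appears truncated as `0.` in the archive, is assumed to be 0.0000 (what the 0.002° latitude difference rounds to).

```python
import math

EARTH_RADIUS_KM = 6371.0  # radius used in the reported query


def haversine_from_radians(phi1, phi2, d_phi, d_lambda):
    """Mirror of the query's BIND expressions; all inputs are in radians.

    Hypothetical helper for illustration only.
    """
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c


# First coordinate pair from the VALUES block, in degrees.
lat1, lon1, lat2, lon2 = 41.2572, -95.9656, 41.2592, -95.9339

# Full-precision radians, as the original f:radians calls would produce.
exact = haversine_from_radians(
    math.radians(lat1),
    math.radians(lat2),
    math.radians(lat2 - lat1),
    math.radians(lon2 - lon1),
)

# Four-decimal radians as listed in the VALUES row (Δφ assumed to be 0.0000).
rounded = haversine_from_radians(0.7201, 0.7201, 0.0000, 0.0006)

difference_km = abs(rounded - exact)
print(exact, rounded, difference_km)
```

Rounding Δλ to four decimals can shift it by up to 0.00005 rad, which at a radius of 6371 km corresponds to roughly 0.3 km along the equator, so differences of a few hundred metres are plausible from input rounding alone.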



[jira] [Commented] (JENA-1915) spatial:greatCircle appears to be returning wrong answers

2020-06-14 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/JENA-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135214#comment-17135214
 ] 

ASF subversion and git services commented on JENA-1915:
---

Commit bb90563d978b72c6ea378d22f51dc44bb7ea8f01 in jena's branch 
refs/heads/master from Greg Albiston
[ https://gitbox.apache.org/repos/asf?p=jena.git;h=bb90563 ]

Fixed JENA-1915 and added additional test values.



[jira] [Commented] (JENA-1915) spatial:greatCircle appears to be returning wrong answers

2020-06-14 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/JENA-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135215#comment-17135215
 ] 

ASF subversion and git services commented on JENA-1915:
---

Commit f7ec31cfda27b1e736e0fccbb2b75c506be6e335 in jena's branch 
refs/heads/master from GregAlbo
[ https://gitbox.apache.org/repos/asf?p=jena.git;h=f7ec31c ]

Merge pull request #756 from galbiston/great_circle_formula_fix

Fixed [JENA-1915] and added additional test values.



[jira] [Commented] (JENA-1908) TDB2: tdb2.tdbloader crashes

2020-06-14 Thread Andy Seaborne (Jira)


[ 
https://issues.apache.org/jira/browse/JENA-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135164#comment-17135164
 ] 

Andy Seaborne commented on JENA-1908:
-

The development builds:
https://jena.apache.org/download/maven.html

You could ask Jonas for a copy of his built database. Databases can be 
compressed (they compress very well) and can be copied around.




[jira] [Commented] (JENA-1914) Allow editing own comments in JIRA

2020-06-14 Thread Andy Seaborne (Jira)


[ 
https://issues.apache.org/jira/browse/JENA-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135161#comment-17135161
 ] 

Andy Seaborne commented on JENA-1914:
-

I've reopened JENA-1909 and also edited out the email footer info that was in 
your replies. That doesn't make it go away everywhere - Jira is plugged into 
email, so comments get sent outside the Jira installation.

> Allow editing own comments in JIRA
> --
>
> Key: JENA-1914
> URL: https://issues.apache.org/jira/browse/JENA-1914
> Project: Apache Jena
> Issue Type: Task
> Reporter: Wolfgang Fahl
> Priority: Trivial
>
> My comments that were sent from my email account include details I'd like to 
> delete. It seems the right to "edit own comments" needs to be activated for 
> that. See 
> https://community.atlassian.com/t5/Jira-questions/Where-is-the-button-to-edit-a-comment-in-a-Jira-issue/qaq-p/326145
> I think this should be the default for the whole Apache Jena ticket system.





[jira] [Comment Edited] (JENA-1909) TDB1: tdbloader2 crashes

2020-06-14 Thread Andy Seaborne (Jira)


[ 
https://issues.apache.org/jira/browse/JENA-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134145#comment-17134145
 ] 

Andy Seaborne edited comment on JENA-1909 at 6/14/20, 1:27 PM:
---

Jonas,

the size and number of SSD disks are also important details.

Wolfgang

 


was (Author: wfbitplan):
Jonas,

the size and number of SSD disks are also important details.

Wolfgang



> TDB1: tdbloader2 crashes
> 
>
> Key: JENA-1909
> URL: https://issues.apache.org/jira/browse/JENA-1909
> Project: Apache Jena
> Issue Type: Bug
> Components: TDB
> Affects Versions: Jena 3.15.0
> Reporter: Jonas Sourlier
> Priority: Major
> Attachments: signature.asc, signature.asc, tdb2.log
>
>
> This might be related to JENA-1908, but since the stack trace is different, I 
> opened a second ticket.
> Tried to import the latest Wikidata dump into Apache Jena, using the 
> following setup:
>  * Ubuntu 20.04 on Windows 10 Subsystem for Linux
>  * Apache Jena 3.15.0
>  * Intel i7 4770K, 32GB RAM
>  * 
> {code:java}
> openjdk 11.0.7 2020-04-14
> OpenJDK Runtime Environment (build 11.0.7+10-post-Ubuntu-3ubuntu1)
> OpenJDK 64-Bit Server VM (build 11.0.7+10-post-Ubuntu-3ubuntu1, mixed mode, sharing)
> {code}
> These are the commands I have run:
> {code:java}
> wget -c http://mirror.easyname.ch/apache/jena/binaries/apache-jena-3.15.0.tar.gz
> tar -xvzf apache-jena-3.15.0.tar.gz
> mkdir data
> apache-jena-3.15.0/bin/tdbloader2 --phase data --loc data/ ../latest-all.ttl > tdb1.log 2> tdb2.log &
> apache-jena-3.15.0/bin/tdbloader2 --phase index --loc data/ > tdb1.log 2> tdb2.log &
> {code}
> The data phase ran fine, but the index phase crashed after about 10 hours. 
> The stack trace is attached to this ticket (tdb2.log).
> Here's the standard output:
> {code:java}
>  08:47:57 INFO -- TDB Bulk Loader Start
>  08:47:57 INFO Index Building Phase
>  08:47:57 INFO Creating Index SPO
>  08:47:58 INFO Sort SPO
>  18:26:19 INFO Sort SPO Completed
>  18:26:19 INFO Build SPO
> {code}
>  





[jira] [Reopened] (JENA-1909) TDB1: tdbloader2 crashes

2020-06-14 Thread Andy Seaborne (Jira)


 [ 
https://issues.apache.org/jira/browse/JENA-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne reopened JENA-1909:
-



[jira] [Comment Edited] (JENA-1909) TDB1: tdbloader2 crashes

2020-06-14 Thread Andy Seaborne (Jira)


[ 
https://issues.apache.org/jira/browse/JENA-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134142#comment-17134142
 ] 

Andy Seaborne edited comment on JENA-1909 at 6/14/20, 1:26 PM:
---

Hi Jonas,

could you please add more details? I have added your success to
[http://wiki.bitplan.com/index.php/Get_your_own_copy_of_WikiData#Performance_Reports]
and would at least love to add the precise source imported from and
the number of triples.

Also, hardware specs, operating system and Java virtual machine version
would make things interesting.

Wolfgang

 


was (Author: wfbitplan):
Hi Jonas,

could you please add more details - I have added your success to
http://wiki.bitplan.com/index.php/Get_your_own_copy_of_WikiData#Performance_Reports
and at least would love to add the precise source imported from  and 
the number of triples.

Also Hardware specs, operating system and Java virtual machine version
would make things interesting.

Wolfgang






[jira] [Commented] (JENA-1909) TDB1: tdbloader2 crashes

2020-06-14 Thread Andy Seaborne (Jira)


[ 
https://issues.apache.org/jira/browse/JENA-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135159#comment-17135159
 ] 

Andy Seaborne commented on JENA-1909:
-

That is really good to hear! 12B triples!

And good to hear that it is usable.

Jonas - could I ask one thing - could you email us...@jena.apache.org to 
announce this success?



The temporary files are candidates for being on rotational disk rather than 
SSD. That would reduce the peak SSD space needed.

In tdbloader2, the first phase is better done on SSD. That's hard to avoid 
with the current TDB1 design (or TDB2 if a "tdb2.tdbloader2" were written).

The temporary files are more suitable for rotational disk while building. 

While the secondary indexes (after the first phase) are written in a 
disk-friendly fashion, when used, all the indexes benefit from SSD.

So there are important learning points here as well.

 
