[jira] [Commented] (JENA-804) Jena is not reusing already allocated space on the file system which results in large amounts of disk space reserved by Jena files

2015-07-15 Thread Trevor Donaldson (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628525#comment-14628525
 ] 

Trevor Donaldson commented on JENA-804:
---

I agree with Keith. We are unable to use Jena because of this issue. We often 
delete a graph and replace the data inside of a graph. Doing this a few times 
caused us to reach critical mass in our storage very quickly. 

 Jena is not reusing already allocated space on the file system which results 
 in large amounts of disk space reserved by Jena files
 --

 Key: JENA-804
 URL: https://issues.apache.org/jira/browse/JENA-804
 Project: Apache Jena
  Issue Type: Bug
  Components: Jena
Affects Versions: Jena 2.11.2, TDB 1.0.2
 Environment: Windows 7, IBM JRE 1.7, Tomcat 7.0.54
Reporter: Keith Wells
 Attachments: TdbGrowthTests.java, out.txt, test-tdb-size.sh


 We have a product based on Jena TDB in which we both insert quads into and 
 delete quads from Jena TDB.  We understand the performance-over-space 
 architectural decision not to clean up deleted nodeids from the indexes. But 
 the disk usage suggests that Jena TDB is not reusing space that it had 
 previously allocated.  Based on this comment, something appears to be wrong 
 with file space utilization, 
 http://mail-archives.apache.org/mod_mbox/jena-users/201310.mbox/%3cce7d7929.2a707%25rve...@dotnetrdf.org%3E:
  "The indexes won't shrink - TDB never gives disk space back to the OS - but 
 disk space is reused when reallocated within the same JVM."
 In this scenario, on the same JVM with NO server stops or starts, we add 27765 
 graphs to IndexTdb and immediately remove them, repeating this process 
 several times. 
 {noformat}
               MB    Bytes        Diff (Bytes)
 Start         193   203239424

 Reindex 5     249   262066176    58826752
 Reindex 6     249   262086656    20480
 Reindex 10    298   312500224    50413568
 Reindex 11    298   312520704    20480
 Reindex 12    298   312541184    20480
 Reindex 13    298   312586240    45056
 Reindex 14    306   320995328    8409088
 Reindex 15    330   346181632    25186304
 Reindex 16    330   346198538    16906
 Reindex 17    346   362999808    16801270
 Reindex 18    346   363020288    20480
 Reindex 19    346   363040768    20480
 Reindex 20    346   363061248    20480
 Reindex 21    346   363081728    20480
 Reindex 22    354   371490816    8409088
 Reindex 23    378   396677120    25186304

 End           193   203239424
 {noformat}
 The system starts with 193 MB of data allocated by indexTdb.  A reindex 
 consists of a remove followed by an add of these graphs. As the data shows, 
 the size of indexTdb on disk increases dramatically after repeatedly removing 
 and adding graphs.  After Reindex 23, 378 MB of disk space is in use.  If Jena 
 TDB reused allocated space, there would be no need to allocate more space 
 beyond what is used by deleted node ids (unless nodeid storage is eating all 
 of this space?).  Jena does not appear to be reusing the allocated disk 
 space.  At the very end of this scenario, we exported the nquads and reloaded 
 them, which brought the disk usage back to the original 193 MB. 
 We believe Jena TDB is not reusing the space allocated by the TDB file system 
 within the same JVM.
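
 A measurement like the one in the table could be reproduced with a small 
 harness along the following lines. This is only a sketch, not the attached 
 TdbGrowthTests.java: the class name `DirGrowth`, its methods, and the 
 `cycle` callback (which would wrap one remove-and-add pass against the 
 store) are hypothetical names chosen for illustration. It sums the sizes of 
 all regular files under a directory before the first cycle and after each 
 one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of Jena: tracks on-disk size of a directory.
class DirGrowth {

    /** Total size in bytes of all regular files under dir (recursive). */
    public static long sizeOf(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(p -> {
                            try {
                                return Files.size(p);
                            } catch (IOException e) {
                                throw new UncheckedIOException(e);
                            }
                        })
                        .sum();
        }
    }

    /**
     * Runs the given cycle the requested number of times and records the
     * directory size after each run. Index 0 holds the starting size.
     */
    public static List<Long> measure(Path dir, int cycles, Runnable cycle)
            throws IOException {
        List<Long> sizes = new ArrayList<>();
        sizes.add(sizeOf(dir));
        for (int i = 0; i < cycles; i++) {
            cycle.run();            // e.g. remove the graphs, then add them back
            sizes.add(sizeOf(dir));
        }
        return sizes;
    }
}
```

 The differences between consecutive entries of the returned list correspond 
 to the "Diff (Bytes)" column above, assuming every reindex is recorded.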



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (JENA-979) add a fuseki admin service to list all existing backups

2015-07-15 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne reassigned JENA-979:
--

Assignee: Andy Seaborne

 add a fuseki admin service to list all existing backups
 ---

 Key: JENA-979
 URL: https://issues.apache.org/jira/browse/JENA-979
 Project: Apache Jena
  Issue Type: New Feature
  Components: Fuseki
Affects Versions: Fuseki 2.0.0, Fuseki 2.0.1, Fuseki 2.3.0
Reporter: Yang Yuanzhe
Assignee: Andy Seaborne
Priority: Minor

 Add a fuseki admin service to list all existing backups.
 Service URL: $/backupList
 Response example:
 { 
   "backups" : [ 
   "ds_2015-06-08_04-23-02.nq.gz" ,
   "ds_2015-06-08_07-57-10.nq.gz" ,
   "ds_2015-06-08_07-57-48.nq.gz" ,
   "ds_2015-06-09_03-46-37.nq.gz" ,
   "ds_2015-06-17_12-20-31.nq" ,
   "ds_2015-06-17_12-20-31.nq.gz" ,
   "ds_2015-06-23_10-53-47.nq.gz"
 ]
 }





[GitHub] jena pull request: Deprecating and removing some iterators from je...

2015-07-15 Thread afs
Github user afs commented on the pull request:

https://github.com/apache/jena/pull/80#issuecomment-121589199
  
Hi - could you try closing this pull request from your end? It seems to 
have got stuck in some way, whereby the commit controls don't have any effect. 
It's not the only one; the same happened to one of my PRs and the way I sorted 
it was from the requester end. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] jena pull request: Deprecating and removing some iterators from je...

2015-07-15 Thread ajs6f
Github user ajs6f closed the pull request at:

https://github.com/apache/jena/pull/80




[jira] [Comment Edited] (JENA-804) Jena is not reusing already allocated space on the file system which results in large amounts of disk space reserved by Jena files

2015-07-15 Thread Trevor Donaldson (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628525#comment-14628525
 ] 

Trevor Donaldson edited comment on JENA-804 at 7/15/15 7:37 PM:


I agree with Keith. My team and my customer are unable to use Jena because of 
this issue. We often delete a graph and replace the data inside of a graph. 
Doing this a few times caused us to reach critical mass in our storage very 
quickly. 


was (Author: tmdonalds):
I agree with Keith. We are unable to use Jena because of this issue. We often 
delete a graph and replace the data inside of a graph. Doing this a few times 
caused us to reach critical mass in our storage very quickly. 



[jira] [Commented] (JENA-979) add a fuseki admin service to list all existing backups

2015-07-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628685#comment-14628685
 ] 

ASF GitHub Bot commented on JENA-979:
-

Github user asfgit closed the pull request at:

https://github.com/apache/jena/pull/82


 add a fuseki admin service to list all existing backups
 ---

 Key: JENA-979
 URL: https://issues.apache.org/jira/browse/JENA-979
 Project: Apache Jena
  Issue Type: New Feature
  Components: Fuseki
Affects Versions: Fuseki 2.0.0, Fuseki 2.0.1, Fuseki 2.3.0
Reporter: Yang Yuanzhe
Assignee: Andy Seaborne



[GitHub] jena pull request: JENA-979: add a fuseki admin service to list al...

2015-07-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/jena/pull/82




[jira] [Resolved] (JENA-979) add a fuseki admin service to list all existing backups

2015-07-15 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne resolved JENA-979.

   Resolution: Fixed
Fix Version/s: Fuseki 2.3.0

 add a fuseki admin service to list all existing backups
 ---

 Key: JENA-979
 URL: https://issues.apache.org/jira/browse/JENA-979
 Project: Apache Jena
  Issue Type: New Feature
  Components: Fuseki
Affects Versions: Fuseki 2.0.0, Fuseki 2.0.1, Fuseki 2.3.0
Reporter: Yang Yuanzhe
Assignee: Andy Seaborne
 Fix For: Fuseki 2.3.0




[jira] [Commented] (JENA-966) LazyIterator

2015-07-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627974#comment-14627974
 ] 

ASF GitHub Bot commented on JENA-966:
-

Github user ajs6f closed the pull request at:

https://github.com/apache/jena/pull/80


 LazyIterator
 

 Key: JENA-966
 URL: https://issues.apache.org/jira/browse/JENA-966
 Project: Apache Jena
  Issue Type: Bug
  Components: Core
Affects Versions: Jena 3.0.0
Reporter: Claude Warren
Assignee: Claude Warren
 Fix For: Jena 3.0.0


 LazyIterator is an abstract class.  The documentation indicates that the 
 create() method needs to be overridden to create an instance.  From this I 
 would expect that 
 new LazyIterator(){
 @Override
 public ExtendedIterator<Model> create() {
   ...
 }};
 would work; however, LazyIterator does not override 
 removeNext(), andThen(), toList(), and toSet().
 I believe these should be implemented in the class.
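
 The pattern being asked for can be sketched without Jena on the classpath. 
 The class below is hypothetical (`LazyIter` is not Jena's LazyIterator, and 
 it builds on java.util.Iterator rather than ExtendedIterator): the 
 underlying iterator is created on first use by an overridden create(), and 
 every other method, including a convenience method such as toList(), is 
 implemented in the abstract class by delegating to it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the lazy-iterator pattern; not Jena's class.
abstract class LazyIter<T> implements Iterator<T> {

    private Iterator<T> delegate;   // stays null until first use

    /** Subclasses supply the real iterator; called at most once. */
    public abstract Iterator<T> create();

    /** Creates the delegate on first access and reuses it afterwards. */
    private Iterator<T> it() {
        if (delegate == null) {
            delegate = create();
        }
        return delegate;
    }

    @Override public boolean hasNext() { return it().hasNext(); }
    @Override public T next()          { return it().next(); }
    @Override public void remove()     { it().remove(); }

    /** Drains the remaining elements into a list via the lazy delegate. */
    public List<T> toList() {
        List<T> out = new ArrayList<>();
        while (hasNext()) {
            out.add(next());
        }
        return out;
    }
}
```

 With the delegating methods in the base class, an anonymous subclass only 
 has to override create(), which is the usage the report expects to work.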





[GitHub] jena pull request: Deprecating and removing some iterators from je...

2015-07-15 Thread afs
Github user afs commented on the pull request:

https://github.com/apache/jena/pull/80#issuecomment-121606159
  
That unstuck the PR - thanks.




[jira] [Commented] (JENA-979) add a fuseki admin service to list all existing backups

2015-07-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628204#comment-14628204
 ] 

ASF GitHub Bot commented on JENA-979:
-

Github user afs commented on the pull request:

https://github.com/apache/jena/pull/82#issuecomment-121646813
  
WIP: I'm incorporating this by integrating it into the admin action framework 
(i.e. `extends ActionCtl`).

Currently:

* The URL is `/$/backup-list`. Camel case is less common for URL API naming 
(YMMV).
* Returns the file names, not the full path names. The server does not 
reveal its absolute path setup.
* Returns non-hidden files: filters on `!file.isHidden() && file.isFile()`.
* Response is `Content-Type: application/json`.
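
The filtering and naming rules in the list above might look roughly like the 
following. This is a sketch under assumptions, not the code that went into 
Fuseki: the class `BackupList` and its methods are hypothetical, the sorting 
is an added convenience, and the JSON is assembled by hand only to keep the 
example free of dependencies.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; the real service extends Fuseki's ActionCtl.
class BackupList {

    /** Names (not full paths) of non-hidden regular files in dir, sorted. */
    public static List<String> names(File dir) {
        List<String> out = new ArrayList<>();
        File[] files = dir.listFiles(f -> !f.isHidden() && f.isFile());
        if (files != null) {            // null when dir is not a directory
            for (File f : files) {
                out.add(f.getName());   // name only, hides the server's path
            }
        }
        Collections.sort(out);          // assumption: stable, readable order
        return out;
    }

    /** Renders the names as { "backups" : [ ... ] } with minimal escaping. */
    public static String toJson(List<String> names) {
        StringBuilder sb = new StringBuilder("{ \"backups\" : [ ");
        for (int i = 0; i < names.size(); i++) {
            if (i > 0) {
                sb.append(" , ");
            }
            String escaped = names.get(i).replace("\\", "\\\\")
                                         .replace("\"", "\\\"");
            sb.append('"').append(escaped).append('"');
        }
        return sb.append(" ] }").toString();
    }
}
```

Returning `getName()` rather than the path is what keeps the absolute 
directory layout out of the response.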




 add a fuseki admin service to list all existing backups
 ---

 Key: JENA-979
 URL: https://issues.apache.org/jira/browse/JENA-979
 Project: Apache Jena
  Issue Type: New Feature
  Components: Fuseki
Affects Versions: Fuseki 2.0.0, Fuseki 2.0.1, Fuseki 2.3.0
Reporter: Yang Yuanzhe
Assignee: Andy Seaborne
