Phew, good! Unsolicited garbage collection (CAS rewriting) would break at least one of our projects. We use the CAS address as an FS ID (as a substitute for the "id" feature that Georg asked for in another mail).
Btw., the desire for stable IDs seems to come up pretty often recently…

-- Richard

On 06.05.2013 at 15:28, Marshall Schor <[email protected]> wrote:

> yes, that's right.
>
> We currently only have serialization -> deserialization, or cas copying to
> accomplish reclaiming space - it's like a stop-and-copy garbage collection.
>
> -Marshall
>
> On 5/6/2013 7:00 AM, Richard Eckart de Castilho (JIRA) wrote:
>> [
>> https://issues.apache.org/jira/browse/UIMA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649647#comment-13649647
>> ]
>>
>> Richard Eckart de Castilho commented on UIMA-2434:
>> --------------------------------------------------
>>
>> About reclaiming space: this change is only affecting indexes, right? The
>> addresses of FSes in the low-level CAS remain untouched?
>>
>>> Feature structure removal from sorted index is very slow
>>> --------------------------------------------------------
>>>
>>> Key: UIMA-2434
>>> URL: https://issues.apache.org/jira/browse/UIMA-2434
>>> Project: UIMA
>>> Issue Type: Improvement
>>> Components: Core Java Framework
>>> Affects Versions: 2.3.1SDK
>>> Reporter: Mikhail Sogrin
>>> Assignee: Marshall Schor
>>> Fix For: 2.4.1SDK
>>>
>>>
>>> Removal of feature structures from sorted indexes (e.g. default index) is
>>> very slow. FSIntArrayIndex.remove() method performs two operations: linear
>>> search in the array until the given FS is found, followed by the shift of
>>> elements to the end of this array by one position to the left.
>>> If many annotations (millions and more) are being deleted at once, this
>>> operation gets very very slow - much slower than adding these annotations
>>> in the first place. It seems to require O(N^2) time to remove N annotations.
>>> One item is the linear search, which can be replaced by the binary search
>>> method, which is already implemented in the same class.
>>> Second, array copy can be done with Java built-in method instead of a
>>> custom loop.
>>> Ideally, a method for bulk removal of a collection of annotations would
>>> have been the most efficient, for example a method to remove all
>>> annotations of a given type.
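
For illustration, the three suggestions in the issue description (binary search, built-in array copy, bulk removal) might look roughly like the sketch below. This is not the actual FSIntArrayIndex code: the class and method names are made up, and it assumes the array is sorted by plain int value, whereas the real index orders FS addresses with a comparator.

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

// Hypothetical stand-in for a sorted int-array index; not UIMA code.
final class SortedIntArrayRemoval {

    // Removes one occurrence of key from the first `size` sorted elements.
    // Returns the new logical size (unchanged if key is absent).
    static int remove(int[] sorted, int size, int key) {
        // Binary search replaces the linear scan: O(log N) instead of O(N).
        int pos = Arrays.binarySearch(sorted, 0, size, key);
        if (pos < 0) {
            return size; // key not present
        }
        // Built-in copy shifts the tail one slot left (handles overlap).
        System.arraycopy(sorted, pos + 1, sorted, pos, size - pos - 1);
        return size - 1;
    }

    // Bulk removal: drops every element matching the predicate in a single
    // pass, keeping the survivors in their original (sorted) order.
    static int removeAll(int[] sorted, int size, IntPredicate shouldRemove) {
        int out = 0;
        for (int i = 0; i < size; i++) {
            if (!shouldRemove.test(sorted[i])) {
                sorted[out++] = sorted[i];
            }
        }
        return out; // new logical size
    }
}
```

Note that a single removal still has to shift the tail, so removing N entries one at a time stays quadratic in the worst case even with binary search; only the single-pass bulk variant brings the total down to O(N).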
