Weak Performance of "application/json+rdf" serializer on big TripleCollections and Serializer/Parser using Platform encoding instead of UTF-8
--------------------------------------------------------------------------------------------------------------------------------------------
                 Key: CLEREZZA-643
                 URL: https://issues.apache.org/jira/browse/CLEREZZA-643
             Project: Clerezza
          Issue Type: Improvement
            Reporter: Rupert Westenthaler

Both the "application/json+rdf" serializer and parser use the platform-specific default encoding instead of UTF-8.

In addition, the serializer performs very poorly on big graphs (at least when using SimpleMGraph). After some digging in the code, I came to the conclusion that this is caused by the repeated TripleCollection.filter(..) calls: first to find all predicates for a subject, and then to find all objects for each subject/predicate combination. Trying to serialize a graph with 50k triples resulted in several minutes at 100% CPU.

With the next comment I will provide a patch with an implementation based on a sorted array of the triples. With this method one can serialize graphs with 100k triples in about 1 sec. The patch also changes the encoding to UTF-8.
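For illustration only, the sorted-array idea could look roughly like the sketch below. It is not the patch itself and does not use the Clerezza API: the class SortedTripleSketch, the nested T type and the quote helper are hypothetical names, triples are reduced to plain strings, and objects are written as plain JSON strings rather than the structured object values of the real format. The triples are sorted once by subject, predicate and object, so a single pass can open and close the JSON groups without any per-subject filter(..) call, and the writer is given UTF-8 explicitly instead of relying on the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SortedTripleSketch {

    /** Hypothetical stand-in for an RDF triple; NOT the Clerezza Triple interface. */
    static final class T {
        final String s, p, o;
        T(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    /** Minimal JSON string quoting (assumption: only backslash and quote need escaping here). */
    static String quote(String v) {
        return '"' + v.replace("\\", "\\\\").replace("\"", "\\\"") + '"';
    }

    /** Serializes the triples as nested JSON {subject:{predicate:[objects]}} in UTF-8. */
    static byte[] serialize(T[] input) {
        // Sort a copy once by subject, then predicate, then object: O(n log n).
        T[] triples = input.clone();
        Arrays.sort(triples, (a, b) -> {
            int c = a.s.compareTo(b.s);
            if (c == 0) c = a.p.compareTo(b.p);
            if (c == 0) c = a.o.compareTo(b.o);
            return c;
        });

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // The charset is named explicitly so the platform default is never used.
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write("{");
            String prevS = null;
            String prevP = null;
            for (T t : triples) {
                if (!t.s.equals(prevS)) {
                    // New subject: close the previous subject group, open a new one.
                    if (prevS != null) w.write("]},");
                    w.write(quote(t.s) + ":{" + quote(t.p) + ":[");
                } else if (!t.p.equals(prevP)) {
                    // Same subject, new predicate: close the object list, open a new one.
                    w.write("]," + quote(t.p) + ":[");
                } else {
                    // Same subject and predicate: another object in the current list.
                    w.write(",");
                }
                w.write(quote(t.o));
                prevS = t.s;
                prevP = t.p;
            }
            if (prevS != null) w.write("]}");
            w.write("}");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Calling SortedTripleSketch.serialize(..) on an array of triples returns the encoded bytes. Because equal subjects and predicates are adjacent after sorting, the cost is one sort plus one linear pass, instead of one scan of the whole collection per subject and per subject/predicate pair.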