Re: [Zim-wiki] Time Stamped Text (TST) plugin
On Mon, Jun 10, 2013 at 12:38 AM, NorfCran norfc...@gmail.com wrote:

Jaap,

I am trying to figure out what the best storage solution for small changes is. So far there are two suggestions:

1. a single file with separated patches and integrated timestamps (in progress)
2. a zip archive containing patches / files (the name of each file represents its timestamp)

Solution nr. 1:
- no index is possible in this solution, only a chain of patches
- special characters followed by timestamps are used to separate the patches
- it is easier to append new patches to the end of the file
- the file could be tracked by the main VCS (since it is text)

Solution nr. 2:
- it is possible to keep an index file, which speeds up lookup of patches
- timestamps are stored as single files in the zip file
- the timeline is compressed
- it is easier to list all patches, even in a file browser

Probably the best approach is the first version. Probably the most intensive calculation is to track the history of each piece of the current version. If needed we can cache that info in a separate file.

Let's go with solution 1 for now and make sure the code is flexible enough to change the storage format later if need be.

Additionally, I have a question: is it fine to use other code that is not licensed under the GPL? In particular, this concerns a suitable Python module licensed under the Apache License, Version 2.0: http://code.google.com/p/google-diff-match-patch/

No hard objection to using code with this license, although we should take care to keep licenses per module clear and not mix source files with different licenses. However, it might be easier to stick to the standard library diff module and not add additional dependencies if that can be avoided. Unless of course this module has some complex logic that is beyond the standard library?

The attachments contain:
1. diff_match_patch.py (Python code which generates patches and reconstructs text)
2. page.timeline (proposed storage of changes with timestamps)
3. testing of concept.py (code which uses diff_match_patch.py and serves as a proof of concept)

JK

Regards,
Jaap

___
Mailing list: https://launchpad.net/~zim-wiki
Post to : zim-wiki@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zim-wiki
More help : https://help.launchpad.net/ListHelp
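For readers following the storage discussion, here is a minimal sketch of what "solution 1" (one append-only text file holding a chain of timestamped patches) might look like when built on the standard library difflib only. The record marker "=== ", the JSON patch encoding and all function names are assumptions made for illustration; they are not taken from the attached page.timeline or diff_match_patch.py.

```python
import difflib
import json
import time

SEPARATOR = "=== "  # hypothetical record marker, not from the thread


def make_patch(old, new):
    """Return the non-equal opcodes needed to turn old into new."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [
        [i1, i2, new[j1:j2]]
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old, patch):
    """Apply a patch produced by make_patch to old."""
    out, pos = [], 0
    for i1, i2, replacement in patch:
        out.append(old[pos:i1])   # untouched text before this change
        out.append(replacement)   # text that replaces old[i1:i2]
        pos = i2
    out.append(old[pos:])
    return "".join(out)


def append_record(path, old, new, timestamp=None):
    """Append one timestamped patch to the end of the timeline file."""
    stamp = time.time() if timestamp is None else timestamp
    with open(path, "a", encoding="utf-8") as fh:
        # json.dumps escapes newlines, so every patch stays on one line
        fh.write(f"{SEPARATOR}{stamp}\n{json.dumps(make_patch(old, new))}\n")


def replay(path, until=None):
    """Rebuild the text by applying the chain of patches in order."""
    text = ""
    with open(path, encoding="utf-8") as fh:
        lines = fh.read().splitlines()
    for header, body in zip(lines[0::2], lines[1::2]):
        stamp = float(header[len(SEPARATOR):])
        if until is not None and stamp > until:
            break
        text = apply_patch(text, json.loads(body))
    return text
```

Because there is no index, reading any version means replaying the chain from the start, which is the trade-off listed for solution 1 above. The file stays plain text, so it can still be tracked by the main VCS.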
Re: [Zim-wiki] Time Stamped Text (TST) plugin
On Tue, Jun 4, 2013 at 11:41 PM, NorfCran norfc...@gmail.com wrote:

Yes, I do not want to replace the text files with XML (the concept based on TXT files is the most flexible in my opinion). You are right, attaching additional meta information in a shadow file is the only way.

Since you have asked for use cases, I am going to model some of them separately for a notebook and for a single page, in order to illustrate the use of Time Stamped Text (TST):

*notebook*
1. search the most recently modified pages by a time range
- hierarchical structures like a wiki are dynamic and changes often happen on many pages, so why not preserve this flow, determined by time, in its natural form?
- an ordinary search with many matching pages may be filtered by a time range, which is currently possible only very roughly, based on ctime and mtime in zim's database

*page (main utilization of TST)*
1. highlighting the most recent changes using the fine-grained versions stored in the TST data structure
- the most recent changes are highlighted, for instance in red, and the color fades to black with age (so modifications and revisions are easier to see)
2. the possibility to revert changes by performing undo and redo at any time, even when the text buffer is no longer available
3. time is a natural binder for any other activity performed in parallel to the note-taking process → for instance searching for information on the web (traceable from the browser's history)

One of my projects, attached to this email, explores, among other solutions, a graph data structure capable of storing timestamps per word chunk (words separated by spaces being the lowest granularity). The data structure has been implemented, but it is still not robust enough for all cases. Maybe it can bring some additional understanding and give direction to our discussion.

OK, I would plan something like this:

1/ Come up with a compact patch / diff-like format that we can use to store small changes in a journal file next to the source file
2/ Hook up a plugin to write such a patch file and update it on each auto-save action
3/ Add an API to the plugin such that we can
   a/ query the timestamp for a specific piece of text in a specific file
   b/ request timestamps for each part of the current version of the file
   c/ request the previous / next change for a given file
4/ Extend the search function to use API part a and add a column to the search dialog -- fulfills first use case
5/ Extend the page view to use API part b to highlight recent changes
6/ Hook up the undo/redo-manager
   a/ to use API part c to extend undo into the past
   b/ to send data to the plugin per change as they happen, instead of waiting for auto-save
   -- fulfills 2nd use case

Probably the quickest way to a first result would be if you look into steps 1-3a and I hack in step 4. If that is working I'm willing to help with steps 5 and 6 to integrate into the editor widget.

Regards,
Jaap
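As an illustration of API parts a and b from the plan above, the following sketch derives a timestamp for every word of the current version by replaying older snapshots with difflib, and maps a timestamp to a red-to-black fade for step 5. It is only a sketch under assumptions: the function names, the (timestamp, text) snapshot list and the linear fade are hypothetical and do not describe an existing zim API.

```python
import difflib


def annotate(snapshots):
    """Return [(word, timestamp), ...] for the latest snapshot.

    snapshots is a list of (timestamp, text) pairs, oldest first.
    A word keeps the timestamp of the snapshot that introduced it.
    """
    annotated = []
    for stamp, text in snapshots:
        words = text.split()
        old_words = [w for w, _ in annotated]
        matcher = difflib.SequenceMatcher(None, old_words, words, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                updated.extend(annotated[i1:i2])  # unchanged words keep their age
            else:
                updated.extend((w, stamp) for w in words[j1:j2])  # new or replaced
        annotated = updated
    return annotated


def timestamp_of(annotated, phrase):
    """Part a: newest timestamp among the words of the first match of phrase."""
    target = phrase.split()
    if not target:
        return None
    words = [w for w, _ in annotated]
    for start in range(len(words) - len(target) + 1):
        if words[start:start + len(target)] == target:
            return max(t for _, t in annotated[start:start + len(target)])
    return None


def fade_color(stamp, now, horizon):
    """Map the age of a change to a color between red (new) and black (old)."""
    level = min(1.0, max(0.0, 1.0 - (now - stamp) / horizon))
    return "#%02x0000" % round(255 * level)
```

Replaying every snapshot is the expensive part of this approach, so the result of annotate would be a natural candidate for caching in a separate file.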
Re: [Zim-wiki] Time Stamped Text (TST) plugin
On Sat, Jun 1, 2013 at 6:47 PM, NorfCran norfc...@gmail.com wrote:

Dear Jaap,

thank you for your suggestion. I have already started experimenting with the difflib library, which is capable of generating deltas.

Concerning the data structure with timestamps, it may be worth considering the protocol behind Gobby, which provides a way to collaborate over the network on one or many text files. On top of this feature it also defines a data structure which may accommodate timestamps. Do you think that the protocol could be integrated into ZIM, since it uses GTK? Possibly it may elevate ZIM from a personal wiki to real-time collaborative writing?

The API (http://gobby.0x539.de/trac/wiki/APIReference) offers libinftextgtk, but I am not certain about the complexity resulting from the intended integration. Even though libinftextgtk is implemented in C, it seems to be possible to wrap the C implementation and use it from Python code, according to the following link: http://stackoverflow.com/questions/1942298/wrapping-a-c-library-in-python-c-cython-or-ctypes

The infinote protocol uses storage in the following form:

    <?xml version="1.0"?>
    <inf-text-session>
      <user id="1" name="norfcran_apple" hue="0.203069"/>
      <user id="2" name="norfcran" hue="0.628897996"/>
      <buffer>
        <segment author="1">asdfasdfasd fa sdf as tell df as df as df </segment>
        <segment author="2">asdfa sdf as</segment>
        <segment author="1"> this may be wrong</segment>
        <segment author="2"> df as d f</segment>
      </buffer>
    </inf-text-session>

The segment may be extended with timestamps. This results in timestamped text which does not preserve the history of changes, but on the other hand it brings real-time collaboration on a single file. Additionally, the timestamps could be used for tracking changes over many pages, since time is a natural binder of flow when more than one page is edited simultaneously.

Hope that these suggestions do not turn it into something impossible; so far at least I can see a potentially feasible shortcut to bring in another organizational tool in the form of timestamps.

Thank you in advance for your opinion,
best regards,
JK

Assuming you do not propose to replace the wiki text files with XML files, the only way I see of using this is as a shadow file that sits next to the actual source file. But whether or not that is useful and desirable depends highly on what you want to do with the data. Depending on the use case, a different representation may be more efficient.

The thing to realize is that the API you refer to has its own document management, so it may not be compatible with how zim stores documents in the notebook. We would need to dive in much deeper to understand the technical implications.

So what is the use case / user functionality that you want to build? Is it about synchronisation, about timestamping each and every change to the sources, or both? What user interface would you want to support? Then we can answer what technology is needed to support that kind of feature.

Regards,
Jaap
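To make the shadow-file idea concrete, here is a small sketch of reading a session file in the form quoted above once its segments carry a timestamp. The "time" attribute and its values are assumptions invented for this example; the storage format as quoted in the thread only has an "author" attribute. Parsing uses the standard library xml.etree.ElementTree, not libinftextgtk.

```python
import xml.etree.ElementTree as ET

# Shortened sample; the time="..." attribute is hypothetical.
SESSION = """<inf-text-session>
  <user id="1" name="norfcran_apple" hue="0.203069"/>
  <user id="2" name="norfcran" hue="0.628897996"/>
  <buffer>
    <segment author="2" time="1370105220">asdfa sdf as</segment>
    <segment author="1" time="1370105280"> this may be wrong</segment>
  </buffer>
</inf-text-session>"""


def read_segments(xml_text):
    """Return (author name, timestamp, text) for every segment, in document order."""
    root = ET.fromstring(xml_text)
    names = {user.get("id"): user.get("name") for user in root.iter("user")}
    return [
        (names.get(seg.get("author")), int(seg.get("time")), seg.text or "")
        for seg in root.iter("segment")
    ]


def page_text(xml_text):
    """Concatenate the segments to recover the plain page text."""
    return "".join(text for _, _, text in read_segments(xml_text))
```

As noted in the quoted message, such a file records when each segment was last written but not the history of changes, so it would not cover undo or redo on its own.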
Re: [Zim-wiki] Time Stamped Text (TST) plugin
Dear Jaap,

thank you for your suggestion. I have already started experimenting with the difflib library, which is capable of generating deltas.

Concerning the data structure with timestamps, it may be worth considering the protocol behind Gobby, which provides a way to collaborate over the network on one or many text files. On top of this feature it also defines a data structure which may accommodate timestamps. Do you think that the protocol could be integrated into ZIM, since it uses GTK? Possibly it may elevate ZIM from a personal wiki to real-time collaborative writing?

The API (http://gobby.0x539.de/trac/wiki/APIReference) offers libinftextgtk, but I am not certain about the complexity resulting from the intended integration. Even though libinftextgtk is implemented in C, it seems to be possible to wrap the C implementation and use it from Python code, according to the following link: http://stackoverflow.com/questions/1942298/wrapping-a-c-library-in-python-c-cython-or-ctypes

The infinote protocol uses storage in the following form:

    <?xml version="1.0"?>
    <inf-text-session>
      <user id="1" name="norfcran_apple" hue="0.203069"/>
      <user id="2" name="norfcran" hue="0.628897996"/>
      <buffer>
        <segment author="1">asdfasdfasd fa sdf as tell df as df as df </segment>
        <segment author="2">asdfa sdf as</segment>
        <segment author="1"> this may be wrong</segment>
        <segment author="2"> df as d f</segment>
      </buffer>
    </inf-text-session>

The segment may be extended with timestamps. This results in timestamped text which does not preserve the history of changes, but on the other hand it brings real-time collaboration on a single file. Additionally, the timestamps could be used for tracking changes over many pages, since time is a natural binder of flow when more than one page is edited simultaneously.

Hope that these suggestions do not turn it into something impossible; so far at least I can see a potentially feasible shortcut to bring in another organizational tool in the form of timestamps.

Thank you in advance for your opinion,
best regards,
JK

On 31 May 2013 13:04, Jaap Karssenberg jaap.karssenb...@gmail.com wrote:

JK,

The main problem I see is how you are going to store all that meta-data in a wiki format. If you really want to timestamp a change of e.g. 2 words halfway through a paragraph, you end up with timestamps every other word in your source text.

So you would have to keep a file next to the actual source to track changes as they happen. Kind of keeping a permanent record of the undo stack. Not too hard to hack together if you trigger it to update on each auto-save. Bonus is that you would also get permanent undo. The only technical tidbit is that our real undo stack is in terms of positions in the text buffer, which do not match positions in the source text, so some glue is needed there.

The alternative would be to store patches and figure out the history from that. Most version control systems have an annotate mode to show the history of text, but those are usually per line, not per word. You could, though, figure out history per word from the version history. You would have to commit for every other change though, so probably not for your purpose.

So in conclusion:
1/ Write a plugin that takes a diff of the text in the source file on each auto-save and stores the deltas, timestamped, in a record next to the actual source file.
2/ Connect it to the undo stack, so even after closing a page, you can still undo/redo each delta
3/ Figure out what representation of this data you would want in the user interface - e.g. text annotation, change log, ...

Regards,
Jaap

On Fri, May 31, 2013 at 10:53 AM, NorfCran norfc...@gmail.com wrote:

Hi Jaap and other contributors,

it has been some time since I worked on a project which researched the possibility of synchronizing text with time (down to the level of timestamped words).

Actually I wanted to bring this feature to ZIM, but could not get there (the project in my case tries to use a tree data structure algorithm to solve this issue with respect to timestamps). Anyhow, I recently came across the following application, which inspired me to write this email: https://itunes.apple.com/de/app/armadillo-audio-notes/id532223938?mt=12

It basically provides what I would personally love to see in ZIM as well (apart from the audio, that is another level). Can you see any possibility of targeting this feature? I am personally very much into the idea of time-based text, and possibly other users may start to see its advantages (there would be a chance to track changes based on time through the hierarchy of pages; especially when only some words in long paragraphs change, it is difficult to use a VCS). This is actually the main inspiration for this feature: http://etherpad.org/

I would be interested in your opinions on whether you see some possibility of implementing a time-based text plugin (basically, directions for further focus).

Thank you for your feedback,
best regards,
JK
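The first two points of Jaap's conclusion above (take a diff on each auto-save, keep the timestamped deltas, and allow undo/redo from them even after the page was closed) might be sketched as follows. This is a hedged illustration using difflib on the source text only: the class name DeltaLog and its methods are hypothetical, persistence is left out, and the glue to zim's real undo stack, which works on text buffer positions, is not covered.

```python
import difflib
import time


class DeltaLog:
    """Keeps timestamped deltas between successive saved versions of a page."""

    def __init__(self, text=""):
        self.text = text
        self.deltas = []  # applied deltas, oldest first
        self.undone = []  # deltas available for redo

    def on_autosave(self, new_text, timestamp=None):
        """Record the difference between the last saved text and new_text."""
        if new_text == self.text:
            return
        matcher = difflib.SequenceMatcher(None, self.text, new_text, autojunk=False)
        ops = [
            (i1, j1, self.text[i1:i2], new_text[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"
        ]
        stamp = time.time() if timestamp is None else timestamp
        self.deltas.append((stamp, ops))
        self.undone.clear()  # a new change invalidates the redo history
        self.text = new_text

    @staticmethod
    def _apply(text, ops, forward):
        # Work from the end of the text so earlier positions stay valid.
        for i1, j1, removed, inserted in reversed(ops):
            pos, old, new = (i1, removed, inserted) if forward else (j1, inserted, removed)
            text = text[:pos] + new + text[pos + len(old):]
        return text

    def undo(self):
        """Revert the most recent delta, if any, and return the text."""
        if self.deltas:
            delta = self.deltas.pop()
            self.text = self._apply(self.text, delta[1], forward=False)
            self.undone.append(delta)
        return self.text

    def redo(self):
        """Re-apply the most recently undone delta, if any, and return the text."""
        if self.undone:
            delta = self.undone.pop()
            self.text = self._apply(self.text, delta[1], forward=True)
            self.deltas.append(delta)
        return self.text
```

Each delta stores both the removed and the inserted text, which makes it reversible without keeping full copies of earlier versions.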