Re: [Geany-devel] tagmanager changes
On 12-05-07 08:32 PM, Lex Trotman wrote:
> On 8 May 2012 13:27, Matthew Brush mbr...@codebrainz.ca wrote:
>> On 12-05-07 05:03 PM, Lex Trotman wrote:
>>> On 8 May 2012 02:04, Nick Treleaven nick.trelea...@btinternet.com wrote:
>>>> On 02/05/2012 05:46, Lex Trotman wrote:
>>>>> 4. Ctags parsers
>>>>> Agree with Nick that the parsers are usable, but if we start modifying them to handle local declarations then they will be totally incompatible with the Ctags project so I guess it doesn't matter other than for getting languages we don't currently parse.
>>>>
>>>> ctags c.c already parses local tags
>>>
>>> No it doesn't AFAICT:
>>
>> I'm guessing he's referring to the upstream CTags c.c, which does have an `l` kind for local variables (off by default). See `ctags --list-kinds=C`. I'm not sure if the Geany fork has this, was forked before it was added, or if the guy that wrote TM took it out.
>
> Ok, I haven't looked at Ctags c.c because IIUC from other comments it isn't really mergeable with our c.c. Does upstream c.c use tagmanager, and if so how does it structure scopes? (A good exercise for your compiler :)

1) no
1a) I never could understand CTags scopes, something like a 2-byte value, maybe an index into some array?

Cheers,
Matthew Brush
___
Geany-devel mailing list
Geany-devel@uvena.de
https://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
Re: [Geany-devel] tagmanager changes
On 07/05/2012 18:04, Nick Treleaven wrote:
> On 02/05/2012 05:46, Lex Trotman wrote:
>> Hi All,
>>
>> To summarise since the thread has several subthreads.
>>
>> 1. Tagmanager Understandability
>> a. I generated the doxygen documentation for tagmanager, it works if you set recursive, but didn't help much:
>> - if it's not OOP why does it say things like TMWorkspace is derived from TMWorkObject and similar?
>
> documentation bug IMO

I don't think so. TM uses a more or less OOP-like approach. See for example TMWorkspace:

    typedef struct
    {
        TMWorkObject work_object; /*! The parent work object */
        GPtrArray *global_tags;   /*! Global tags loaded at startup */
        GPtrArray *work_objects;  /*! An array of TMWorkObject pointers */
    } TMWorkspace;

The first field (work_object) is the inherited class, here TMWorkObject. And you'll see numerous places where the code uses such a derived structure as a TMWorkObject -- since it actually is one --, which looks quite like OOP.

Or see tm_workspace.c:44:tm_create_workspace(): it uses tm_work_object_register() to register itself as a new type of work object with a few methods (or vfuncs), and then initializes itself with tm_work_object_init(), etc.

I very well understand Lex's questions about how it actually works, since it brings a second style of OOP programming in C, less well known than GObject -- though of course also less complex, but still (BTW maybe porting to GObject could help?)

>> - it's not clear how it all goes together, the workspace contains global tags and work_objects, or is that files and what's the
>
> workspace work objects are document tags. global tags explained in geany's manual.
>
>> difference between source_file and file_entry?
>
> It doesn't look like tm_file_entry_ is really used.
>
>> - similarly what's the difference between symbol and tag?
>
> tm_symbol_ doesn't seem to be used.
>
>> 2. Ability to expand tagmanager to handle names declared in lexical scopes (not to be confused with struct/class scopes).
>> Here is the example again with some numbers so I can refer to them
>>
>>     {
>>         struct a o;
>>         struct a p;
>>         o. /* struct a members */
>>         {
>>             struct b o;
>>             o. /* struct b members */
>>             p. /* struct a members */
>>         }
>>         o. /* struct a members */
>>         p. /* struct a members (1) */
>>         {
>>             struct c o;
>>             o. /* struct c members */
>>             p. /* struct a members (2) */
>>         }
>>         o. /* struct a members */
>>         p. /* struct a members */
>>     }
>>
>> a. yep, tries use more memory than an array, the usual speed/space, pick one, tradeoff :)
>>
>> b. @Nick, when you say sort by scope then name, are you wanting to have an entry in the table for each declaration of the name?
>
> no
>
>> - If so this makes the array much bigger to search and your search speed depends on size, and it doesn't get you anything, you can't search by scope since you don't know if the name is declared in the scope you are in or an outer scope, compare p at (1) and (2)
>>
>> - having a single name array which then points to scope info for the name is a viable approach (disclosure, that's how I'm doing the symbol table for a language I'm developing) but the table being searched is usually larger than if you have nested arrays. Being smaller these are faster to search if the search isn't O(1), hence the suggestion of trie instead of bsearch.
>
> the gain in simplicity makes a bigger array to search worth it. Remember, global tags aren't included in the workspace array of tagmanager, so we're not talking a big number of tags, and we have O(log n) searching.
>
>> 4. Ctags parsers
>> Agree with Nick that the parsers are usable, but if we start modifying them to handle local declarations then they will be totally incompatible with the Ctags project so I guess it doesn't matter other than for getting languages we don't currently parse.
>
> ctags c.c already parses local tags
>
>> 5.
>> Overloaded symbols
>> Since Colomban's patch, overloaded symbols are now stable for all practical code (I think theoretically it could get confused if the overloads are on the same line, but that's unlikely enough to ignore for human-generated code)
>
> If you're talking about master, I think I still experienced wrong parenting on reparsing when removing lines.
>
>> 6. Moving functionality from symbols.c to tagmanager
>> a. Since it's the 100th anniversary of the Titanic sinking, I think that shuffling the deckchairs is an apt analogy, the functionality has to be somewhere, it's only useful to move it if the destination significantly reduces the effort required.
>
> I don't think I suggested moving functionality. I wondered whether TM could help make symbols.c less complex. I would need to understand the complexity to know whether this is appropriate or not.

Well, what symbols.c tries to do when updating the symbols tree is (as documented above update_tree_tags() BTW):

1) update tags that still exist and remove obsolete ones
2) add the remaining tags (new ones) to the tree

The implementation is a (tiny) little (bit) more
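A minimal sketch of the "single name array which then points to scope info" idea from point 2 of this message, written in Python to match the pseudocode style used elsewhere on the list. The class name, the line numbers, and the use of line ranges to represent lexical scopes are all assumptions made for illustration; this is not tagmanager code. A name resolves to the declaration whose scope encloses the cursor line and is the narrowest.

```python
from collections import defaultdict


class ScopedNames:
    """One table keyed by name; each entry lists the scopes declaring it."""

    def __init__(self):
        # name -> list of (first_line, last_line, type_name)
        self._decls = defaultdict(list)

    def declare(self, name, type_name, first_line, last_line):
        # A declaration is visible from its own line to the end of its block.
        self._decls[name].append((first_line, last_line, type_name))

    def lookup(self, name, line):
        # Keep declarations whose range encloses `line`, then pick the
        # innermost one, i.e. the one with the smallest range.
        enclosing = [d for d in self._decls.get(name, ())
                     if d[0] <= line <= d[1]]
        if not enclosing:
            return None
        return min(enclosing, key=lambda d: d[1] - d[0])[2]


# The nested-block example, with hypothetical line numbers:
# outer block spans lines 1-19, first inner block 5-9, second inner block 12-16.
table = ScopedNames()
table.declare("o", "struct a", 2, 19)
table.declare("p", "struct a", 3, 19)
table.declare("o", "struct b", 6, 9)
table.declare("o", "struct c", 13, 16)

print(table.lookup("o", 7))   # inside the first inner block
print(table.lookup("p", 15))  # p at position (2) in the example
```

The name search itself stays a single lookup; only the short per-name list is scanned for the enclosing scope, which is the trade-off discussed in point 2b.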
Re: [Geany-devel] tagmanager changes
On 02/05/2012 06:46, Lex Trotman wrote:
> [...]
> 3. Background/asynchronous whatever you want to call it parsing
> a. @Colomban, you say that caches are created on first lookup, doesn't that throw work back into the UI thread which could have been done in the parsing thread?

Well, yes, the UI thread gets the load (unless there are background/async lookups too :)). However in my approach there can be any number of caches [1], and those caches can't really be guessed, so... but maybe it's simply a bad approach. Or the caches used could be hard-coded in CTM so they never get created implicitly (quite ugly, but it works).

[1] cache means: an array of tags sorted by a given sort function. Each cache is then used for a particular type of lookup:
 * simple completions (sort by name);
 * scope completion (uses more than one cache, sorted by scope and name, and sorted by type);
 * etc. (I think there is a third one we use somewhere, can't remember what though)
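A minimal sketch of the cache idea in footnote [1], in Python: each cache is an array of tags kept sorted by one sort function, built on the first lookup that needs it and updated on every later insertion. The names `Tag`, `TagCaches`, `by_name` and `by_scope_and_name` are hypothetical and are not taken from CTM or tagmanager.

```python
import bisect
from typing import NamedTuple


class Tag(NamedTuple):
    name: str
    scope: str
    var_type: str


def by_name(tag):
    return tag.name


def by_scope_and_name(tag):
    return (tag.scope, tag.name)


class TagCaches:
    def __init__(self):
        self._tags = []      # unsorted master list
        self._caches = {}    # sort function -> (sorted keys, sorted tags)

    def cache_count(self):
        return len(self._caches)

    def add(self, tag):
        self._tags.append(tag)
        # Every existing cache pays one binary search plus one insertion.
        for key_func, (keys, tags) in self._caches.items():
            key = key_func(tag)
            pos = bisect.bisect_right(keys, key)
            keys.insert(pos, key)
            tags.insert(pos, tag)

    def find(self, key_func, key):
        if key_func not in self._caches:
            # First lookup with this sort function: build the cache now.
            tags = sorted(self._tags, key=key_func)
            self._caches[key_func] = ([key_func(t) for t in tags], tags)
        keys, tags = self._caches[key_func]
        lo = bisect.bisect_left(keys, key)
        hi = bisect.bisect_right(keys, key)
        return tags[lo:hi]


caches = TagCaches()
caches.add(Tag("foo", "", "int"))
caches.add(Tag("bar", "S", "char"))
print(caches.cache_count())            # no cache exists before the first lookup
print(caches.find(by_name, "foo"))     # builds the by-name cache
```

Building the cache inside `find()` is what puts the cost on whichever thread performs the first lookup; creating the known caches up front, as suggested above, would amount to calling `find()` once per sort function from the parsing side.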
Re: [Geany-devel] tagmanager changes
On 08/05/2012 02:03, Lex Trotman wrote:
> On 8 May 2012 02:04, Nick Treleaven nick.trelea...@btinternet.com wrote:
>> [...]
>> It doesn't look like tm_file_entry_ is really used.
>
> Along with your comment below and about project on the next post, sounds like tm code could be reduced significantly. Might help :)

Agreed 100%! If we could cut down TM to remove the code that's not actually used (or not useful for us), it would certainly help a lot towards making it easier to understand.

(BTW I think I remember something about Jiří having done something like it a long time ago)
Re: [Geany-devel] tagmanager changes
On 30/04/2012 18:54, Nick Treleaven wrote:
On 29/04/2012 15:42, Colomban Wendling wrote:

* it supports asynchronous parsing (though not concurrent parsing);

What's the difference? Also, what does it buy us?

What I meant when saying it's asynchronous but not concurrent is that it supports launching the parsers in a separate thread, but it cannot launch several parsers at once. This is because CTags parsers aren't thread-safe (they have a lot of global state and no locks).

What asynchronous parsing gives us is quite simple: no blocking. This means that a slow parser (like e.g. the HTML one on Windows before you changed the regex library) wouldn't lock Geany. Same for project plugins that want to parse thousands of files: this would still take a long time, but Geany would still be usable in the meantime.

Ok. It seems a good idea to support this, but for parsing tags in open documents it doesn't seem particularly useful, as this ought to be fast. The regex problem was unusual IMO and due to an old version of GNU regex.

Right, though a really big file could also be slow to parse.

BTW, how does TagManager do fast searches? I always thought it did a sorting with new attributes each time they changed, so in such situations it's even worse than O(n), isn't it?

For searching, it doesn't do any sorting ever. For adding tags the work object (i.e. document tags) has to be sorted, which I think is good, but also the workspace tags are currently resorted, which I think may be a bad design.

If it is never resorted, it means ALL lookups are done on the same criterion: the name. Right? If so, how could scope completion be fast? It requires a lot of different searches:
1) name search for finding the type of the element to complete;
2) type search for resolving possible typedefs (recursively);
3) scope search for getting the actual results.
Or am I missing something again?

- a multi-cache one that, as its name suggests, maintains multiple caches (sorted tags arrays).
it uses a little more memory and is slower on insertion since it maintains several sorted lists, but a search on an already existing cache is very fast.

Won't this be slow for adding many tags at once? How is this design better than tagmanager?

Well, adding a tag would require a bsearch() on each cache, yes. However, adding tags is mostly done in a separate thread (if async parsing is used), so it can be less of a problem.

I haven't studied your design, but I would prefer that any design is efficient on all threads, so the user's computer can use remaining performance for compiling whatever else they want.

Yeah of course. One possible simple improvement would be to merge only when all tags have been parsed, getting something like you did earlier with the patch on TM global tags loading. But yes, it would still require insertions in multiple caches; but I thought it was worth it since it provided fast search for all caches (see above for why I thought/think multiple caches are useful).

Also, what about global tags? Those can add a lot of tags all at once.

I haven't tried to deal with them yet, but I naively reproduced something like in TM, e.g. another array (cache(s)) for them, so they shouldn't make the whole thing much slower. However I'm not certain that having a completely separate array is really good for searching, but again, I just replicated the TM design here. And again, I must admit I haven't actually implemented this for real yet anyway, so I can't really tell.

And a search is simply a bsearch() (O(log n), right?) given the cache already exists. If the cache doesn't exist, it has to be created so yeah on the first search it's slow.

If it can be slow enough for the user to notice this is probably bad.

As said in another mail, if it's a problem the required caches can be hard-coded so they are created anyway. It doesn't seem a clean thing to me, but that's very well doable.
It's not strictly needed, but it makes some memory management easier, and fits well with GTK memory management. And this last point helps a lot to maintain the GtkTreeStore in src/symbols.c, now that tags are updated and not removed. But that's not new, I already added this in TM.

Yes. Is the reason the tree should be updated and not recreated to preserve fold states and scrollbar position? In fact I'm a bit confused

Yes, that's the most prominent reason. It's to:
* keep fold state
* keep selection
* keep scroll position
* avoid overall flickering
Basically it tries to avoid any visible side effects of replacing the tree.

about this, how come old tags are still accessed after reparsing the document with new tags?

That's the magic of reference counting :) The GtkListStore actually holds a reference to the tag, so they can still be used after TM dropped them. But anyway we always update the symbol list to have a reference to the current TM tag.

BTW, what don't you understand in
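A minimal sketch of the three lookups that scope completion needs, as listed earlier in this message (name search, typedef resolution, scope search), in Python. The `Tag` fields and the sample tags are hypothetical. As a simplification, typedefs are resolved through the name-sorted array rather than through a separate type-sorted cache, and the sorted arrays are rebuilt on every call where a real implementation would keep them.

```python
import bisect
from typing import NamedTuple


class Tag(NamedTuple):
    name: str
    scope: str     # enclosing struct name, "" at file level
    kind: str      # "variable", "typedef", "struct" or "member"
    type_ref: str  # declared type (variables) or aliased type (typedefs)


def find_all(sorted_tags, keys, key):
    """Binary search returning every tag whose sort key equals `key`."""
    lo = bisect.bisect_left(keys, key)
    hi = bisect.bisect_right(keys, key)
    return sorted_tags[lo:hi]


def complete_members(tags, variable):
    by_name = sorted(tags, key=lambda t: t.name)
    name_keys = [t.name for t in by_name]
    by_scope = sorted(tags, key=lambda t: t.scope)
    scope_keys = [t.scope for t in by_scope]

    # 1) name search: find the declared type of the element to complete
    decls = [t for t in find_all(by_name, name_keys, variable)
             if t.kind == "variable"]
    if not decls:
        return []
    type_name = decls[0].type_ref

    # 2) type search: follow typedefs until a non-alias type is reached
    seen = set()
    while type_name not in seen:  # guards against typedef cycles
        seen.add(type_name)
        aliases = [t for t in find_all(by_name, name_keys, type_name)
                   if t.kind == "typedef"]
        if not aliases:
            break
        type_name = aliases[0].type_ref

    # 3) scope search: collect the members declared in that type
    return sorted(t.name for t in find_all(by_scope, scope_keys, type_name)
                  if t.kind == "member")


tags = [
    Tag("window_s", "", "struct", ""),
    Tag("width", "window_s", "member", "int"),
    Tag("height", "window_s", "member", "int"),
    Tag("Window", "", "typedef", "window_s"),
    Tag("WindowRef", "", "typedef", "Window"),
    Tag("win", "", "variable", "WindowRef"),
]
print(complete_members(tags, "win"))
```

Steps 1 and 2 search by name while step 3 searches by scope, which is why a single array sorted only by name cannot serve all three with a binary search.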
Re: [Geany-devel] tagmanager changes
On 12-05-08 05:44 AM, Colomban Wendling wrote:
> On 08/05/2012 02:03, Lex Trotman wrote:
>> On 8 May 2012 02:04, Nick Treleaven nick.trelea...@btinternet.com wrote:
>>> [...]
>>> It doesn't look like tm_file_entry_ is really used.
>>
>> Along with your comment below and about project on the next post, sounds like tm code could be reduced significantly. Might help :)
>
> Agreed 100%! If we could cut down TM to remove the code that's not actually used (or not useful for us), it would certainly help a lot towards making it easier to understand.
>
> (BTW I think I remember something about Jiří having done something like it a long time ago)

+1000

Also, it wouldn't hurt to make the file system structure and coding standard/style match the other Geany files. For example:

    tagmanager/tm_*.[ch]  - delete include/ dir, maybe remove tm_ prefix
    tagmanager/mio/*
    tagmanager/ctags/*    - all non-tm files here

And then we could run the files through GNU Indent or a similar program to match Geany's coding style.

Cheers,
Matthew Brush
Re: [Geany-devel] tagmanager changes
On 08/05/2012 14:12, Colomban Wendling wrote:
> On 30/04/2012 19:07, Nick Treleaven wrote:
>> On 29/04/2012 15:47, Colomban Wendling wrote:
>>> On 26/04/2012 18:53, Nick Treleaven wrote:
>>>> On 26/04/2012 16:02, Nick Treleaven wrote:
>>>>> On 24/04/2012 22:31, Colomban Wendling wrote:
>>>>>> * it uses the same tag parsers tagmanager used, in ctagsmanager/ctags;
>>>>>
>>>>> BTW this is a good idea to clearly separate CTags from tagmanager.
>>>>
>>>> If this change can be applied separately, perhaps it could be merged into master.
>>>
>>> It should be quite easy -- though it still won't be a vanilla CTags of course, my own isn't either (yet?).
>>
>> I just thought it was a separate change from the TM rewrite.
>
> It could very well be I think, basically it only changes the directory structure a little. I'll try to replicate this on TM.

Here we go: https://github.com/b4n/geany/commits/tm/tree-refactoring

Looks reasonable?

The Autotools and Waf build systems should be OK (tested running), but I haven't tested the makefile.win32s; so if somebody could check them it'd be awesome.

Cheers,
Colomban
Re: [Geany-devel] gtk_separator_tool_item_new() patch
Hi,

On 29/04/2012 20:26, Dimitar Zhekov wrote:
> Hi again, and excuse me for stuffing the list.
>
> Actually there is 1/2 error. The plugin toolbar items are inserted improperly, but added to plugin_items list in the right order. So using Customize Toolbar and adding/removing items or otherwise changing them fixes the order.
>
> Patch attached. Not in git format, sorry.

If I read the thing correctly, the patch is wrong because it could mix up tool items from different plugins if they aren't added at the same time, wouldn't it?

But you're right that there is a problem. Currently, it creates:

    | Plugin_1_Item_2 Plugin_1_Item_1 | Plugin_2_Item_1 | Quit

and should create

    | Plugin_1_Item_1 Plugin_1_Item_2 | Plugin_2_Item_1 | Quit

However with your patch, if items are added in the order Plugin_1_Item_1, Plugin_2_Item_1, Plugin_1_Item_2, it would give:

    | Plugin_1_Item_1 | Plugin_2_Item_1 Plugin_1_Item_2 | Quit

which is also wrong (more wrong, if I may say).

Getting that right seems a bit harder and would require knowing the last item added by a given plugin. Maybe tagging the widget with the plugin it belongs to, and then searching for the first non-matching one, would do the trick:

    def get_insert_pos(plugin):
        pos = toolbar.get_default_insert_pos()
        if plugin.autosep:
            # find the last item belonging to @plugin
            while pos < toolbar.get_n_items():
                item = toolbar.get_item(pos)
                if item.get_data(plugin) != plugin:
                    break
                pos += 1
        return pos

    def add_item(plugin, item):
        pos = get_insert_pos(plugin)
        if not plugin.autosep:
            plugin.autosep = create_sep()
            toolbar.insert(plugin.autosep, pos)
            pos += 1
        item.set_data(plugin, plugin)
        toolbar.insert(item, pos)

Maintaining an index doesn't seem like a good idea since it would be one more thing to keep track of, and I don't think adding a tool item is performance-critical, so the possible overhead of finding the position (if there is already an item) should not be a problem.

Thoughts?
Cheers,
Colomban
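A runnable sketch of the "tag each item with its plugin" idea from the message above, using a plain Python list in place of the GTK toolbar. Everything here is an assumption made for illustration: the `Toolbar` class, the (owner, label) tuples, and in particular the choice to start scanning at the plugin's own separator rather than at the default insert position. It is not Geany's plugin API.

```python
class Toolbar:
    """Stand-in for a toolbar: an ordered list of (owner, label) pairs."""

    def __init__(self):
        # The built-in separator and Quit button have no owning plugin.
        self.items = [(None, "|"), (None, "Quit")]
        self.separators = set()  # plugins that already own a separator

    def default_insert_pos(self):
        # New plugin groups go just before the separator preceding Quit.
        return len(self.items) - 2

    def add_item(self, plugin, label):
        if plugin not in self.separators:
            # First item of this plugin: create its separator first.
            pos = self.default_insert_pos()
            self.items.insert(pos, (plugin, "|"))
            self.separators.add(plugin)
            pos += 1
        else:
            # Skip past the plugin's separator and its existing items.
            pos = self.items.index((plugin, "|")) + 1
            while pos < len(self.items) and self.items[pos][0] == plugin:
                pos += 1
        self.items.insert(pos, (plugin, label))

    def layout(self):
        return " ".join(label for _, label in self.items)


toolbar = Toolbar()
toolbar.add_item("plugin1", "Plugin_1_Item_1")
toolbar.add_item("plugin2", "Plugin_2_Item_1")
toolbar.add_item("plugin1", "Plugin_1_Item_2")
print(toolbar.layout())
```

With the interleaved order that breaks the patch, the second item of the first plugin still lands inside that plugin's own group, and no per-plugin index has to be maintained.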
Re: [Geany-devel] tagmanager changes
On 8 May 2012 22:31, Colomban Wendling lists@herbesfolles.org wrote:
> On 07/05/2012 18:04, Nick Treleaven wrote:
>> On 02/05/2012 05:46, Lex Trotman wrote:
>>> Hi All,
>>>
>>> To summarise since the thread has several subthreads.
>>>
>>> 1. Tagmanager Understandability
>>> a. I generated the doxygen documentation for tagmanager, it works if you set recursive, but didn't help much:
>>> - if it's not OOP why does it say things like TMWorkspace is derived from TMWorkObject and similar?
>>
>> documentation bug IMO
>
> I don't think so. TM uses a more or less OOP-like approach. See for example TMWorkspace:
>
>     typedef struct
>     {
>         TMWorkObject work_object; /*! The parent work object */
>         GPtrArray *global_tags;   /*! Global tags loaded at startup */
>         GPtrArray *work_objects;  /*! An array of TMWorkObject pointers */
>     } TMWorkspace;
>
> The first field (work_object) is the inherited class, here TMWorkObject. And you'll see numerous places where the code uses such a derived structure as a TMWorkObject -- since it actually is one --, which looks quite like OOP.
>
> Or see tm_workspace.c:44:tm_create_workspace(): it uses tm_work_object_register() to register itself as a new type of work object with a few methods (or vfuncs), and then initializes itself with tm_work_object_init(), etc.
>
> I very well understand Lex's questions about how it actually works, since it brings a second style of OOP programming in C, less well known than GObject -- though of course also less complex, but still (BTW maybe porting to GObject could help?)

Thanks Colomban, that helps :)

[...]

Cheers
Lex
Re: [Geany-devel] tagmanager changes
Hi All,

I took a step back and got some numbers (what! hard evidence in the discussion? :)

Using ctags, including locals in the tags generated from the Geany source slightly more than doubled the number of tags, and for some C++ I have, it gave nearly four times the number.

For Geany bsearching a sorted array, four times the size is only two more iterations, so that doesn't matter :)

But as Colomban pointed out, with real-time tag generation on, *most* lookups will be from the parser, not Geany. This is needed to see if the name it found is a type so it can tell the statement is a declaration. (c.c does not do this at the moment since it only looks at top level statements and all top level statements are declarations in {} languages). That workload *will* be proportional to the file size. And the number of insertions will also be proportional to the file size. This is a big change from the current situation where parsing occurs rarely.

For a simple array structure then, whilst parsing, either the search is linear, or the array is re-sorted on each insert; neither is an attractive prospect for larger files with 4* the number of tags.

Also, on examining the tags file ctags produced, I found that local variable entries did not actually get parsed, i.e. there was no way to go from a local variable to its type. All that happened was that an occurrence of the name was recorded. Also no information about lexical scope is recorded. So even updating to ctags c.c wouldn't give the information on local variables we need to handle the example I posted earlier.

So although improving tagmanager/Colomban manager is still worthwhile, significantly more work is needed on the parsers as well. And in terms of handling local variables properly and allowing contextual completion, ctags looks like a dead end.

Cheers
Lex
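The "only two more iterations" figure follows from log2(4n) = log2(n) + 2. A small Python sketch that counts worst-case binary-search probes confirms it; the tag counts used are made-up round numbers, not the measurements mentioned in the message.

```python
def bsearch_steps(n):
    """Worst-case number of probes for a binary search over n sorted entries."""
    steps = 0
    while n > 0:
        n //= 2      # each probe halves the remaining range
        steps += 1
    return steps


base = 1000          # hypothetical tag count without locals
print(bsearch_steps(base))
print(bsearch_steps(4 * base))
```

A linear search, by contrast, scales directly with the array size, so four times the tags means four times the probes per lookup. That is the cost the parser-side lookups would pay against an array that is left unsorted while parsing.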