Re: [Geany-devel] tagmanager changes

2012-05-08 Thread Matthew Brush

On 12-05-07 08:32 PM, Lex Trotman wrote:

On 8 May 2012 13:27, Matthew Brush <mbr...@codebrainz.ca> wrote:

On 12-05-07 05:03 PM, Lex Trotman wrote:


On 8 May 2012 02:04, Nick Treleaven <nick.trelea...@btinternet.com> wrote:


On 02/05/2012 05:46, Lex Trotman wrote:




4. Ctags parsers

Agree with Nick that the parsers are usable, but if we start modifying
them to handle local declarations then they will be totally
incompatible with the Ctags project so I guess it doesn't matter other
than for getting languages we don't currently parse.




ctags c.c already parses local tags



No it doesn't AFAICT:



I'm guessing he's referring to the upstream CTags c.c, which does have an
`l` kind for local variables (off by default). See `ctags --list-kinds=C`.
I'm not sure whether the Geany fork has this, was forked before it was
added, or whether the guy who wrote TM took it out.



Ok, I haven't looked at Ctags c.c because IIUC from other comments it
isn't really mergeable with our c.c.

Does upstream c.c use tagmanager, and if so how does it structure
scopes?  (A good exercise for your compiler :)



1) no
1a) I never could understand CTags scopes, something like a 2-byte 
value, maybe an index into some array?


Cheers,
Matthew Brush


___
Geany-devel mailing list
Geany-devel@uvena.de
https://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel


Re: [Geany-devel] tagmanager changes

2012-05-08 Thread Colomban Wendling
Le 07/05/2012 18:04, Nick Treleaven a écrit :
 On 02/05/2012 05:46, Lex Trotman wrote:
 Hi All,

 To summarise since the thread has several subthreads.

 1. Tagmanager Understandability

 a. I generated the doxygen documentation for tagmanager, it works if
 you set recursive, but didn't help much:

 - if it's not OOP why does it say things like TMWorkspace is derived
 from TMWorkObject and similar?
 
 documentation bug IMO

I don't think so.  TM uses a more or less OOP-like approach.  See for
example TMWorkspace:

typedef struct
{
    TMWorkObject work_object; /*! The parent work object */
    GPtrArray *global_tags; /*! Global tags loaded at startup */
    GPtrArray *work_objects; /*! An array of TMWorkObject pointers */
} TMWorkspace;

The first field (work_object) is the inherited class, here
TMWorkObject.  And you'll see numerous places where the code uses such a
derived structure as a TMWorkObject -- since it is one actually --,
which looks quite like OOP.

Or see tm_workspace.c:44:tm_create_workspace():  it uses
tm_work_object_register() to register itself as a new type of work
object with a few methods (or vfuncs), and then initializes itself with
tm_work_object_init(), etc.

I very well understand Lex's questions about how it actually works,
since it brings in a second OOP style of programming in C, less well
known than GObject -- though of course less complex too, but still (BTW
maybe porting to GObject could help?)

 - it's not clear how it all goes together, the workspace contains
 global tags and work_objects, or is that files and what's the
 
 workspace work objects are document tags. global tags explained in
 geany's manual.
 
 difference between source_file and file_entry?
 
 It doesn't look like tm_file_entry_ is really used.
 

 - similarly what's the difference between symbol and tag?
 
 tm_symbol_ doesn't seem to be used.
 

 2. Ability to expand tagmanager to handle names declared in lexical
 scopes (not to be confused with struct/class scopes).  Here is the
 example again with some numbers so I can refer to them

 {   struct a o; struct a p;
     o./* struct a members */
     {   struct b o;
         o./* struct b members */
         p./* struct a members */
     }
     o./* struct a members */
     p./* struct a members (1) */
     {   struct c o;
         o./* struct c members */
         p./* struct a members (2) */
     }
     o./* struct a members */
     p./* struct a members */
 }

 a. yep, tries use more memory than an array, the usual speed/space,
 pick one, tradeoff :)

 b. @Nick, when you say sort by scope then name, are you wanting to
 have an entry in the table for each declaration of the name?
 
 no
 

 - If so this makes the array much bigger to search and your search
 speed depends on size, and it doesn't get you anything: you can't
 search by scope since you don't know if the name is declared in the
 scope you are in or an outer scope (compare p at (1) and (2)).

 - having a single name array which then points to scope info for the
 name is a viable approach (disclosure, that's how I'm doing the symbol
 table for a language I'm developing) but the table being searched is
 usually larger than if you have nested arrays.  Being smaller these
 are faster to search if the search isn't O(1), hence the suggestion of
 trie instead of bsearch.
 
 the gain in simplicity makes a bigger array to search worth it.
 Remember, global tags aren't included in the workspace array of
 tagmanager, so we're not talking a big number of tags, and we have O(log
 n) searching.
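
A minimal sketch of the "single name array which points to scope info" idea applied to the nested-block example above. Everything here is an assumption made for illustration: the record layout (name, scope start, scope end, type), the character offsets, and the `resolve` helper are hypothetical and are not tagmanager code. The name search is a binary search on one sorted array; the innermost enclosing scope is then picked among the few entries sharing that name.

```python
import bisect

# Hypothetical records: (name, scope_start, scope_end, declared_type).
# The offsets stand for the extent of the block that owns the declaration.
DECLS = sorted([
    ("o", 0, 100, "struct a"),   # outer block
    ("p", 0, 100, "struct a"),
    ("o", 20, 40, "struct b"),   # first nested block
    ("o", 60, 80, "struct c"),   # second nested block
])
NAMES = [d[0] for d in DECLS]    # parallel key array used by bisect


def resolve(name, pos):
    """Return the type of `name` visible at offset `pos`, or None."""
    lo = bisect.bisect_left(NAMES, name)     # O(log n) search by name
    hi = bisect.bisect_right(NAMES, name)
    # Keep the declarations whose scope contains pos, then pick the
    # innermost one, i.e. the scope that starts last.
    visible = [d for d in DECLS[lo:hi] if d[1] <= pos < d[2]]
    if not visible:
        return None
    return max(visible, key=lambda d: d[1])[3]
```

With these made-up offsets, `resolve("o", 30)` gives `"struct b"`, while `resolve("p", 30)` and `resolve("p", 70)` both give `"struct a"`, which is the behaviour the example asks for. The sketch ignores declaration order inside a block.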
 
 
 4. Ctags parsers

 Agree with Nick that the parsers are usable, but if we start modifying
 them to handle local declarations then they will be totally
 incompatible with the Ctags project so I guess it doesn't matter other
 than for getting languages we don't currently parse.
 
 ctags c.c already parses local tags
 

 5. Overloaded symbols

 Since Colomban's patch, overloaded symbols are now stable for all
 practical code (I think theoretically it could get confused if the
 overloads are on the same line but that's unlikely enough to ignore for
 human generated code)
 
 If you're talking about master, I think I still experienced wrong
 parenting on reparsing when removing lines.
 
 6. Moving functionality from symbols.c to tagmanager

 a. Since its the 100th anniversary of the Titanic sinking, I think
 that shuffling the deckchairs is an apt analogy, the functionality
 has to be somewhere, its only useful to move it if the destination
 significantly reduces the effort required.
 
 I don't think I suggested moving functionality. I wondered whether TM
 could help make symbols.c less complex. I would need to understand the
 complexity to know whether this is appropriate or not.

Well, what symbols.c tries to do when updating the symbols tree is (as
documented above update_tree_tags() BTW):

1) update tags that still exist and remove obsolescent ones
2) add the remaining tags (new ones) to the tree

The implementation is a (tiny) little (bit) more 

Re: [Geany-devel] tagmanager changes

2012-05-08 Thread Colomban Wendling
Le 02/05/2012 06:46, Lex Trotman a écrit :
 [...]
 
 3. Background/asynchronous whatever you want to call it parsing
 
 a. @Colomban, you say that caches are created on first lookup, doesn't
 that throw work back into the UI thread which could have been done in
 the parsing thread?

Well, yes, the UI thread gets the load (unless there is background/async
lookups too :)).  However in my approach there can be any number of
caches [1], and those caches can't really be guessed, so... but maybe
it's simply a bad approach.  Or the caches in use could be hard-coded in
CTM so they never get created implicitly (quite ugly, but it works).

[1] cache means:  an array of tags sorted by a given sort function.  Each
cache is then used for a particular type of lookups:
* simple completions (sort by name);
* scope completion (uses more than one cache, sorted by scope and name,
and sorted by type);
* etc. (I think there is a third one we use somewhere, can't remember
what though)
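
A minimal sketch of what "a cache is an array of tags sorted by a given sort function" could look like, assuming tags are plain dictionaries. The `TagCaches` class, its methods and the two key functions are hypothetical names chosen for illustration, not the CTM API. A cache is built on the first lookup that needs it (the cost that lands on the UI thread), and every later insertion pays one binary search per existing cache.

```python
import bisect


class TagCaches:
    """Hypothetical store: one sorted array of tags per sort function."""

    def __init__(self):
        self.tags = []
        self.caches = {}    # key function -> (sorted keys, sorted tags)

    def add(self, tag):
        self.tags.append(tag)
        # Keep every existing cache sorted: one binary search per cache.
        for key, (keys, tags) in self.caches.items():
            k = key(tag)
            i = bisect.bisect_right(keys, k)
            keys.insert(i, k)
            tags.insert(i, tag)

    def lookup(self, key, value):
        if key not in self.caches:           # first lookup builds the cache
            tags = sorted(self.tags, key=key)
            self.caches[key] = ([key(t) for t in tags], tags)
        keys, tags = self.caches[key]
        lo = bisect.bisect_left(keys, value)
        hi = bisect.bisect_right(keys, value)
        return tags[lo:hi]


def by_name(tag):
    return tag["name"]


def by_scope_and_name(tag):
    return (tag["scope"], tag["name"])
```

Hard-coding the caches, as suggested above, would amount to calling `lookup()` once per key function right after creating the store, so that no cache is ever created implicitly later.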


Re: [Geany-devel] tagmanager changes

2012-05-08 Thread Colomban Wendling
Le 08/05/2012 02:03, Lex Trotman a écrit :
 On 8 May 2012 02:04, Nick Treleaven nick.trelea...@btinternet.com wrote:
 [...]

 It doesn't look like tm_file_entry_ is really used.
 
 Along with your comment below and about project on the next post,
 sounds like tm code could be reduced significantly.  Might help :)

Agreed 100%!  If we could cut down TM to remove the code that's
actually not used (or not useful for us), it would certainly help a lot
towards making it easier to understand.  (BTW I think I remember
something about Jiří having done something like it a long time ago)


Re: [Geany-devel] tagmanager changes

2012-05-08 Thread Colomban Wendling
Le 30/04/2012 18:54, Nick Treleaven a écrit :
 On 29/04/2012 15:42, Colomban Wendling wrote:
 * it supports asynchronous parsing (though not concurrent parsing);

 What's the difference? Also, what does it buy us?

 What I meant when saying it's asynchronous but not concurrent is that it
 supports launching the parsers in a separate thread, but it cannot
 launch several parsers at once.  This is because CTags parsers aren't
 thread-safe (have a lot of global states and no locks).

 What asynchronous parsing gives us is quite simple: no blocking.  This
 means that a slow parser (like e.g. the HTML one on Windows before you
 changed the regex library) wouldn't lock Geany.  Same for project
 plugins that want to parse thousands of files:  this would still take a
 long time, but Geany would be still usable in the meantime.
 
 Ok. It seems a good idea to support this, but for parsing tags in open
 documents it doesn't seem particularly useful, as this ought to be fast.
 The regex problem was unusual IMO and due to an old version of GNU regex.

Right, though a really big file could also be slow to parse.

 BTW, how does TagManager do fast searches?  I always thought it did a
 sorting with new attributes each time they changed, so in such
 situations it's even worse than O(n), isn't it?
 
 For searching, it doesn't do any sorting ever. For adding tags the work
 object (i.e. document tags) have to be sorted, which I think is good,
 but also the workspace tags are currently resorted, which I think may be
 a bad design.

If it is never resorted, it means ALL lookups are done on the same
criterion: the name.  Right?  If so, how could scope completion be fast?
It requires several different searches:

1) name search for finding the type of the element to complete;
2) type search for resolving possible typedefs (recursively);
3) scope search for getting the actual results.

Or am I missing something again?
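
The three searches can be sketched as follows. This is only an illustration under stated assumptions: the tag records, the "typedef" marker and the `complete_members` helper are hypothetical, and plain dictionaries stand in for whatever sorted arrays or caches a real tag manager would use. The point is that each step looks tags up by a different key.

```python
from collections import defaultdict

# Hypothetical tag records: (name, scope, type).  A typedef is modelled as
# a tag whose scope is "typedef" and whose type is the aliased type.
TAGS = [
    ("win", "", "Window"),
    ("Window", "typedef", "struct _Window"),
    ("title", "struct _Window", "char *"),
    ("width", "struct _Window", "int"),
]

by_name = {}
by_scope = defaultdict(list)
for name, scope, type_ in TAGS:
    by_name[name] = (scope, type_)
    by_scope[scope].append(name)


def complete_members(variable):
    """Return the member names to offer after typing `variable.`"""
    # 1) name search: find the declared type of the variable
    if variable not in by_name:
        return []
    type_ = by_name[variable][1]
    # 2) type search: follow typedefs until a non-typedef type is reached
    seen = set()
    while (type_ in by_name and by_name[type_][0] == "typedef"
           and type_ not in seen):
        seen.add(type_)
        type_ = by_name[type_][1]
    # 3) scope search: list the tags whose scope is the resolved type
    return sorted(by_scope.get(type_, []))
```

Here `complete_members("win")` walks `win` to `Window` to `struct _Window` and returns `["title", "width"]`. With a single array sorted by name only, step 3 would have to scan every tag.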

 - a multi-cache one that, as its name suggests, maintains multiple
   caches (sorted tags arrays).  it uses a little more memory and is
   slower on insertion since it maintains several sorted lists, but a
   search on an already existing cache is very fast.

 Won't this be slow for adding many tags at once? How is this design
 better than tagmanager?

 Well, adding a tag would require a bsearch() on each cache, yes.
 However, adding tags is mostly done in a separate thread (if async
 parsing is used), so it can be less of a problem.
 
 I haven't studied your design, but I would prefer that any design is
 efficient on all threads, so the user's computer can use remaining
 performance for compiling and whatever else they want.

Yeah of course.  One possible simple improvement would be to merge only
when all tags got parsed, getting something like you did earlier with
the patch on TM global tags loading.

But yes, it would still require insertions in multiple caches; but I
thought it was worth it since it provided fast search for all caches (see
above for why I thought/think multiple caches are useful).

 Also, what about global tags? Those can add a lot of tags all at once.

I haven't tried to deal with them yet, but I naively reproduced something
like in TM, e.g. another array (cache(s)) for them, so they shouldn't
make the whole thing much slower.  However I'm not certain that having a
completely separate array is really good for searching, but again, I
just replicated the TM design here.

And again, I must admit I haven't actually implemented this for real yet
anyway, so I can't really tell.

 And a search is simply a bsearch() (O(log n), right?) given the cache
 already exists.  If the cache doesn't exist, it has to be created so
 yeah on the first search it's slow.
 
 If it can be slow enough for the user to notice this is probably bad.

As said in another mail, if it's a problem the required caches can be
hard-coded so they are created anyway.  It doesn't seem like a clean
thing to me, but that's very well doable.

 It's not strictly needed, but it makes some memory management easier,
 and fits well with GTK memory management.  And this last point helps a
 lot to maintain the GtkTreeStore on src/symbols.c, now tags are updated
 and not removed.
 But that's not new, I already added this in TM.
 
 Yes. Is the reason the tree should be updated and not recreated to
 preserve fold states and scrollbar position? In fact I'm a bit confused

Yes, that's the most prominent reason.  It's to:

* keep fold state
* keep selection
* keep scroll position
* avoid overall flickering

basically it tries to avoid any visible side effects of replacing the tree.

 about this, how come old tags are still accessed after reparsing the
 document with new tags?

That's the magic of reference counting :)  The GtkTreeStore actually
holds a reference to the tag, so they can still be used after TM dropped
them.  But anyway we always update the symbol list to have a reference
to the current TM tag.

 BTW, what don't you understand in 

Re: [Geany-devel] tagmanager changes

2012-05-08 Thread Matthew Brush

On 12-05-08 05:44 AM, Colomban Wendling wrote:

Le 08/05/2012 02:03, Lex Trotman a écrit :

On 8 May 2012 02:04, Nick Treleaven <nick.trelea...@btinternet.com> wrote:

[...]

It doesn't look like tm_file_entry_ is really used.


Along with your comment below and about project on the next post,
sounds like tm code could be reduced significantly.  Might help :)


Agreed 100%!  If we could cut down TM to remove the code that's
actually not used (or not useful for us), it would certainly help a lot
towards making it easier to understand.  (BTW I think I remember
something about Jiří having done something like it a long time ago)


+1000

Also, it wouldn't hurt to make the file system structure and coding
standard/style match those of the other Geany files. For example:


tagmanager/tm_*.[ch] - delete include/ dir, maybe remove tm_ prefix
tagmanager/mio/*
tagmanager/ctags/* - all non-tm files here

And then we could run the files through GNU Indent or similar program to 
match Geany's coding style.


Cheers,
Matthew Brush


Re: [Geany-devel] tagmanager changes

2012-05-08 Thread Colomban Wendling
Le 08/05/2012 14:12, Colomban Wendling a écrit :
 Le 30/04/2012 19:07, Nick Treleaven a écrit :
 On 29/04/2012 15:47, Colomban Wendling wrote:
 Le 26/04/2012 18:53, Nick Treleaven a écrit :
 On 26/04/2012 16:02, Nick Treleaven wrote:
 On 24/04/2012 22:31, Colomban Wendling wrote:
 * it uses the same tag parsers tagmanager used, in ctagsmanager/ctags;

 BTW this is a good idea to clearly separate CTags from tagmanager. If
 this change can be applied separately, perhaps it could be merged into
 master.

 It should be quite easy -- though it won't still be a vanilla CTags of
 course, my own isn't either (yet?).

 I just thought it was a separate change from the TM rewrite.
 
 It could very well be I think, basically it only changes the directory
 structure a little.  I'll try to replicate this on TM.

Here we go: https://github.com/b4n/geany/commits/tm/tree-refactoring
Looks reasonable?

The Autotools and Waf build systems should be OK (tested and running), but
I haven't tested the makefile.win32s; so if somebody could check them
it'd be awesome.

Cheers,
Colomban


Re: [Geany-devel] gtk_separator_tool_item_new() patch

2012-05-08 Thread Colomban Wendling
Hi,

Le 29/04/2012 20:26, Dimitar Zhekov a écrit :
 Hi again, and excuse me for stuffing the list.
 
 Actually there is 1/2 error. The plugin toolbar items are inserted
 improperly, but added to plugin_items list in the right order. So
 using Customize Toolbar and adding/removing items or otherwise
 changing them fixes the order.
 
 Patch attached. Not in git format, sorry.

If I read the thing correctly, the patch is wrong because it could
mix up tool items from different plugins if they aren't added at
the same time, couldn't it?

But you're right that there is a problem.  Currently, it creates:

| Plugin_1_Item_2 Plugin_1_Item_1 | Plugin_2_Item_1 | Quit

and should create

| Plugin_1_Item_1 Plugin_1_Item_2 | Plugin_2_Item_1 | Quit

However with your patch, if items are added in the order
Plugin_1_Item_1, Plugin_2_Item_1, Plugin_1_Item_2, it would give:

| Plugin_1_Item_1 | Plugin_2_Item_1 Plugin_1_Item_2 | Quit

Which is also wrong (even more wrong, if I may say).


Getting that right seems a bit harder and would require knowing
what the last item added by a given plugin is.  Maybe tagging the widget
with the plugin it belongs to, and then searching for the first
non-matching one would do the trick:

def get_insert_position(plugin):
    pos = toolbar.get_default_insert_pos()

    if plugin.autosep:
        # find the last item belonging to @plugin
        while pos < toolbar.get_n_items():
            item = toolbar.get_item(pos)
            if item.get_data(plugin) != plugin:
                break
            pos += 1  # still one of ours, keep walking

    return pos

def add_item(plugin, item):
    pos = get_insert_position(plugin)

    if not plugin.autosep:
        # first item of this plugin: add its separator first
        plugin.autosep = create_sep()
        toolbar.insert(plugin.autosep, pos)
        pos += 1

    # tag the item so later insertions can recognise its owner
    item.set_data(plugin, plugin)
    toolbar.insert(item, pos)

Maintaining an index doesn't really seem like a good idea since it would
be yet another thing to keep track of, and I don't think that adding a
tool item is performance-critical, so the possible overhead of finding the
position (if there is already an item) should not be a problem.
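
To check the idea against the orderings listed above, here is a small runnable model. It is a sketch only: the toolbar is a plain Python list of (owner, label) pairs, not a GtkToolbar, and all names are hypothetical. It also makes two assumptions that the pseudocode leaves open: the default insert position is just before the trailing separator, and the scan starts at the plugin's own separator.

```python
SEP = "|"


def add_item(toolbar, plugin, label):
    """Insert `label` after the last item already owned by `plugin`."""
    sep = next((i for i, (owner, text) in enumerate(toolbar)
                if owner == plugin and text == SEP), None)
    if sep is None:
        # First item from this plugin: add its separator at the default
        # position, assumed here to be just before the trailing separator.
        pos = toolbar.index((None, SEP))
        toolbar.insert(pos, (plugin, SEP))
        pos += 1
    else:
        # Walk past every item tagged with this plugin.
        pos = sep
        while pos < len(toolbar) and toolbar[pos][0] == plugin:
            pos += 1
    toolbar.insert(pos, (plugin, label))


def render(toolbar):
    return " ".join(text for _, text in toolbar)


toolbar = [(None, SEP), (None, "Quit")]
add_item(toolbar, "Plugin_1", "Plugin_1_Item_1")
add_item(toolbar, "Plugin_2", "Plugin_2_Item_1")
add_item(toolbar, "Plugin_1", "Plugin_1_Item_2")
print(render(toolbar))
# | Plugin_1_Item_1 Plugin_1_Item_2 | Plugin_2_Item_1 | Quit
```

Even with the interleaved insertion order, each plugin's items stay grouped behind that plugin's separator, and no per-plugin index has to be stored.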

Thoughts?

Cheers,
Colomban


Re: [Geany-devel] tagmanager changes

2012-05-08 Thread Lex Trotman
On 8 May 2012 22:31, Colomban Wendling lists@herbesfolles.org wrote:
 Le 07/05/2012 18:04, Nick Treleaven a écrit :
 On 02/05/2012 05:46, Lex Trotman wrote:
 Hi All,

 To summarise since the thread has several subthreads.

 1. Tagmanager Understandability

 a. I generated the doxygen documentation for tagmanager, it works if
 you set recursive, but didn't help much:

 - if its not OOP why does it say things like TMWorkspace is derived
 from TMWorkObject and similar?

 documentation bug IMO

 I don't think so.  TM uses a more or less OOP-like approach.  See for
 example TMWorkspace:

 typedef struct
 {
    TMWorkObject work_object; /*! The parent work object */
    GPtrArray *global_tags; /*! Global tags loaded at startup */
    GPtrArray *work_objects; /*! An array of TMWorkObject pointers */
 } TMWorkspace;

 The first field (work_object) is the inherited class, here
 TMWorkObject.  And you'll see numerous places where the code uses such a
 derived structure as a TMWorkObject -- since it is one actually --,
 which looks quite like OOP.

 Or see tm_workspace.c:44:tm_create_workspace():  it uses
 tm_work_object_register() to register itself as a new type of work
 object with a few methods (or vfuncs), and then initializes itself with
 tm_work_object_init(), etc.

 I very well understand Lex's questions about how it actually
 works, since it brings in a second OOP style of programming in C, less well
 known than GObject -- though of course less complex also, but still (BTW
 maybe porting to GObject could help?)

Thanks Colomban, that helps :)

[...]

Cheers
Lex


Re: [Geany-devel] tagmanager changes

2012-05-08 Thread Lex Trotman
Hi All,

I took a step back and got some numbers (what! hard evidence in the
discussion? :)

Using ctags, including locals in the tags generated from Geany source,
slightly more than doubled the number of tags, and for some C++ I have,
gave nearly four times the number.

For Geany bsearching a sorted array, four times the size is only two
more iterations, doesn't matter :)
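
The "two more iterations" figure follows from log2(4n) = log2(n) + 2. A short sketch that counts the halvings directly; the array sizes are made up for illustration and are not the tag counts measured above.

```python
def worst_case_steps(n):
    """Count the halvings a binary search needs on a sorted array of n items."""
    steps = 0
    while n > 0:
        n //= 2      # each comparison discards half of the remaining range
        steps += 1
    return steps


for size in (1000, 4000):
    print(size, worst_case_steps(size))
# 1000 10
# 4000 12
```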

But as Colomban pointed out, with real-time tag generation on, *most*
lookups will be from the parser, not Geany.  This is needed to see if
the name it found is a type so it can tell the statement is a
declaration. (c.c does not do this at the moment since it only looks
at top level statements and all top level statements are declarations
in {} languages).

That workload *will* be proportional to the file size.  And the number
of insertions will also be proportional to the file size.  This is a
big change from the current situation where parsing occurs rarely.

For a simple array structure then, whilst parsing, either the search
is linear, or the array is re-sorted on each insert; neither is an
attractive prospect for larger files with 4x the number of tags.

Also on examining the tags file ctags produced I found that local
variable entries did not actually get parsed, ie there was no way to
go from a local variable to its type.  All that happened was an
occurrence of the name was recorded.  Also no information about
lexical scope is recorded.  So even updating to ctags c.c wouldn't
give the information on local variables we need to handle the example
I posted earlier.

So although improving tagmanager/Colomban manager is still worthwhile,
significantly more work is needed on the parsers as well.

And in terms of handling local variables properly and allowing
contextual completion ctags looks like a dead end.

Cheers
Lex