Re: Tony.RE: GNU Global Parsing Suffixless Files Patch

2016-10-07 Thread Shigio YAMAGUCHI
> Yes that sounds a lot nicer. Also if one gave a relative path to the
> script then this could be taken from where the GPATH database files are
> found. Then that would mean you could have a config being used by a number
> of different projects and also would work when you do a global -u as that
> needs to find the databases from anywhere in the project and it might as
> well use that same location for the hook script. Thus a generic config
> along with specific hook scripts and databases per project.

You are right.
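
A minimal sketch of the lookup discussed above, assuming that a relative hook
path would be resolved against the directory that holds GPATH. The function
names find_dbpath and resolve_hook are hypothetical, and the sketch ignores the
GTAGSROOT and GTAGSDBPATH environment variables that global(1) also honours.

```shell
#!/bin/sh
# find_dbpath [DIR]: walk upward from DIR (default: current directory)
# until a directory containing GPATH is found. Print that directory,
# or return 1 if the filesystem root is reached without finding one.
find_dbpath() {
    dir=$(cd "${1:-.}" && pwd) || return 1
    while :; do
        if [ -f "$dir/GPATH" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}

# resolve_hook HOOK [DIR]: absolute hook paths are printed unchanged;
# relative ones are taken relative to the directory holding GPATH
# (an assumed rule, not existing gtags behaviour).
resolve_hook() {
    hook=$1
    start=${2:-.}
    case $hook in
        /*) printf '%s\n' "$hook" ;;
        *)  db=$(find_dbpath "$start") || return 1
            printf '%s/%s\n' "$db" "$hook" ;;
    esac
}
```

With this rule a shared config could name a hook such as hooks/make-cppfiles.sh
(a made-up path) and each project would supply its own copy next to its
databases.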

> In reference to:
> 1 - Existing langmap style extension list e.g. `.c.h'.
> 2 - File only glob pattern e.g. `([Mm]akefile)'.
> 3 - A mixture of the above two e.g. `.c.h([Mm]akefile)(*.inc)'
> 4 - A dumb path substring match (possibly with the caveat that it
> must start with ./ or / to distinguish it from the above?) e.g. '/include/'.
> 5 - A bare name of a file containing a list of filenames e.g. `cppfiles'?
> What were you thinking of supporting in --force-language then?

I think 5 is enough.
The file list is suitable for dealing with files that follow no rule.
Rule-based descriptions (1, 2, and 3) can be written in the langmap.

> By the way I'm going away on a source code analysis and comprehension
> workshop for maybe up to two weeks and so my access to my business email
> may well be limited, so please don't think I'm ignoring you (although
> you'll probably be glad of the peace and quiet! :-)).

OK. Thank you for the constructive discussion!

Regards,
Shigio


2016-10-08 4:05 GMT+09:00 Cooper, Anthony :

> SECURITY CLASSIFICATION: OFFICIAL
>
>
> > -Original Message-
> > From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of
> > Shigio YAMAGUCHI
> > Sent: 06 October 2016 23:56
> > To: Cooper, Anthony
> > Cc: bug-global@gnu.org
> > Subject: Re: Tony.RE: GNU Global Parsing Suffixless Files Patch
> >
> > > True, I was thinking of when it's integrated as a part of an
> > > editor/IDE and gets called directly, like emacs, on writing a file
> > > and browsing multiple projects with their own global databases.
> > > Again you can add something to the path at the start with your shell
> > > script... Anyway in reference to:
> >
> > OK. How about adding a hook like this?
> >
> > [gtags.conf]
> > :gtags-hook=find . > cppfiles:
>
> Yes that sounds a lot nicer. Also if one gave a relative path to the
> script then this could be taken from where the GPATH database files are
> found. Then that would mean you could have a config being used by a number
> of different projects and also would work when you do a global -u as that
> needs to find the databases from anywhere in the project and it might as
> well use that same location for the hook script. Thus a generic config
> along with specific hook scripts and databases per project.
>
> In reference to:
> 1 - Existing langmap style extension list e.g. `.c.h'.
> 2 - File only glob pattern e.g. `([Mm]akefile)'.
> 3 - A mixture of the above two e.g. `.c.h([Mm]akefile)(*.inc)'
> 4  - A dumb path substring match (possibly with the caveat that it
> must start with ./ or / to distinguish it from the above?) e.g. '/include/'.
> 5 - A bare name of a file containing a list of filenames  e.g.
> `cppfiles'?
> What were you thinking of supporting in --force-language then?
>
> By the way I'm going away on a source code analysis and comprehension
> workshop for maybe up to two weeks and so my access to my business email
> may well be limited, so please don't think I'm ignoring you (although
> you'll probably be glad of the peace and quiet! :-)).
>
> Regards,
>
> Tony.
>
> 
> 
> Communications with GCHQ may be monitored and/or recorded
> for system efficiency and other lawful purposes. Any views or
> opinions expressed in this e-mail do not necessarily reflect GCHQ
> policy.  This email, and any attachments, is intended for the
> attention of the addressee(s) only. Its unauthorised use,
> disclosure, storage or copying is not permitted.  If you are not the
> intended recipient, please notify postmas...@gchq.gsi.gov.uk.
>
> This information is exempt from disclosure under the Freedom of
> Information Act 2000 and may be subject to exemption under
> other UK information legislation. Refer disclosure requests to
> GCHQ on 01242 221491 ext 30306 (non-secure) or email
> info...@gchq.gsi.gov.uk
>
> 
> 
>
>


-- 
Shigio YAMAGUCHI 
PGP fingerprint: D1CB 0B89 B346 4AB6 5663  C4B6 3CA5 BBB3 57BE DDA3
___
Bug-global mailing list
Bug-global@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-global


RE: Tony.RE: GNU Global Parsing Suffixless Files Patch

2016-10-07 Thread Cooper, Anthony
SECURITY CLASSIFICATION: OFFICIAL 


> -Original Message-
> From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of 
> Shigio YAMAGUCHI
> Sent: 06 October 2016 23:56
> To: Cooper, Anthony
> Cc: bug-global@gnu.org
> Subject: Re: Tony.RE: GNU Global Parsing Suffixless Files Patch
> 
> > True, I was thinking of when it's integrated as a part of an
> > editor/IDE and gets called directly, like emacs, on writing a file
> > and browsing multiple projects with their own global databases.
> > Again you can add something to the path at the start with your shell
> > script... Anyway in reference to:
> 
> OK. How about adding a hook like this?
> 
> [gtags.conf]
> :gtags-hook=find . > cppfiles:

Yes that sounds a lot nicer. Also if one gave a relative path to the script 
then this could be taken from where the GPATH database files are found. Then 
that would mean you could have a config being used by a number of different 
projects and also would work when you do a global -u as that needs to find the 
databases from anywhere in the project and it might as well use that same 
location for the hook script. Thus a generic config along with specific hook 
scripts and databases per project.

In reference to:
1 - Existing langmap style extension list e.g. `.c.h'.
2 - File only glob pattern e.g. `([Mm]akefile)'.
3 - A mixture of the above two e.g. `.c.h([Mm]akefile)(*.inc)'
4  - A dumb path substring match (possibly with the caveat that it must 
start with ./ or / to distinguish it from the above?) e.g. '/include/'.
5 - A bare name of a file containing a list of filenames  e.g. 
`cppfiles'?
What were you thinking of supporting in --force-language then?

By the way I'm going away on a source code analysis and comprehension workshop 
for maybe up to two weeks and so my access to my business email may well be 
limited, so please don't think I'm ignoring you (although you'll probably be 
glad of the peace and quiet! :-)).

Regards,

Tony.




Re: Tony.RE: GNU Global Parsing Suffixless Files Patch

2016-10-06 Thread Shigio YAMAGUCHI
> True, I was thinking of when it's integrated as a part of an editor/IDE
> and gets called directly, like emacs, on writing a file and browsing
> multiple projects with their own global databases. Again you can add
> something to the path at the start with your shell script... Anyway in
> reference to:

OK. How about adding a hook like this?

[gtags.conf]
:gtags-hook=find . > cppfiles:

When 'gtags-hook' is set, gtags(1) executes the hook before doing its own job.
Of course, you can use a shell script instead of find(1).
This is similar to emacs's 'add-hook'.
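
As a rough illustration of what such a hook script might do instead of a plain
`find .`, the sketch below lists only suffixless regular files that live under
a directory named include. 'gtags-hook' is only a proposal in this thread, and
the function name make_cppfiles and the selection rule are assumptions made for
the example.

```shell
#!/bin/sh
# make_cppfiles [ROOT]: write to ROOT/cppfiles the regular files under
# any "include" directory whose names contain no period.
make_cppfiles() {
    root=${1:-.}
    (
        cd "$root" &&
        find . -type f -path '*/include/*' ! -name '*.*' | sort > cppfiles
    )
}
```

Called as `make_cppfiles "$project_root"`, this leaves files with a suffix
(for example stl_algo.h) to the normal langmap handling.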

Regards,
Shigio


2016-10-07 3:28 GMT+09:00 Cooper, Anthony :

> SECURITY CLASSIFICATION: OFFICIAL
>
>
>
>
> > -Original Message-
> > From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of
> > Shigio YAMAGUCHI
> > Sent: 06 October 2016 07:10
> > To: Cooper, Anthony
> > Cc: bug-global@gnu.org
> > Subject: Re: Tony.RE: GNU Global Parsing Suffixless Files Patch
> >
> > > At one stage I thought of extending the gtags file format to include
> > > an optional language override, it's similar to your file list idea...
> > > However as I used global more I started to shy away from that as
> > > it's high maintenance and would break automatic recursive update on
> > > file addition.
> > >
> > > For example: If you're working on a project that has non-standard
> > > file naming conventions and/or has particular file types in odd
> > > places (like my texi/inc example) then if you used a file list/type
> > > approach you'd need to update that each time you added another
> > > suffixless header file.
> >
> > That it cannot be automated is a misunderstanding. You can automate it
> > just by writing the following script and using it instead of 'global -u'.
> >
> > [global-u.sh]
> > +-
> > |#!/bin/sh
> > |root=`global -pr` && cd "$root"    # Move to the project root
> > |if [ $? = 0 ]; then
> > |   find . > cppfiles               # Make cppfiles
> > |   gtags -i --force-language=cpp:cppfiles
> > |fi
> > +-
> >
>
> True, I was thinking of when it's integrated as a part of an editor/IDE
> and gets called directly, like emacs, on writing a file and browsing
> multiple projects with their own global databases. Again you can add
> something to the path at the start with your shell script... Anyway in
> reference to:
> 1 - Existing langmap style extension list e.g. `.c.h'.
> 2 - File only glob pattern e.g. `([Mm]akefile)'.
> 3 - A mixture of the above two e.g. `.c.h([Mm]akefile)(*.inc)'
> 4  - A dumb path substring match (possibly with the caveat that it
> must start with ./ or / to distinguish it from the above?) e.g. '/include/'.
> 5 - A bare name of a file containing a list of filenames  e.g.
> `cppfiles'?
> What were you thinking of supporting in --force-language then?
>
> > It was merged to Universal Ctags. But there is no parser which use the
> > mechanism yet.
> > (See makeSimpleRefTag in main/parse.c)
>
> Ah, OK... Mind you they haven't had a release for a long time as far as I
> can tell. I thought I had got an old site at first.
>
> Regards,
>
> Tony.




RE: Tony.RE: GNU Global Parsing Suffixless Files Patch

2016-10-06 Thread Cooper, Anthony
SECURITY CLASSIFICATION: OFFICIAL 




> -Original Message-
> From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of 
> Shigio YAMAGUCHI
> Sent: 06 October 2016 07:10
> To: Cooper, Anthony
> Cc: bug-global@gnu.org
> Subject: Re: Tony.RE: GNU Global Parsing Suffixless Files Patch
> 
> > At one stage I thought of extending the gtags file format to include
> > an optional language override, it's similar to your file list idea...
> > However as I used global more I started to shy away from that as
> > it's high maintenance and would break automatic recursive update on
> > file addition.
> >
> > For example: If you're working on a project that has non-standard
> > file naming conventions and/or has particular file types in odd
> > places (like my texi/inc example) then if you used a file list/type
> > approach you'd need to update that each time you added another
> > suffixless header file.
> 
> That it cannot be automated is a misunderstanding. You can automate it
> just by writing the following script and using it instead of 'global -u'.
>
> [global-u.sh]
> +-
> |#!/bin/sh
> |root=`global -pr` && cd "$root"    # Move to the project root
> |if [ $? = 0 ]; then
> |   find . > cppfiles               # Make cppfiles
> |   gtags -i --force-language=cpp:cppfiles
> |fi
> +-
> 

True, I was thinking of when it's integrated as a part of an editor/IDE and 
gets called directly, like emacs, on writing a file and browsing multiple 
projects with their own global databases. Again you can add something to the
path at the start with your shell script... Anyway in reference to:
1 - Existing langmap style extension list e.g. `.c.h'.
2 - File only glob pattern e.g. `([Mm]akefile)'.
3 - A mixture of the above two e.g. `.c.h([Mm]akefile)(*.inc)'
4  - A dumb path substring match (possibly with the caveat that it must 
start with ./ or / to distinguish it from the above?) e.g. '/include/'.
5 - A bare name of a file containing a list of filenames  e.g. 
`cppfiles'?
What were you thinking of supporting in --force-language then?

> It was merged to Universal Ctags. But there is no parser which use the 
> mechanism yet.
> (See makeSimpleRefTag in main/parse.c)

Ah, OK... Mind you they haven't had a release for a long time as far as I can
tell. I thought I had got an old site at first.

Regards,

Tony.





Re: Tony.RE: GNU Global Parsing Suffixless Files Patch

2016-10-05 Thread Shigio YAMAGUCHI
> At one stage I thought of extending the gtags file format to include
> an optional language override, it's similar to your file list idea...
> However as I used global more I started to shy away from that as it's
> high maintenance and would break automatic recursive update on file
> addition.
>
> For example: If you're working on a project that has non-standard file
> naming conventions and/or has particular file types in odd places (like
> my texi/inc example) then if you used a file list/type approach you'd
> need to update that each time you added another suffixless header file.

That it cannot be automated is a misunderstanding. You can automate it
just by writing the following script and using it instead of 'global -u'.

[global-u.sh]
+-
|#!/bin/sh
|root=`global -pr` && cd "$root"    # Move to the project root
|if [ $? = 0 ]; then
|   find . > cppfiles               # Make cppfiles
|   gtags -i --force-language=cpp:cppfiles
|fi
+-

global(1) can be used in a shell script.

> So as I understand it we would have --language-force=:
> where  would be one of:
>
> *x - Existing langmap style extension list e.g. `.c.h'.
> *x - File only glob pattern e.g. `([Mm]akefile)'.
> *x - A mixture of the above two e.g. `.c.h([Mm]akefile)(*.inc)'
>  x - A dumb path substring match (possibly with the caveat that
>      it must start with ./ or / to distinguish it from the above?)
>      e.g. '/include/'.
>  ? - A bare name of a file list in the config e.g. `cppfiles'?
>
> Those entries marked with * would also apply to langmap config entries
> as well.
> Those entries marked with x meet my requirements/wishlist.
>
> With those additional features marked with x and your proposed priority
> list as detailed yesterday I would say that would give maximum benefit
> without too much extra cost (famous last words!).
>
> One additional feature/thought is that one could have a language type of
> auto that would mean do normal file type detection. Thus the above example
> would read:
>
>  --force-language=cpp:include --force-language=auto:.c \
>  --force-language=auto:([Mm]akefile)

Having heard your explanation, I thought again that we should adopt the file
list. It is suitable for dealing with files that follow no rule, like C++
include files. Otherwise, the specification becomes too large.

Though the file list also has bad points, it has one big good point: you can
entrust the job of selecting files to external programs (or humans).
For 'selecting files', there is no better program than find(1).

> Do you have that patch for ctags to give out references? And if applied
> could gtags make use of them? If so would you be happy to send that to me?

It was merged into Universal Ctags. But there is no parser which uses the
mechanism yet.
(See makeSimpleRefTag in main/parse.c)

Regards,
Shigio


2016-10-06 0:28 GMT+09:00 Cooper, Anthony :

> SECURITY CLASSIFICATION: OFFICIAL
>
>
>
>
> > -Original Message-
> > From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of
> > Shigio YAMAGUCHI
> > Sent: 05 October 2016 02:56
> > To: Cooper, Anthony
> > Cc: bug-global@gnu.org
> > Subject: Re: GNU Global Parsing Suffixless Files Patch
> >
> > > Q: I'm assuming any glob patterns would implicitly be anchored to
> > > the end of the path string (as they are in bash)?
> >
> > Yes. In ctags, '()' matches to file names not path names, like '.c.h'.
>
> :-)
>
> >
> > > Yes I know... In fact after originally looking at global and ctags I
> > > thought how potentially dangerous ctags's --force-language option
> > > was and that's why I called my extension suffixless_langmap.
> > > My intention was  that this option wouldn't force anything but
> > > instead provide a default language when there wasn't a file suffix.
> > >
> > > For example, in project include directories you quite often get
> > > other artefacts like .c, .texi, .html (I know that these get
> > > excluded) and
> > > .inc files (MSVS). If the --force-language override option is used
> > > on those include directories then files with a suffix don't
> > > automatically get handled the way they should. Instead you'd
> > > possibly have to put in additional more specific --force-language
> > > overrides to reinstate defau

Tony.RE: GNU Global Parsing Suffixless Files Patch

2016-10-05 Thread Cooper, Anthony
SECURITY CLASSIFICATION: OFFICIAL




> -Original Message-
> From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of
> Shigio YAMAGUCHI
> Sent: 05 October 2016 02:56
> To: Cooper, Anthony
> Cc: bug-global@gnu.org
> Subject: Re: GNU Global Parsing Suffixless Files Patch
>
> > Q: I'm assuming any glob patterns would implicitly be anchored to
> > the end of the path string (as they are in bash)?
>
> Yes. In ctags, '()' matches to file names not path names, like 
> '.c.h'.

:-)

>
> > Yes I know... In fact after originally looking at global and ctags I
> > thought how potentially dangerous ctags's --force-language option
> > was and that's why I called my extension suffixless_langmap.
> > My intention was  that this option wouldn't force anything but
> > instead provide a default language when there wasn't a file suffix.
> >
> > For example, in project include directories you quite often get
> > other artefacts like .c, .texi, .html (I know that these get
> > excluded) and
> > .inc files (MSVS). If the --force-language override option is used
> > on those include directories then files with a suffix don't
> > automatically get handled the way they should. Instead you'd
> > possibly have to put in additional more specific --force-language
> > overrides to reinstate default behaviour for certain extensions. E.g.:
>
> You are right. It is an important point. You should be able to finely control.
>
> How about using a 'file list' instead of a direct path.
>
> --language-force=:
>
> File list is a file which lists file names.
>
> e.g.
> [cppfiles]
> +-
> |include/c++/4.8/algorithm
> |include/c++/4.8/bits/stl_algo.h
> |include/c++/5.1/algorithm
>
> $ gtags --language-force=cpp:cppfiles
>
> You can use find(1) command to make a file list.
> This will satisfy your request too, because find(1) has both glob and
> regex. :)
>
> New priority:
> [high]
> 1. --language-force=:
> 2. langmap=: glob pattern list>
> [low]
>
> What do you think?

An interesting idea :-). Upon reflection I'm actually quite happy with what you 
proposed yesterday - sorry perhaps I should have been clearer at the time...

At one stage I thought of extending the gtags file format to include an 
optional language override, it's similar to your file list idea... However as I 
used global more I started to shy away from that as it's high maintenance and 
would break automatic recursive update on file addition.

For example: If you're working on a project that has non-standard file naming
conventions and/or has particular file types in odd places (like my texi/inc
example) then if you used a file list/type approach you'd need to update that
each time you added another suffixless header file. However with your
path/specific glob approach and priority scheme (let's call this prio-path-glob):

--force-language=cpp:include --force-language=c:.c 
--force-language=makefile:([Mm]akefile) ...

This does the job quite nicely. You wouldn't need to update any config unless
there was a new file type that needed to be excluded (unlikely within an
existing project). You could just run global -u and update as normal. If given
the file list feature I would avoid using it because of the need to maintain
it. A couple of the really cool things about gtags are that you just type gtags
and it does it all for you (unless you have non-standard stuff) and that
global -u picks up updates and new files.

The only `upsides' my `explicitly select the overridden files with RE' approach
had over yours were:
1) RE patterns are more powerful and succinct - would deal with cases 
we haven't thought of.
2) You're explicitly selecting what you want to override.

So 1 is overkill as agreed (the prio-path-glob approach will meet all the 
requirements we can think of) so that's gone; and as for 2 if prio-path-glob 
were used instead you'd probably only need to have a couple of file type 
override directives in there anyway, as the skip list will weed out most 
exceptions anyway. So upon reflection I feel that a file type list would add 
extra complexity that isn't needed. If you have a specific requirement for it 
yourself then could we have it in addition to what you proposed yesterday 
please?

So as I understand it we would have --language-force=: 
where  would be one of:
*x- Existing langmap style extension list e.g. `.c.h'.
*x- File only glob pattern e.g. `([Mm]akefile)'.
*x - A mixture of the above two e.g. `.c.h([Mm]akefile)(*.inc)'
  x - A dumb path substring match (possibly with the caveat that it 
must start with ./ or

Re: GNU Global Parsing Suffixless Files Patch

2016-10-04 Thread Shigio YAMAGUCHI
> Q: I'm assuming any glob patterns would implicitly be anchored to the end
> of the path string (as they are in bash)?

Yes. In ctags, '()' matches file names, not path names, like '.c.h'.
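
The distinction between matching the file name and matching the whole path can
be sketched with a shell `case` pattern. The helper glob_matches_name is a
hypothetical name; it only models the behaviour described above and is not how
ctags or gtags implement it.

```shell
#!/bin/sh
# glob_matches_name PATH PATTERN: succeed when PATTERN matches the last
# path component of PATH. Directory names are never consulted.
glob_matches_name() {
    name=${1##*/}          # strip everything up to the last slash
    case $name in
        $2) return 0 ;;    # unquoted on purpose so $2 acts as a glob
        *)  return 1 ;;
    esac
}
```

So `glob_matches_name ./drivers/foosys '*sys'` succeeds, while
`glob_matches_name ./sys/readme '*sys'` fails because only `readme` is tested.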

> Yes I know... In fact after originally looking at global and ctags
> I thought how potentially dangerous ctags's --force-language option
> was and that's why I called my extension suffixless_langmap.
> My intention was  that this option wouldn't force anything but instead
> provide a default language when there wasn't a file suffix.
>
> For example, in project include directories you quite often get other
> artefacts like .c, .texi, .html (I know that these get excluded) and
> .inc files (MSVS). If the --force-language override option is used on
> those include directories then files with a suffix don't automatically
> get handled the way they should. Instead you'd possibly have to put in
> additional more specific --force-language overrides to reinstate default
> behaviour for certain extensions. E.g.:

You are right. It is an important point. You should be able to control it
finely.

How about using a 'file list' instead of a direct path?

--language-force=:

File list is a file which lists file names.

e.g.
[cppfiles]
+-
|include/c++/4.8/algorithm
|include/c++/4.8/bits/stl_algo.h
|include/c++/5.1/algorithm

$ gtags --language-force=cpp:cppfiles

You can use find(1) command to make a file list.
This will satisfy your request too, because find(1) has both glob and
regex. :)

New priority:
[high]
1. --language-force=:
2. langmap=:
[low]

What do you think?

> If/when someone comes to work on this, my patch is probably still worth
> a look as 70-80% of it is done with respect to the proposal above.
> Either way some of it may be of use.

Thank you so much.

Regards,
Shigio


2016-10-05 4:09 GMT+09:00 Cooper, Anthony :

> SECURITY CLASSIFICATION: OFFICIAL
>
>
> Good morning :-)
>
> > -Original Message-
> > From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of
> > Shigio YAMAGUCHI
> > Sent: 04 October 2016 01:19
> > To: Cooper, Anthony
> > Cc: bug-global@gnu.org
> > Subject: Re: GNU Global Parsing Suffixless Files Patch
> >
> > Good morning :)
> > I understood regex version of --language-force is very powerful.
> > However, it seems too powerful for us to manage it completely.
> >
> > How about releasing the real path version and '()' syntax first?
> > It's simple and easy to understand, and is similar to ctags.
> > At the stage now, no one can judge whether regex version is needed,
> > because no one has used even the real path version.
> >
> > >E.g. If I had:
> > >Default: \
> > >:GTAGS_OPTIONS=--force-language=yacc\:(sys\$): \
> > >   --force-language='cpp\:(^\\./Microsoft Visual)':
> > >
> > > Then this would say match all files ending in sys and treat them as
> > > yacc and any suffixless files with a path starting with `./Microsoft
> > > Visual' are to be treated as cpp files.
> >
> > Using the real path version and '()' syntax, that is realized easily
> > like this:
> >
> > [gtags.conf]
> > :langmap=yacc\:(*sys):
> >
> > $ gtags --force-language='yacc:Microsoft Visual'
> >
>
> A very minor point: the `Microsoft Visual' examples are different as my RE
> only matches at the head of the path.
>
> I guess I get nervous putting in more limited matching mechanisms inside
> an option that is designed to override the normal default/sane behaviour; I
> would like to be as precise as possible in my overrides. Also most would
> use the simple substring match, but regex's are there for edge cases that
> we haven't thought of. Most devs are comfortable with REs.
>
> Q: I'm assuming any glob patterns would implicitly be anchored to the end
> of the path string (as they are in bash)?
>
> > > One thing to note, made in the man page and help text, is this
> > > switch won't affect any files with a suffix, which some people might
> > > expect with `force' in the name of the switch.
> >
> > In ctags, --language-force option ignores suffixes. I'd like to follow
> > ctags method.
>
> Yes I know... In fact after originally looking at global and ctags I
> thought how potentially dangerous ctags's --force-language option was and
> that's why I called my extension suffixless_langmap. My intention was  that
> this option wouldn't force anything but instead provide a

RE: GNU Global Parsing Suffixless Files Patch

2016-10-04 Thread Cooper, Anthony
SECURITY CLASSIFICATION: OFFICIAL 


Good morning :-)

> -Original Message-
> From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of 
> Shigio YAMAGUCHI
> Sent: 04 October 2016 01:19
> To: Cooper, Anthony
> Cc: bug-global@gnu.org
> Subject: Re: GNU Global Parsing Suffixless Files Patch
> 
> Good morning :)
> I understood regex version of --language-force is very powerful.
> However, it seems too powerful for us to manage it completely.
> 
> How about releasing the real path version and '()' syntax first?
> It's simple and easy to understand, and is similar to ctags.
> At the stage now, no one can judge whether regex version is needed, 
> because no one has used even the real path version.
> 
> >E.g. If I had:
> >Default: \
> >:GTAGS_OPTIONS=--force-language=yacc\:(sys\$): \
> >   --force-language='cpp\:(^\\./Microsoft Visual)':
> >
> > Then this would say match all files ending in sys and treat them as 
> > yacc and any suffixless files with a path starting with `./Microsoft 
> > Visual' are to be treated as cpp files.
> 
> Using the real path version and '()' syntax, that is realized easily like 
> this:
> 
> [gtags.conf]
> :langmap=yacc\:(*sys):
> 
> $ gtags --force-language='yacc:Microsoft Visual'
> 

A very minor point: the `Microsoft Visual' examples are different as my RE only 
matches at the head of the path.
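
The difference between the two `Microsoft Visual' examples can be sketched in
plain shell. The function names matches_head and matches_anywhere are made up
for this illustration and do not correspond to anything in gtags.

```shell
#!/bin/sh
# matches_head PATH PREFIX: succeed only when PATH starts with PREFIX,
# like an RE anchored with ^.
matches_head() {
    case $1 in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# matches_anywhere PATH SUBSTRING: succeed when SUBSTRING occurs at any
# position in PATH, like an unanchored substring match.
matches_anywhere() {
    case $1 in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

A path such as `./vendor/Microsoft Visual/foo` passes matches_anywhere for
`Microsoft Visual` but fails matches_head for `./Microsoft Visual`.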

I guess I get nervous putting in more limited matching mechanisms inside an 
option that is designed to override the normal default/sane behaviour; I would 
like to be as precise as possible in my overrides. Also most would use the 
simple substring match, but regexes are there for edge cases that we haven't 
thought of. Most devs are comfortable with REs.

Q: I'm assuming any glob patterns would implicitly be anchored to the end of 
the path string (as they are in bash)?

> > One thing to note, made in the man page and help text, is this 
> > switch won't affect any files with a suffix, which some people might 
> > expect with `force' in the name of the switch.
> 
> In ctags, --language-force option ignores suffixes. I'd like to follow 
> ctags method.

Yes I know... In fact after originally looking at global and ctags I thought 
how potentially dangerous ctags's --force-language option was and that's why I 
called my extension suffixless_langmap. My intention was  that this option 
wouldn't force anything but instead provide a default language when there 
wasn't a file suffix.

For example, in project include directories you quite often get other artefacts 
like .c, .texi, .html (I know that these get excluded) and .inc files (MSVS). 
If the --force-language override option is used on those include directories 
then files with a suffix don't automatically get handled the way they should. 
Instead you'd possibly have to put in additional more specific --force-language 
overrides to reinstate default behaviour for certain extensions. E.g.:

--force-language=cpp:include --force-language=c:.c 
--force-language=makefile:([Mm]akefile) ...
 
However with REs you could be more selective in your initial --force-language 
setting and avoid the subsequent detailed extension overrides.

--force-language='cpp:(/include/(.*/)*[^/.]+$)'

In a glob pattern as far as I'm aware there's no way of saying `select files 
not containing a period' :-(.
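
Outside a single glob, the `no period in the file name' test is easy to state
by negating a pattern. A small sketch, assuming a hypothetical helper named
is_suffixless:

```shell
#!/bin/sh
# is_suffixless PATH: succeed when the last path component of PATH
# contains no period. Periods in directory names (e.g. 4.8) are ignored.
is_suffixless() {
    case ${1##*/} in
        *.*) return 1 ;;   # the name has a period, so it has a suffix
        *)   return 0 ;;
    esac
}
```

find(1) can express the same selection with `! -name '*.*'`.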

> 
> $ ctags --language-force=c test.php # test.php is treated as C source 
> file
> 
> How about setting the following priority?
> (This --language-force is the real path version)
> 
> [high]
> 1. --language-force=:
> 2. --language-force=:
> 3. langmap=:
> [low]
> 
> e.g.
> [gtags.conf]
> :langmap=c\:.x([Mm]ake):
> 
> $ gtags --language-force=perl:dir1 --language-force=php:php.x
> 
> ./
>  |-dir1/
>  |  |-test.x => perl by --language-force=perl:dir1
>  |  |-Make   => perl by --language-force=perl:dir1
>  |  |-php.x  => php by --language-force=php:php.x
>  |-dir2/
>     |-test.x => c by langmap=c\:.x([Mm]ake):
>     |-Make   => c by langmap=c\:.x([Mm]ake):
> 

The priorities look fine to me.

Whilst I think it's a _bit_ of a pity not to have REs for the reasons pointed 
out above, none of the issues are insurmountable with a glob implementation, 
just possibly less obvious? But more consistent as you say with ctags. So as 
you say start off with globs and see :-).

Many thanks for being so helpful and constructive, it is appreciated as is 
Global.

If/when someone comes to work on this, my patch is probably still worth a look 
as 70-80% of it is d

Re: GNU Global Parsing Suffixless Files Patch

2016-10-03 Thread Shigio YAMAGUCHI
Good morning :)
I understand that the regex version of --language-force is very powerful.
However, it seems too powerful for us to manage completely.

How about releasing the real path version and the '()' syntax first?
It's simple and easy to understand, and is similar to ctags.
At this stage, no one can judge whether the regex version is needed,
because no one has used even the real path version.

>E.g. If I had:
>Default: \
>:GTAGS_OPTIONS=--force-language=yacc\:(sys\$): \
>   --force-language='cpp\:(^\\./Microsoft Visual)':
>
> Then this would say match all files ending in sys and treat them as yacc and
> any suffixless files with a path starting with `./Microsoft Visual' are to be
> treated as cpp files.

Using the real path version and '()' syntax, that is realized easily like
this:

[gtags.conf]
:langmap=yacc\:(*sys):

$ gtags --force-language='cpp:Microsoft Visual'

> One thing to note, made in the man page and help text, is this switch won't affect
> any files with a suffix, which some people might expect with `force' in the name of
> the switch.

In ctags, --language-force option ignores suffixes. I'd like to follow
ctags method.

$ ctags --language-force=c test.php # test.php is treated as C source file

How about setting the following priority?
(This --language-force is the real path version)

[high]
1. --language-force=:
2. --language-force=:
3. langmap=:
[low]

e.g.
[gtags.conf]
:langmap=c\:.x([Mm]ake):

$ gtags --language-force=perl:dir1 --language-force=php:php.x

./
 |-dir1/
 |  |-test.x => perl by --language-force=perl:dir1
 |  |-Make   => perl by --language-force=perl:dir1
 |  |-php.x  => php by --language-force=php:php.x
 |-dir2/
    |-test.x => c by langmap=c\:.x([Mm]ake):
    |-Make   => c by langmap=c\:.x([Mm]ake):
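
The priority order above (file rule, then directory rule, then langmap) can be sketched in Python as follows. This only illustrates the proposed behaviour and is not gtags code: `decide_language` and its three arguments are hypothetical, and the sketch assumes a file rule may give either the relative path or just the base name, since the example writes `php.x` for `dir1/php.x`.

```python
import fnmatch
import os


def decide_language(path, forced_files, forced_dirs, langmap):
    """Pick a language using file > directory > langmap priority.

    forced_files: {file path or base name: language}   (hypothetical)
    forced_dirs:  {directory path: language}           (hypothetical)
    langmap:      list of (language, suffixes, globs)  (hypothetical)
    """
    rel = os.path.normpath(path)
    name = os.path.basename(rel)
    # 1. Highest priority: a rule that names this file.
    for target, lang in forced_files.items():
        if rel == os.path.normpath(target) or name == target:
            return lang
    # 2. Next: a rule that names a directory containing the file.
    parent_parts = rel.split(os.sep)[:-1]
    for target, lang in forced_dirs.items():
        target_parts = os.path.normpath(target).split(os.sep)
        if parent_parts[:len(target_parts)] == target_parts:
            return lang
    # 3. Lowest: langmap suffixes and '()' glob patterns on the file name.
    for lang, suffixes, globs in langmap:
        if any(name.endswith(s) for s in suffixes):
            return lang
        if any(fnmatch.fnmatchcase(name, g) for g in globs):
            return lang
    return None


# The example tree from the message, expressed as data.
FORCED_FILES = {"php.x": "php"}
FORCED_DIRS = {"dir1": "perl"}
LANGMAP = [("c", [".x"], ["[Mm]ake"])]

for p in ["./dir1/test.x", "./dir1/Make", "./dir1/php.x",
          "./dir2/test.x", "./dir2/Make"]:
    print(p, "=>", decide_language(p, FORCED_FILES, FORCED_DIRS, LANGMAP))
```

Checking the file rules first is what lets `php.x` keep its own language even though it sits under the forced directory `dir1`.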

> Did you correctly receive the new patch for 6.5.5?

Sorry but I did not read that at all. I would like to discuss about
the specification not about the implementation.

Regards,
Shigio


RE: GNU Global Parsing Suffixless Files Patch

2016-10-03 Thread Cooper, Anthony
SECURITY CLASSIFICATION: OFFICIAL 


Good morning :-) (See comments below)

> -Original Message-
> From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of 
> Shigio YAMAGUCHI
> Sent: 01 October 2016 00:17
> To: Cooper, Anthony
> Cc: bug-global@gnu.org
> Subject: Re: GNU Global Parsing Suffixless Files Patch
> 
> Before implementation, I would like to make clear the specification.
> 
> > Assorted projects I've come across have include and Include (the 
> > example below is a trivial but a real one relating to MS-Windows) 
> > and some even have include dirs names XInclude or something similar 
> > (can't remember the project now, wasn't X11 but probably an X client).
> 
> Let me ask a couple of questions, please.
> 
> 
> Q1: Is the following (1) and (2) equal?
> 
> (1) --language-force='cpp:([Ii]nclude)'
> (2) --language-force='cpp:include' --language-force='cpp:Include'
> 
> If so, you think that (1) is better than (2) since it is shorter?

Yes precisely. Although perhaps I gave a rather weak example. A stronger case 
would be when differentiating between say:
/usr/include/C++/4.8/algorithm
/usr/include/C++/5.1/algorithm
/usr/include/C++//algorithm
And:
./project/helper-programs/algorithm/sort/qsort  <- script or binary

Or to match:
.../include/sys
But not:
.../include/system_errors

If I wanted to catch the first set of files in both examples without tripping up
over the second then I could do --language-force=cpp:(algorithm\$) and
--language-force=cpp:(sys\$).

> 
> Q2: Does (1) above match to the followings?
> 
> ./XXXincludeYYY/
> ./XXXincludeYYY.php
> ./project/include/release/
> ./project/include/release/test.php

Yes. The matching is a dumb substring or regex match on the path string 
available around where decide_lang() is called. No anchoring by default. 
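
As a rough illustration of these matching semantics (a Python sketch, not the patch itself), an unanchored search over the path string behaves as described for both the `$` examples and the Q2 paths. The helper name `path_matches` is an assumption.

```python
import re


def path_matches(pattern, path):
    """Unanchored search: the pattern may match anywhere in the path."""
    return re.search(pattern, path) is not None


# '$' pins the match to the end of the path, i.e. to the file name.
print(path_matches(r"algorithm$", "/usr/include/C++/4.8/algorithm"))
print(path_matches(r"algorithm$",
                   "./project/helper-programs/algorithm/sort/qsort"))
print(path_matches(r"sys$", ".../include/sys"))
print(path_matches(r"sys$", ".../include/system_errors"))

# Without anchors, '[Ii]nclude' matches every path listed under Q2.
q2_paths = [
    "./XXXincludeYYY/",
    "./XXXincludeYYY.php",
    "./project/include/release/",
    "./project/include/release/test.php",
]
for p in q2_paths:
    print(p, path_matches(r"[Ii]nclude", p))
```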

> 
> Q3: Regex '^' and '$' are available? If so, what does they mean?

Yes they are. `^' would mean start matching at the beginning of the path and 
`$' would mean match the end of the path (particularly useful for just picking 
up matches against a file name as directories in themselves aren't processed 
beyond traversal). File globbing doesn't make ^ and $ available and I have come 
across other programs/situations where I have been frustrated by this for want 
of a regex. E.g. If I had:
Default: \
:GTAGS_OPTIONS=--force-language=yacc\:(sys\$): \
   --force-language='cpp\:(^\\./Microsoft Visual)':
Then this would say match all files ending in sys and treat them as yacc and 
any suffixless files with a path starting with `./Microsoft Visual' are to be 
treated as cpp files.

One thing to note, as stated in the man page and help text, is that this switch
won't affect any files with a suffix, which some people might expect with
`force' in the name of the switch.

Did you correctly receive the new patch for 6.5.5?

Many thanks once again :-).

Regards Tony.
> 
> Regards,
> Shigio
> 
> --
> 
> Shigio YAMAGUCHI 
> PGP fingerprint: D1CB 0B89 B346 4AB6 5663  C4B6 3CA5 BBB3 57BE DDA3
> 
> 
> __
> 
> This email has been scanned by the Symantec Email Security.cloud service.
> For more information please visit http://www.symanteccloud.com 
> __
> 



Communications with GCHQ may be monitored and/or recorded 
for system efficiency and other lawful purposes. Any views or 
opinions expressed in this e-mail do not necessarily reflect GCHQ 
policy.  This email, and any attachments, is intended for the 
attention of the addressee(s) only. Its unauthorised use, 
disclosure, storage or copying is not permitted.  If you are not the
intended recipient, please notify postmas...@gchq.gsi.gov.uk.  

This information is exempt from disclosure under the Freedom of 
Information Act 2000 and may be subject to exemption under
other UK information legislation. Refer disclosure requests to 
GCHQ on 01242 221491 ext 30306 (non-secure) or email
info...@gchq.gsi.gov.uk



___
Bug-global mailing list
Bug-global@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-global


Re: GNU Global Parsing Suffixless Files Patch

2016-09-30 Thread Shigio YAMAGUCHI
Before implementation, I would like to make clear the specification.

> Assorted projects I've come across have include and Include (the example below is
> a trivial but a real one relating to MS-Windows) and some even have include dirs
> names XInclude or something similar (can't remember the project now, wasn't X11
> but probably an X client).

Let me ask a couple of questions, please.

Q1: Are the following (1) and (2) equivalent?

(1) --language-force='cpp:([Ii]nclude)'
(2) --language-force='cpp:include' --language-force='cpp:Include'

If so, do you think that (1) is better than (2) since it is shorter?

Q2: Does (1) above match the following?

./XXXincludeYYY/
./XXXincludeYYY.php
./project/include/release/
./project/include/release/test.php

Q3: Are regex '^' and '$' available? If so, what do they mean?

Regards,
Shigio

-- 
Shigio YAMAGUCHI 
PGP fingerprint: D1CB 0B89 B346 4AB6 5663  C4B6 3CA5 BBB3 57BE DDA3


RE: GNU Global Parsing Suffixless Files Patch

2016-09-30 Thread Cooper, Anthony
SECURITY CLASSIFICATION: OFFICIAL 


Assorted projects I've come across have include and Include (the example below 
is a trivial but a real one relating to MS-Windows) and some even have include 
dirs named XInclude or something similar (can't remember the project now,
wasn't X11 but probably an X client).

I have attached a patch that implements case 1 as described below, changes 
against 6.5.5.

Regards,

Tony.
-Original Message-
From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of Shigio 
YAMAGUCHI
Sent: 30 September 2016 01:29
To: Cooper, Anthony
Cc: bug-global@gnu.org
Subject: Re: GNU Global Parsing Suffixless Files Patch

>  I still feel the regex approach offers more flexibility. How about we 
> adopt the () approach you mentioned below such that:
>
>gtags --language-force=cpp\:project/include
>
>Is a real path as you suggest and:
>
>gtags --language-force=cpp\:([Ii]nclude)

In what kind of case is this useful?
If possible, would you please show me an example?

>
>is a regex pattern contained in ().
...
> This would also be consistent with ctags.

In ctags's langmap, the pattern is glob pattern not regex.

Regards,
Shigio

-- 

Shigio YAMAGUCHI 
PGP fingerprint: D1CB 0B89 B346 4AB6 5663  C4B6 3CA5 BBB3 57BE DDA3


global-6.5.5-suffixless-patch.tgz
Description: global-6.5.5-suffixless-patch.tgz


Re: GNU Global Parsing Suffixless Files Patch

2016-09-29 Thread Shigio YAMAGUCHI
>  I still feel the regex approach offers more flexibility. How about we adopt
> the () approach you mentioned below such that:
>
>gtags --language-force=cpp\:project/include
>
>Is a real path as you suggest and:
>
>gtags --language-force=cpp\:([Ii]nclude)

In what kind of case is this useful?
If possible, would you please show me an example?

>
>is a regex pattern contained in ().
...
> This would also be consistent with ctags.

In ctags's langmap, the pattern is glob pattern not regex.

Regards,
Shigio

-- 
Shigio YAMAGUCHI 
PGP fingerprint: D1CB 0B89 B346 4AB6 5663  C4B6 3CA5 BBB3 57BE DDA3


RE: GNU Global Parsing Suffixless Files Patch

2016-09-29 Thread Cooper, Anthony
SECURITY CLASSIFICATION: OFFICIAL 


Oops sorry just clicked on reply and not reply all...

-Original Message-
From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of Shigio 
YAMAGUCHI
Sent: 29 September 2016 13:14
To: Cooper, Anthony
Subject: Re: GNU Global Parsing Suffixless Files Patch

Hi Tony,
It seems that this mail was not posted to the GLOBAL bug mailing list.
If possible, would you please post it to bug-global@gnu.org?
I'd like to always discuss GLOBAL in a public place.
Thank you in advance.

Regards,
Shigio


2016-09-29 19:23 GMT+09:00 Cooper, Anthony :


SECURITY CLASSIFICATION: OFFICIAL


I still feel the regex approach offers more flexibility. How about we 
adopt the () approach you mentioned below such that:

gtags --language-force=cpp\:project/include

Is a real path as you suggest and:

gtags --language-force=cpp\:([Ii]nclude)

is a regex pattern contained in (). Since we are doing one entry per 
option we don't even need to worry about escaping () in the regex as we simply 
strip off the outer ones once we have worked out that it's a pattern. This 
would also be consistent with ctags.

I'll see what I can come up with... Do you want case 2 implemented or 
leave that as a separate additional feature?

Regards,

Tony.

-Original Message-
From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of 
Shigio YAMAGUCHI

Sent: 28 September 2016 02:51
To: Cooper, Anthony
Cc: bug-global@gnu.org
Subject: Re: GNU Global Parsing Suffixless Files Patch

> - You refer to `C++' in your example, would it not be better to use `cpp'?

You are right.

> - Also the path matching. Wouldn't having this as a regex give the best of both worlds?
> If you wanted to match a file you could end in a $ as in
> 'boot-system$' or a directory component '/include/'. Otherwise you
> might find yourself trying to stipulate that a file `run-command' is
> Perl but also inadvertently mark everything under a `run-command'
> directory in the same project as Perl as well? I know it's a rather
> contrived example, but it could happen...

How about using a real path for the option?

gtags --language-force=cpp\:project/include
===
   ^
   This is not a pattern but a real path.

The user should know whether it is a directory or not. If the file or 
directory is not found, then gtags will display the following warning message:

$ gtags -w --language-force=cpp\:project/include
gtags: Warning: 'project/include' not found.

> If you're happy with the two points above would you like me to make a
> new patch or would you rather do it yourself?

If you hope it and you debug a program, I welcome that.
But you don't have to force yourself. It's a big help for us only by 
your participating in this argument.
I will put this issue on the TODO list.

Regards,
Shigio


Re: GNU Global Parsing Suffixless Files Patch

2016-09-27 Thread Shigio YAMAGUCHI
> - You refer to `C++' in your example, would it not be better to use `cpp'?

You are right.

> - Also the path matching. Wouldn't having this as a regex give the best of both worlds?
> If you wanted to match a file you could end in a $ as in 'boot-system$' or a directory
> component '/include/'. Otherwise you might find yourself trying to stipulate that a file
> `run-command' is Perl but also inadvertently mark everything under a `run-command'
> directory in the same project as Perl as well? I know it's a rather contrived example,
> but it could happen...

How about using a real path for the option?

gtags --language-force=cpp\:project/include
===
   ^
   This is not a pattern but a real path.

The user should know whether it is a directory or not. If the file or
directory is not found, then gtags will display the following warning message:

$ gtags -w --language-force=cpp\:project/include
gtags: Warning: 'project/include' not found.
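
The real path check proposed here could look roughly like the following Python sketch. It is only an illustration of the idea, not gtags code: `parse_force_option`, its `root` and `warn` parameters, and the returned tuple are all assumptions, and only the warning text is taken from the example above.

```python
import os
import sys


def parse_force_option(value, root=".", warn=True):
    """Split 'lang:path' and classify the path as a real file or directory.

    Returns (lang, path, kind), where kind is 'file', 'dir', or None when
    nothing exists at that path under root.
    """
    lang, sep, path = value.partition(":")
    if not sep or not lang or not path:
        raise ValueError("expected lang:path, got %r" % value)
    full = os.path.join(root, path)
    if os.path.isdir(full):
        kind = "dir"    # every file underneath would take this language
    elif os.path.isfile(full):
        kind = "file"   # only this one file would take this language
    else:
        kind = None
        if warn:
            print("gtags: Warning: '%s' not found." % path, file=sys.stderr)
    return lang, path, kind


print(parse_force_option("cpp:project/include"))
```

Because the argument is a real path rather than a pattern, a mistyped or missing path can be reported immediately instead of silently matching nothing.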

> If you're happy with the two points above would you like me to make a new patch or
> would you rather do it yourself?

If you hope it and you debug a program, I welcome that.
But you don't have to force yourself. It's a big help for us only by your
participating in this argument.
I will put this issue on the TODO list.

Regards,
Shigio


RE: GNU Global Parsing Suffixless Files Patch

2016-09-26 Thread Cooper, Anthony
SECURITY CLASSIFICATION: OFFICIAL 


Hi Shigio,

Thank you for replying.

The intention of my patch was to really address case 1. I did look at 
implementing something akin to scenario 2, but felt that was of less benefit (I 
admit a purely selfish decision! :-)).

I tried to think of the times when one would most likely want to use 
this feature and apart from the C++ STL case, probably the most likely case is 
when a project is using scripting languages in suffixless executable files; 
thus possibly leading to quite a large mixture of language exceptions in a 
project. Hence I opted for an approach that was good at handling potentially 
very long lists.

However looking at your idea it does have the value of consistency with 
ctags, offers greater flexibility and still meets my `put it in a config file' 
requirement, so yes your idea does look better.

I did notice a couple of things though:
- You refer to `C++' in your example, would it not be better to use `cpp'? As 
this is consistent with values used in the config file and can be validated 
against the list of supported languages as I do for suffixless_langmap.
- Also the path matching. Wouldn't having this as a regex give the best of both 
worlds? If you wanted to match a file you could end in a $ as in 'boot-system$' 
or a directory component '/include/'. Otherwise you might find yourself trying 
to stipulate that a file `run-command' is Perl but also inadvertently mark 
everything under a `run-command' directory in the same project as Perl as well? 
I know it's a rather contrived example, but it could happen...

As for case 2, that's fine, although I'm not sure where one would use 
it though unless Global copes with Makefile, COPYING, CHANGELOG etc type files 
(which it may do via ctags for all I know :-)).

BTW many thanks for Global, it's _such_ a useful tool :-).

If you're happy with the two points above would you like me to make a 
new patch or would you rather do it yourself?

Yours sincerely,

Tony.
-Original Message-
From: i.tama...@gmail.com [mailto:i.tama...@gmail.com] On Behalf Of Shigio 
YAMAGUCHI
Sent: 23 September 2016 02:20
To: Cooper, Anthony
Cc: bug-global@gnu.org
Subject: Re: GNU Global Parsing Suffixless Files Patch

Hi,
Thank you for the patch and suggestion.

I'd like to separate this issue in two cases.

case1: Files under the specific directory like /usr/include/c++/4.2.1

Should that be written on the configuration file as a rule?
In fact shouldn't that be written on the command line of gtags?

$ gtags --language-force=C++:include --language-force=C++:project/include

--language-force=: If it is a file, it is 
considered  source file.
If it is a directory, all files under it are considered  source file.

If needed, you can make this option default:

# Making project base gtags.conf
$ echo 'default: GTAGS_OPTIONS=--language-force=C++\:include \ 
--language-force=C++\:project/include' >gtags.conf
$ gtags # gtags is executed as 'gtags --language-force=...'

case2: Files which have a specific name like 'Makefile'

Since it is considered as a rule, it should be written on the configuration 
file, I think. For example, 'Makefile' should be written like:

[gtags.conf]
# A pattern matches only to files.
:langmap=cpp\:([Mm]akefile).mk.mak:

'()' is a syntax of ctags(1)'s langmap. Since I borrowed 'langmap'
from ctags(1), I would like to copy that again.

What do you think?

Regards,
Shigio


Re: GNU Global Parsing Suffixless Files Patch

2016-09-22 Thread Shigio YAMAGUCHI
Hi,
Thank you for the patch and suggestion.

I'd like to separate this issue in two cases.

case1: Files under the specific directory like /usr/include/c++/4.2.1

Should that be written on the configuration file as a rule?
In fact shouldn't that be written on the command line of gtags?

$ gtags --language-force=C++:include --language-force=C++:project/include

--language-force=:
If it is a file, it is considered  source file.
If it is a directory, all files under it are considered 
source file.

If needed, you can make this option default:

# Making project base gtags.conf
$ echo 'default: GTAGS_OPTIONS=--language-force=C++\:include \
--language-force=C++\:project/include' >gtags.conf
$ gtags # gtags is executed as 'gtags --language-force=...'

case2: Files which have a specific name like 'Makefile'

Since it is considered as a rule, it should be written on the configuration
file, I think. For example, 'Makefile' should be written like:

[gtags.conf]
# A pattern matches only to files.
:langmap=cpp\:([Mm]akefile).mk.mak:

'()' is a syntax of ctags(1)'s langmap. Since I borrowed 'langmap'
from ctags(1), I would like to copy that again.
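
For illustration, a langmap value that mixes `()` glob patterns with a suffix list could be split and matched along these lines. This is a Python sketch under stated assumptions, not the gtags or ctags parser: the function names are hypothetical, and a glob containing `)` is not handled.

```python
import fnmatch
import re


def parse_langmap_entry(spec):
    """Split e.g. '([Mm]akefile).mk.mak' into (globs, suffixes)."""
    # Each parenthesised group is a glob matched against the file name.
    globs = re.findall(r"\(([^)]*)\)", spec)
    # Whatever remains is a run of '.suffix' items.
    rest = re.sub(r"\([^)]*\)", "", spec)
    suffixes = ["." + s for s in rest.split(".") if s]
    return globs, suffixes


def entry_matches(spec, filename):
    """Return True if filename matches a glob or ends with a listed suffix."""
    globs, suffixes = parse_langmap_entry(spec)
    if any(fnmatch.fnmatchcase(filename, g) for g in globs):
        return True
    return any(filename.endswith(s) for s in suffixes)


spec = "([Mm]akefile).mk.mak"
print(parse_langmap_entry(spec))
for name in ["Makefile", "makefile", "rules.mk", "GNUmakefile", "main.c"]:
    print(name, entry_matches(spec, name))
```

The globs are tested against the file name only, in line with the comment that a pattern matches only files.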

What do you think?

Regards,
Shigio


2016-09-21 23:52 GMT+09:00 Cooper, Anthony :

> SECURITY CLASSIFICATION: OFFICIAL
>
>
>Hi all,
>
> I've included a patch that allows global to parse and index files
> without an extension (typically C++ header files, e.g.
> /usr/include/c++/4.8/algorithm and many more). This works by having a set
> of rules whereby a user can specify path regexes and the corresponding
> source file types for files without a suffix. This is done by specifying a
> rule, similar to langmap, like this:
>
> default: \
> :GTAGSFORCECPP: \
> :suffixless_langmap=[iI]nclude\:cpp,project/include\:cpp:
>
> This can be specified on multiple lines like langmap. Also the
> regex can be used to match any part of a path, including the filename if
> necessary.
>
> I know you want patches that apply to the head of your main branch
> but unfortunately our organisation's firewall prevents me from connecting
> to your CVS server, so they are based on your latest release (6.5.4).
>
> Regards,
>
> Tony Cooper.


-- 
Shigio YAMAGUCHI 
PGP fingerprint: D1CB 0B89 B346 4AB6 5663  C4B6 3CA5 BBB3 57BE DDA3


GNU Global Parsing Suffixless Files Patch

2016-09-21 Thread Cooper, Anthony
SECURITY CLASSIFICATION: OFFICIAL 


   Hi all,

I've included a patch that allows global to parse and index files 
without an extension (typically C++ header files, e.g. 
/usr/include/c++/4.8/algorithm and many more). This works by having a set of 
rules whereby a user can specify path regexes and the corresponding source file
types for files without a suffix. This is done by specifying a rule, similar to 
langmap, like this:

default: \
:GTAGSFORCECPP: \
:suffixless_langmap=[iI]nclude\:cpp,project/include\:cpp:

This can be specified on multiple lines like langmap. Also the regex 
can be used to match any part of a path, including the filename if necessary.
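
As a rough idea of how such a rule behaves, here is a minimal Python sketch (not the code in the attached patch). The helper names are hypothetical, and the sketch assumes entries are separated by commas and that `\:` separates the regex from the language, as in the example rule above; a regex that itself contains a comma is not handled.

```python
import os
import re


def parse_suffixless_langmap(value):
    """Parse comma-separated 'regex\\:lang' items into (regex, lang) pairs."""
    rules = []
    for item in value.split(","):
        # Undo the config-file escaping, then split on the last colon.
        pattern, _, lang = item.replace("\\:", ":").rpartition(":")
        rules.append((re.compile(pattern), lang))
    return rules


def language_for(path, rules):
    """Return a default language only when the file name has no suffix."""
    if "." in os.path.basename(path):
        return None  # suffixed files are left to the ordinary langmap
    for regex, lang in rules:
        if regex.search(path):  # unanchored: may match any part of the path
            return lang
    return None


rules = parse_suffixless_langmap(r"[iI]nclude\:cpp,project/include\:cpp")
for p in ["/usr/include/c++/4.8/algorithm", "/usr/include/stdio.h",
          "./src/run-command"]:
    print(p, "=>", language_for(p, rules))
```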

I know you want patches that apply to the head of your main branch but 
unfortunately our organisation's firewall prevents me from connecting to your 
CVS server, so they are based on your latest release (6.5.4).

Regards,

Tony Cooper.


suffixless-patch.tgz
Description: suffixless-patch.tgz