Please note that I gave the example of the "/perl" pattern for the shebang
header for Perl.
The #! idiom often uses "/usr/bin/env" before the name of the
interpreter. For example, "#!/usr/bin/env perl".
"bin\boost\libs\python\build\boost_python.dll\vc-7_1\debug\threading-multi\numeric.obj"
Command file, not python
"/python" doesn't match it.
"/python" doesn't match this start line:
#!/usr/bin/env python
There should be alternative patterns per language.
Then you could make "/python", "/env python", "/pythond" etc. all point
to the same language, Python.
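
A rough sketch of what I mean, in Python rather than SciTE's own code.
The table and the function name are made up for illustration, nothing
like this exists in SciTE as far as I know. Each pattern is a plain
substring looked for anywhere in the first line, and several patterns
can name the same language:

```python
# Hypothetical pattern table: (substring, language). Not SciTE's real format.
FIRST_LINE_PATTERNS = [
    ("/perl", "perl"),
    ("/env perl", "perl"),
    ("/python", "python"),
    ("/env python", "python"),
    ("/pythond", "python"),
    ("<?xml ", "xml"),
]


def detect_language(first_line):
    """Return the language of the first pattern found anywhere in the line."""
    for pattern, language in FIRST_LINE_PATTERNS:
        if pattern in first_line:
            return language
    return None  # no pattern matched; fall back to the file extension
```

With this table "#!/usr/bin/env python" and "#!/usr/bin/python" both give
Python, while the Boost command file path above gives nothing, because it
contains "\python" and not "/python".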
Your statement was "I think that any sequence of characters should
be matched anywhere in the first line". I'm pointing out that that is
not going to work for many files.
Naturally it's not perfect, but it gives much more than matching
whole words. Multiple patterns per language matched anywhere
in the first line are good enough for all the examples you gave so far.
I want it to be harder for a match to start, so that there are fewer incorrect matches.
An incorrect match is a price to pay for this simple and general method.
A text file with a strange extension, that starts with "The perl interpreter
is in /usr/bin/perl" would be recognized as a perl source. But if this
text file contains lots of Perl code examples, this may actually be good.
What header for Lua should be recognized?
#!/usr/bin/env lua
Or reasonable variants. You can add options or point to a copy you
compiled into a private area.
Don't use first-line matching for C++.
Neither for batch files.
No first-line matching for Java.
Why not? I wouldn't mind if C++ headers without extensions, like the
standard string, vector, ..., were styled nicely.
I don't think you can reliably detect them by looking at the first line.
/* gmarkup.h - Simple XML-like string parser/writer
C, not XML
XML should be matched by the "<?xml " pattern.
So it's a special case.
With my method it's not special. "<?xml " can be matched anywhere
in the first line, which isn't what the XML standard says,
but it's good enough.
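
To show the difference, here is a small Python sketch. Both function
names are invented for this example. The strict check only accepts the
declaration at the very start of the line, as the XML standard requires,
and the loose one accepts it anywhere in the line:

```python
def looks_like_xml_strict(first_line):
    """Accept only a declaration at the very start of the line."""
    return first_line.startswith("<?xml ")


def looks_like_xml_loose(first_line):
    """Accept the "<?xml " pattern anywhere in the line."""
    return "<?xml " in first_line
```

The loose check also accepts a line with leading whitespace before the
declaration, which the strict one rejects. Neither of them matches the
gmarkup.h comment, since "XML-like" does not contain "<?xml ".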
<[EMAIL PROTECTED]>
HTML/ASP, not Javascript
That's obviously ASP ("<%@").
Or a JSP directive.
At least we know it's not plain JS. With full-word matching
we would only see "language" and "javascript".
Piotr
_______________________________________________
Scite-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scite-interest