Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-06 Thread Jan Nijtmans
2015-11-06 2:12 GMT+01:00 Richard Hipp:
> Offending code is here:
> https://www.fossil-scm.org/fossil/artifact/10cb5eb292?ln=40-43
>
> I guess sky5walk wants that to allow through any characters other than 
> 0x00

My guess is that the code in doc.c was written when the function
looks_like_binary() didn't exist yet. Should be fixed now:
 

Whatever definition of BINARY is chosen, I think it should be
the same everywhere within fossil.

Thanks!
 Jan Nijtmans
___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-06 Thread sky5walk
Thanks for the fix!
I will try to compile it when I get a spare moment...which is in short
supply.

On Fri, Nov 6, 2015 at 3:16 AM, Jan Nijtmans  wrote:

> 2015-11-06 2:12 GMT+01:00 Richard Hipp:
> > Offending code is here:
> > https://www.fossil-scm.org/fossil/artifact/10cb5eb292?ln=40-43
> >
> > I guess sky5walk wants that to allow through any characters other than
> 0x00
>
> My guess is that the code in doc.c was written when the function
> looks_like_binary() didn't exist yet. Should be fixed now:
>  
>
> Whatever definition of BINARY is chosen, I think it should be
> the same everywhere within fossil.
>
> Thanks!
>  Jan Nijtmans


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Warren Young
On Nov 5, 2015, at 11:37 AM, sky5w...@gmail.com wrote:
> 
> I am also trapped with this binary file detection for the egregious use of 
> ascii characters 2 and 6 in my code. :(

What does “fossil test-looks-like-utf filename” say for that file?


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread sky5walk
No, I just deleted ascii characters 2 and 6 from the file and Fossil now
shows the file as text. I will have to build this ascii string in code
instead of pasting from hex editor. But, it would be cool to set a range of
acceptable ascii characters = text. Ex. ascii 1-127 = text.

On Thu, Nov 5, 2015 at 2:16 PM, Jan Nijtmans  wrote:

> 2015-11-05 19:37 GMT+01:00  :
> > Hi,
> > I am also trapped with this binary file detection for the egregious use
> of
> > ascii characters 2 and 6 in my code. :(
> >
> > ;//  ascii2+sometexthere+ascii6
> > ;//sometexthere ;<-- pasting here does not show the prefix and suffix
> > ascii characters.
>
> As far as I know, fossil doesn't use control characters to decide
> the file is binary, only the null-byte. So I think something else
> is triggering the binary detection. Too long lines, maybe?
>
> Regards,
>Jan Nijtmans


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Jan Nijtmans
2015-11-05 19:37 GMT+01:00  :
> Hi,
> I am also trapped with this binary file detection for the egregious use of
> ascii characters 2 and 6 in my code. :(
>
> ;//  ascii2+sometexthere+ascii6
> ;//sometexthere ;<-- pasting here does not show the prefix and suffix
> ascii characters.

As far as I know, fossil doesn't use control characters to decide
the file is binary, only the null-byte. So I think something else
is triggering the binary detection. Too long lines, maybe?

Regards,
   Jan Nijtmans


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread sky5walk
No, saving the file to utf-8 + BOM did not prompt Fossil to trigger text.
And "decent" is a relative term.
;// Temp Tol ±°C ;<-- Ansi display :)
;// Temp Tol [xB1][xB0]C ;<-- UTF-8+BOM display :(

On Thu, Nov 5, 2015 at 5:43 PM,  wrote:

> Thanks for looking at this.
> Attached are a small repo(anonymous=setup, password=fossil) and its 2 text
> files.
>
> Thanks for Fossil,
> Steve
>
> On Thu, Nov 5, 2015 at 5:20 PM, Richard Hipp  wrote:
>
>> On 11/5/15, sky5w...@gmail.com  wrote:
>> > Well, I have a workaround(no pasted literal strings). I just didn't
>> realize
>> > Ascii characters within 1-255 could trigger binary?
>> >
>> > Maybe a fast histogram, and a count of << 1 or 2% for these ascii
>> > characters allows text. Or let the user define the valid range.
>>
>> Please send me the actual file that is causing problems.
>>
>> >
>> > By the way, Notepad, Notepad++, Visual Studio, etc. have identical
>> > renderings for these characters and consider the file ansi text.
>> >
>> > On Thu, Nov 5, 2015 at 4:31 PM, Stephan Beal 
>> wrote:
>> >
>> >> On Thu, Nov 5, 2015 at 9:52 PM,  wrote:
>> >>
>> >>> Haha, it would be quite a mess if $ and @ triggered binary.
>> >>> I see no reason to kick the file to binary if the ascii code < 128?
>> >>>
>> >>
>> >> fwiw...
>> >>
>> >> [stephan@host:~/bin]$ hexdump fossil | head
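>> >> 0000000 457f 464c 0102 0001 0000 0000 0000 0000
>> >> 0000010 0002 003e 0001 0000 7be2 0040 0000 0000
>> >> 0000020 0040 0000 0000 0000 0b38 0094 0000 0000
>> >> 0000030 0000 0000 0040 0038 0009 0040 0025 0022
>> >> 0000040 0006 0000 0005 0000 0040 0000 0000 0000
>> >> 0000050 0040 0040 0000 0000 0040 0040 0000 0000
>> >> 0000060 01f8 0000 0000 0000 01f8 0000 0000 0000
>> >> 0000070 0008 0000 0000 0000 0003 0000 0004 0000
>> >> 0000080 0238 0000 0000 0000 0238 0040 0000 0000
>> >> 0000090 0238 0040 0000 0000 001c 0000 0000 0000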
>> >>
>> >> (Assumption: that's "probably" typical of a typical binary file.)
>> >>
>> >> i see only 2 bytes there which are >127d (specifically, 0xf8 and 0xe2),
>> >> and lots below 32d. Plus i see a few 6's and 2's. i think it's
>> >> unreasonable
>> >> (=highly unconventional) to expect fossil to treat those bytes as
>> "text."
>> >> 0x02 is, according to my local man pages, the "start of text" (control)
>> >> character, which places it implicitly outside the range of bytes used
>> by
>> >> "text."
>> >>
>> >> --
>> >> - stephan beal
>> >> http://wanderinghorse.net/home/stephan/
>> >> http://gplus.to/sgbeal
>> >> "Freedom is sloppy. But since tyranny's the only guaranteed byproduct
>> of
>> >> those who insist on a perfect world, freedom will have to do." -- Bigby
>> >> Wolf
>> >>
>> >> ___
>> >> fossil-users mailing list
>> >> fossil-users@lists.fossil-scm.org
>> >> http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
>> >>
>> >>
>> >
>>
>>
>> --
>> D. Richard Hipp
>> d...@sqlite.org


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread sky5walk
Haha, it would be quite a mess if $ and @ triggered binary.
I see no reason to kick the file to binary if the ascii code < 128?

On Thu, Nov 5, 2015 at 3:46 PM,  wrote:

> No difference besides num bytes with or without the embedded Ascii
> characters 2 and 6.
> I add this 1 line to my file and it triggers binary?!
> [Asc2]+"123 "+[Asc6][CR+LF]
>
> c:\tryfossil>fossil test-looks-like-utf myfile.txt
> File "myfile.txt" has 121343 bytes.
> Starts with UTF-8 BOM: no
> Starts with UTF-16 BOM: no
> Looks like UTF-8: yes
> Has flag LOOK_NUL: no
> Has flag LOOK_CR: yes
> Has flag LOOK_LONE_CR: no
> Has flag LOOK_LF: yes
> Has flag LOOK_LONE_LF: no
> Has flag LOOK_CRLF: yes
> Has flag LOOK_LONG: no
> Has flag LOOK_INVALID: yes
> Has flag LOOK_ODD: no
> Has flag LOOK_SHORT: no
>
> On Thu, Nov 5, 2015 at 3:21 PM, Warren Young  wrote:
>
>> On Nov 5, 2015, at 11:37 AM, sky5w...@gmail.com wrote:
>> >
>> > I am also trapped with this binary file detection for the egregious use
>> of ascii characters 2 and 6 in my code. :(
>>
>> What does “fossil test-looks-like-utf filename” say for that file?


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Scott Robison
On Thu, Nov 5, 2015 at 3:09 PM,  wrote:

> Well, I have a workaround(no pasted literal strings). I just didn't
> realize Ascii characters within 1-255 could trigger binary?
>
> Maybe a fast histogram, and a count of << 1 or 2% for these ascii
> characters allows text. Or let the user define the valid range.
>
> By the way, Notepad, Notepad++, Visual Studio, etc. have identical
> renderings for these characters and consider the file ansi text.
>

Hmmm. I'm surprised that Google Chrome displays them in the same manner as
my text editors. Certainly isn't "Unicode compliant" since Unicode doesn't
assign glyphs to those code points, but it does display them.

-- 
Scott Robison


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Warren Young
On Nov 5, 2015, at 3:54 PM, sky5w...@gmail.com wrote:
> 
> And "decent" is a relative term.

No, it’s a value judgment.  I judge that a text editor that can’t handle UTF-8 
is indecent. :)

> ;// Temp Tol ±°C ;<-- Ansi display :)
> ;// Temp Tol [xB1][xB0]C ;<-- UTF-8+BOM display :(

That isn’t a conversion from ANSI to UTF-8, it’s just sticking a BOM on the 
front of an ANSI file.  The proper encoding of plus-minus + degrees would be 
[C2][B1][C2][B0].  Four bytes, not two.

If you can’t work out how to get your text editor to do this conversion, the 
iconv tool you can install with Cygwin will do it, via the following command:

   $ iconv -f MS-ANSI -t UTF-8 < original-file > new-file

The new file should be considerably larger than the old, since every 
single-byte ANSI code point over 127 will be encoded by 2-4 bytes in UTF-8.
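For readers who want to see what that conversion does to the bytes, here is a minimal sketch in C. It assumes the input is ISO-8859-1, which agrees with Windows "ANSI" (code page 1252) for bytes 0xA0 to 0xFF; bytes 0x80 to 0x9F differ between the two and would need a lookup table that this sketch leaves out. The function name latin1_to_utf8 is made up for illustration and is not part of iconv or Fossil.

```c
#include <stddef.h>

/*
 * Convert ISO-8859-1 bytes to UTF-8 (illustrative helper, hypothetical name).
 * Bytes below 0x80 are copied unchanged; every other byte becomes two bytes.
 * Returns the number of bytes written, or 0 if dst is too small.
 */
size_t latin1_to_utf8(const unsigned char *src, size_t n,
                      unsigned char *dst, size_t cap)
{
    size_t out = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            if (out + 1 > cap) return 0;
            dst[out++] = c;
        } else {
            if (out + 2 > cap) return 0;
            /* Code point equals the byte value in ISO-8859-1. */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Fed the three bytes [B1][B0]C, this writes the five bytes [C2][B1][C2][B0]C, which matches the four-byte encoding of plus-minus and degrees given above.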


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread sky5walk
Yes yes, I am painfully aware of the BOM and the encoding steps.
Notepad++ has a simple menu click for this.
Despite all combinations, Fossil considers the file binary.

On Thu, Nov 5, 2015 at 6:33 PM, Warren Young  wrote:

> On Nov 5, 2015, at 3:54 PM, sky5w...@gmail.com wrote:
> >
> > And "decent" is a relative term.
>
> No, it’s a value judgment.  I judge that a text editor that can’t handle
> UTF-8 is indecent. :)
>
> > ;// Temp Tol ±°C ;<-- Ansi display :)
> > ;// Temp Tol [xB1][xB0]C ;<-- UTF-8+BOM display :(
>
> That isn’t a conversion from ANSI to UTF-8, it’s just sticking a BOM on
> the front of an ANSI file.  The proper encoding of plus-minus + degrees
> would be [C2][B1][C2][B0].  Four bytes, not two.
>
> If you can’t work out how to get your text editor to do this conversion,
> the iconv tool you can install with Cygwin will do it, via the following
> command:
>
>$ iconv -f MS-ANSI -t UTF-8 < original-file > new-file
>
> The new file should be considerably larger than the old, since every
> single-byte ANSI code point over 127 will be encoded by 2-4 bytes in UTF-8.


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Richard Hipp
On 11/5/15, sky5w...@gmail.com  wrote:
> Hi,
> I am also trapped with this binary file detection for the egregious use of
> ascii characters 2 and 6 in my code. :(
>
> ;//  ascii2+sometexthere+ascii6
> ;//sometexthere ;<-- pasting here does not show the prefix and suffix
> ascii characters.
>
> I cannot see diff's or my source code now in fossil ui.
> Note, I do not want to use escape char's.
> Is there any chance or setting to let Fossil v134 detection logic use
> extended ascii character range?

Background for the list:  sky5walk sent me a sample file that
contained his two control characters.

What I did:  I checked the sample file into a test Fossil repository.
I then checked in a change to that file.  "fossil diff" works.
"fossil diff --tk" works.  "fossil ui" works and shows the diff on the
webpage.  I have so far been unable to get it to say that the file is
binary.

Theory: Perhaps sky5walk has a binary-glob setting that indicates that
the file in his repo is binary?
-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread sky5walk
Hi Dr Hipp, I also sent you "try2.fossil" with the offending file inside.
In that repo, I toggled binary-glob settings with no changes to the file
classification as binary.
You are correct, the diff works for the tiny example file, but not if you
try to view the entire file from the ui:
http://localhost:8080/tree?ci=tip

On Thu, Nov 5, 2015 at 6:42 PM, Richard Hipp  wrote:

> On 11/5/15, sky5w...@gmail.com  wrote:
> > Hi,
> > I am also trapped with this binary file detection for the egregious use
> of
> > ascii characters 2 and 6 in my code. :(
> >
> > ;//  ascii2+sometexthere+ascii6
> > ;//sometexthere ;<-- pasting here does not show the prefix and suffix
> > ascii characters.
> >
> > I cannot see diff's or my source code now in fossil ui.
> > Note, I do not want to use escape char's.
> > Is there any chance or setting to let Fossil v134 detection logic use
> > extended ascii character range?
>
> Background for the list:  sky5walk sent me a sample file that
> contained his two control characters.
>
> What I did:  I checked the sample file into a test Fossil repository.
> I then checked in a change to that file.  "fossil diff" works.
> "fossil diff --tk" works.  "fossil ui" works and shows the diff on the
> webpage.  I have so far been unable to get it to say that the file is
> binary.
>
> Theory: Perhaps sky5walk has a binary-glob setting that indicates that
> the file in his repo is binary?
> --
> D. Richard Hipp
> d...@sqlite.org


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Stephan Beal
On Thu, Nov 5, 2015 at 9:52 PM,  wrote:

> Haha, it would be quite a mess if $ and @ triggered binary.
> I see no reason to kick the file to binary if the ascii code < 128?
>

fwiw...

[stephan@host:~/bin]$ hexdump fossil | head
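0000000 457f 464c 0102 0001 0000 0000 0000 0000
0000010 0002 003e 0001 0000 7be2 0040 0000 0000
0000020 0040 0000 0000 0000 0b38 0094 0000 0000
0000030 0000 0000 0040 0038 0009 0040 0025 0022
0000040 0006 0000 0005 0000 0040 0000 0000 0000
0000050 0040 0040 0000 0000 0040 0040 0000 0000
0000060 01f8 0000 0000 0000 01f8 0000 0000 0000
0000070 0008 0000 0000 0000 0003 0000 0004 0000
0000080 0238 0000 0000 0000 0238 0040 0000 0000
0000090 0238 0040 0000 0000 001c 0000 0000 0000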

(Assumption: that's "probably" typical of a typical binary file.)

i see only 2 bytes there which are >127d (specifically, 0xf8 and 0xe2), and
lots below 32d. Plus i see a few 6's and 2's. i think it's unreasonable
(=highly unconventional) to expect fossil to treat those bytes as "text."
0x02 is, according to my local man pages, the "start of text" (control)
character, which places it implicitly outside the range of bytes used by
"text."

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread sky5walk
Well, I have a workaround(no pasted literal strings). I just didn't realize
Ascii characters within 1-255 could trigger binary?

Maybe a fast histogram, and a count of << 1 or 2% for these ascii
characters allows text. Or let the user define the valid range.
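A rough sketch of that histogram idea in C, purely as an illustration of the proposal and not of anything Fossil does: the function name looks_like_text and the max_percent parameter are hypothetical, and the choice to let TAB, LF and CR through while always rejecting NUL is an assumption.

```c
#include <stddef.h>

/*
 * Hypothetical heuristic: treat a buffer as text when it has no NUL byte
 * and at most max_percent percent of its bytes are control codes other
 * than TAB, LF and CR. Returns 1 for "text", 0 for "binary".
 */
int looks_like_text(const unsigned char *buf, size_t n, unsigned max_percent)
{
    size_t suspect = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned char c = buf[i];
        if (c == 0) return 0;                 /* NUL always means binary */
        if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
            suspect++;                        /* e.g. STX (2) or ACK (6) */
        }
    }
    /* Compare suspect/n against max_percent/100 without floating point. */
    return suspect * 100 <= (size_t)max_percent * n;
}
```

With a 2% threshold, a 121343-byte source file containing one STX and one ACK would pass easily, while an executable full of small control bytes would not.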

By the way, Notepad, Notepad++, Visual Studio, etc. have identical
renderings for these characters and consider the file ansi text.

On Thu, Nov 5, 2015 at 4:31 PM, Stephan Beal  wrote:

> On Thu, Nov 5, 2015 at 9:52 PM,  wrote:
>
>> Haha, it would be quite a mess if $ and @ triggered binary.
>> I see no reason to kick the file to binary if the ascii code < 128?
>>
>
> fwiw...
>
> [stephan@host:~/bin]$ hexdump fossil | head
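> 0000000 457f 464c 0102 0001 0000 0000 0000 0000
> 0000010 0002 003e 0001 0000 7be2 0040 0000 0000
> 0000020 0040 0000 0000 0000 0b38 0094 0000 0000
> 0000030 0000 0000 0040 0038 0009 0040 0025 0022
> 0000040 0006 0000 0005 0000 0040 0000 0000 0000
> 0000050 0040 0040 0000 0000 0040 0040 0000 0000
> 0000060 01f8 0000 0000 0000 01f8 0000 0000 0000
> 0000070 0008 0000 0000 0000 0003 0000 0004 0000
> 0000080 0238 0000 0000 0000 0238 0040 0000 0000
> 0000090 0238 0040 0000 0000 001c 0000 0000 0000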
>
> (Assumption: that's "probably" typical of a typical binary file.)
>
> i see only 2 bytes there which are >127d (specifically, 0xf8 and 0xe2),
> and lots below 32d. Plus i see a few 6's and 2's. i think it's unreasonable
> (=highly unconventional) to expect fossil to treat those bytes as "text."
> 0x02 is, according to my local man pages, the "start of text" (control)
> character, which places it implicitly outside the range of bytes used by
> "text."
>
> --
> - stephan beal
> http://wanderinghorse.net/home/stephan/
> http://gplus.to/sgbeal
> "Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
> those who insist on a perfect world, freedom will have to do." -- Bigby Wolf
>


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Scott Robison
On Thu, Nov 5, 2015 at 12:16 PM, Jan Nijtmans 
wrote:

> 2015-11-05 19:37 GMT+01:00  :
> > Hi,
> > I am also trapped with this binary file detection for the egregious use
> of
> > ascii characters 2 and 6 in my code. :(
> >
> > ;//  ascii2+sometexthere+ascii6
> > ;//sometexthere ;<-- pasting here does not show the prefix and suffix
> > ascii characters.
>
> As far as I know, fossil doesn't use control characters to decide
> the file is binary, only the null-byte. So I think something else
> is triggering the binary detection. Too long lines, maybe?
>

Fossil has in the past definitely considered control codes as binary (other
than tab, cr, lf, maybe a couple others).

If you are dealing with C or C++ source code (and I suspect others), those
"bad" control codes are not part of the official source code character set
defined by the standard. Even if they were officially allowed, how should
they be rendered in the user interface? They are non-printing characters by
definition.

Just out of curiosity, why does the OP not want to use escape sequences
which have well defined behavior for both of the above languages and well
defined semantics for printing on a display?
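For what it's worth, a small C sketch of the escape-sequence route, so the control bytes never appear raw in the source file. The names framed_literal and frame_text are invented for this example. One detail to watch: a hex escape swallows every hex digit that follows it, so "\x02123" is not STX followed by "123"; splitting the literal into adjacent pieces (or using the three-digit octal form "\002") avoids that.

```c
#include <stddef.h>
#include <string.h>

#define STX '\x02'   /* ASCII start of text */
#define ACK '\x06'   /* ASCII acknowledge   */

/* Adjacent string literals are concatenated, which ends each hex escape. */
static const char framed_literal[] = "\x02" "123 " "\x06";

/*
 * Write STX + text + ACK + terminating NUL into dst (hypothetical helper).
 * Returns the length without the terminator, or 0 if dst is too small.
 */
size_t frame_text(const char *text, char *dst, size_t cap)
{
    size_t len = strlen(text);

    if (len + 3 > cap) return 0;
    dst[0] = STX;
    memcpy(dst + 1, text, len);
    dst[len + 1] = ACK;
    dst[len + 2] = '\0';
    return len + 2;
}
```

Either form produces the same six bytes for "123 " as pasting the raw characters from a hex editor, but the source file itself stays plain printable ASCII.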

Note: C & C++ technically do not include the characters $ and @ in their
basic source character set. They are typically allowed in source (in
character or string literals) because the standard also requires the
environment to document how they map source characters to the basic source
character set, and practically most compilers treat source as an 8 bit
stream of characters, so all bytes are valid characters under the "how are
they mapped" proviso.

Still, it is awkward to determine how one should render characters like
ASCII STX and ACK, particularly by a web browser based environment that
does not define a rendering for character codes less than 32 (none of
ASCII, ISO-8859, or Unicode have a glyph to represent those). Certainly
fossil could come up with some sort of quoting escape mechanism to show
such characters, but it would be non-standard by definition.

-- 
Scott Robison


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Richard Hipp
On 11/5/15, sky5w...@gmail.com  wrote:
> Well, I have a workaround(no pasted literal strings). I just didn't realize
> Ascii characters within 1-255 could trigger binary?
>
> Maybe a fast histogram, and a count of << 1 or 2% for these ascii
> characters allows text. Or let the user define the valid range.

Please send me the actual file that is causing problems.

>
> By the way, Notepad, Notepad++, Visual Studio, etc. have identical
> renderings for these characters and consider the file ansi text.
>
> On Thu, Nov 5, 2015 at 4:31 PM, Stephan Beal  wrote:
>
>> On Thu, Nov 5, 2015 at 9:52 PM,  wrote:
>>
>>> Haha, it would be quite a mess if $ and @ triggered binary.
>>> I see no reason to kick the file to binary if the ascii code < 128?
>>>
>>
>> fwiw...
>>
>> [stephan@host:~/bin]$ hexdump fossil | head
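>> 0000000 457f 464c 0102 0001 0000 0000 0000 0000
>> 0000010 0002 003e 0001 0000 7be2 0040 0000 0000
>> 0000020 0040 0000 0000 0000 0b38 0094 0000 0000
>> 0000030 0000 0000 0040 0038 0009 0040 0025 0022
>> 0000040 0006 0000 0005 0000 0040 0000 0000 0000
>> 0000050 0040 0040 0000 0000 0040 0040 0000 0000
>> 0000060 01f8 0000 0000 0000 01f8 0000 0000 0000
>> 0000070 0008 0000 0000 0000 0003 0000 0004 0000
>> 0000080 0238 0000 0000 0000 0238 0040 0000 0000
>> 0000090 0238 0040 0000 0000 001c 0000 0000 0000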
>>
>> (Assumption: that's "probably" typical of a typical binary file.)
>>
>> i see only 2 bytes there which are >127d (specifically, 0xf8 and 0xe2),
>> and lots below 32d. Plus i see a few 6's and 2's. i think it's
>> unreasonable
>> (=highly unconventional) to expect fossil to treat those bytes as "text."
>> 0x02 is, according to my local man pages, the "start of text" (control)
>> character, which places it implicitly outside the range of bytes used by
>> "text."
>>
>> --
>> - stephan beal
>> http://wanderinghorse.net/home/stephan/
>> http://gplus.to/sgbeal
>> "Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
>> those who insist on a perfect world, freedom will have to do." -- Bigby
>> Wolf
>>
>> ___
>> fossil-users mailing list
>> fossil-users@lists.fossil-scm.org
>> http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
>>
>>
>


-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Scott Robison
On Thu, Nov 5, 2015 at 1:52 PM,  wrote:

> Haha, it would be quite a mess if $ and @ triggered binary.
> I see no reason to kick the file to binary if the ascii code < 128?
>

Not an unreasonable point of view, but the question becomes: How do you
render character codes less than ASCII SPACE? CR, LF, TAB all have well
defined meanings. How should the rest be rendered?
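One possible answer, sketched in C under the assumption that a viewer is free to pick its own convention: caret notation, as used by many terminals and pagers, where a control byte is shown as '^' followed by the byte XOR 0x40 (so STX becomes ^B and ACK becomes ^F). The function name render_caret is made up for this example, and nothing here reflects what Fossil actually does.

```c
#include <stddef.h>

/*
 * Render a buffer with caret notation for control bytes (illustrative only).
 * TAB, LF and CR pass through unchanged; other bytes below 0x20, and DEL,
 * become two printable characters. Output is NUL-terminated.
 * Returns the rendered length, or 0 if dst is too small.
 */
size_t render_caret(const unsigned char *src, size_t n, char *dst, size_t cap)
{
    size_t out = 0;
    size_t i;

    if (cap == 0) return 0;
    for (i = 0; i < n; i++) {
        unsigned char c = src[i];
        int is_ctrl = (c < 0x20 && c != '\t' && c != '\n' && c != '\r')
                      || c == 0x7F;
        size_t need = is_ctrl ? 2 : 1;

        if (out + need + 1 > cap) return 0;   /* keep room for the NUL */
        if (is_ctrl) {
            dst[out++] = '^';
            dst[out++] = (char)(c ^ 0x40);    /* 0x02 -> 'B', 0x06 -> 'F' */
        } else {
            dst[out++] = (char)c;
        }
    }
    dst[out] = '\0';
    return out;
}
```

The trade-off is the one raised elsewhere in this thread: the display is readable, but it is a convention chosen by the viewer rather than something a standard defines.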

-- 
Scott Robison


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread sky5walk
No difference besides num bytes with or without the embedded Ascii
characters 2 and 6.
I add this 1 line to my file and it triggers binary?!
[Asc2]+"123"+[Asc6][CR+LF]

c:\tryfossil>fossil test-looks-like-utf myfile.txt
File "myfile.txt" has 121343 bytes.
Starts with UTF-8 BOM: no
Starts with UTF-16 BOM: no
Looks like UTF-8: yes
Has flag LOOK_NUL: no
Has flag LOOK_CR: yes
Has flag LOOK_LONE_CR: no
Has flag LOOK_LF: yes
Has flag LOOK_LONE_LF: no
Has flag LOOK_CRLF: yes
Has flag LOOK_LONG: no
Has flag LOOK_INVALID: yes
Has flag LOOK_ODD: no
Has flag LOOK_SHORT: no

On Thu, Nov 5, 2015 at 3:21 PM, Warren Young  wrote:

> On Nov 5, 2015, at 11:37 AM, sky5w...@gmail.com wrote:
> >
> > I am also trapped with this binary file detection for the egregious use
> of ascii characters 2 and 6 in my code. :(
>
> What does “fossil test-looks-like-utf filename” say for that file?


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Richard Hipp
On 11/5/15, Richard Hipp  wrote:
> On 11/5/15, sky5w...@gmail.com  wrote:
>> Hi Dr Hipp, I also sent you "try2.fossil" with the offending file inside.
>> In that repo, I toggled binary-glob settings with no changes to the file
>> classification as binary.
>> You are correct, the diff works for the tiny example file, but not if you
>> try to view the entire file from the ui:
>> http://localhost:8080/tree?ci=tip
>>
>
> I see now.
>
> The diff's are working fine.  But the documentation viewer is rather
> more fussy about what it considers binary.  Probably we just need to
> update the logic in the document viewer.
>

Offending code is here:
https://www.fossil-scm.org/fossil/artifact/10cb5eb292?ln=40-43

I guess sky5walk wants that to allow through any characters other than 0x00

-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] source code file is considered by fossil to be binary.

2015-11-05 Thread Richard Hipp
On 11/5/15, sky5w...@gmail.com  wrote:
> Hi Dr Hipp, I also sent you "try2.fossil" with the offending file inside.
> In that repo, I toggled binary-glob settings with no changes to the file
> classification as binary.
> You are correct, the diff works for the tiny example file, but not if you
> try to view the entire file from the ui:
> http://localhost:8080/tree?ci=tip
>

I see now.

The diff's are working fine.  But the documentation viewer is rather
more fussy about what it considers binary.  Probably we just need to
update the logic in the document viewer.

-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread Stephan Beal
On Sun, Jun 24, 2012 at 5:26 PM, James Bremner ravenspo...@yahoo.com wrote:

> 2.  Can I persuade fossil that this file is really text?


Can you try to re-commit a clean copy on top of it? i won't swear that
will work, but it might be worth a try.

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal


Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread Richard Hipp
On Sun, Jun 24, 2012 at 11:26 AM, James Bremner ravenspo...@yahoo.com wrote:

> One of my source code files is considered by fossil to be binary.
>
> Artifact 9c62b051e82f735a46a249047a2b02374247e991
>
> File unit_test/unit_test.cpp
> 2012-06-22 20:53:14 - part of checkin [a68d8947a6] on branch trunk - cVMG
> class gracefully handles no GPS data; cVMG unit test (user: James ) [annotate]
> (file is 9581 bytes of binary data)
>
> This isn't a huge problem, but I would like to know:
>
> 1.  Why did this happen?


Your file either contains a \000 character, or else it has a single line of
text that is longer than 8191 characters.



> 2.  Can I persuade fossil that this file is really text?


The only part of Fossil that cares is the diff logic.  And, no, if your
file contains a \000 character or a line longer than 8191 characters, then
there is no way to convince diff to ignore that fact.  The diff logic
is used for the diff command (obviously) but also for annotate and
merge.
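To make those two conditions concrete, here is a minimal sketch in C of a check for a NUL byte or an over-long line. This is not Fossil's code; the name has_nul_or_long_line is invented, and it is an assumption that the 8191 limit counts bytes up to, but not including, the newline.

```c
#include <stddef.h>

#define MAX_LINE_LEN 8191   /* limit quoted in this thread */

/*
 * Returns 1 if the buffer contains a NUL byte or a line longer than
 * MAX_LINE_LEN bytes (newline not counted), otherwise 0.
 */
int has_nul_or_long_line(const unsigned char *buf, size_t n)
{
    size_t line_len = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (buf[i] == 0) return 1;
        if (buf[i] == '\n') {
            line_len = 0;                     /* a new line starts here */
        } else if (++line_len > MAX_LINE_LEN) {
            return 1;
        }
    }
    return 0;
}
```

Running a check like this over the whole file contents, read in binary mode, would tell you which of the two conditions (if either) applies.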




> James






-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread James Bremner

> On Sun, Jun 24, 2012 at 5:26 PM, James Bremner ravenspoint-/e1597as9lqavxtiumw...@public.gmane.org wrote:
>> 2.  Can I persuade fossil that this file is really text?
>
> Can you try to re-commit a clean copy on top of it? i won't swear that will
> work, but it might be worth a try.
>
> -- - stephan

Thanks for the suggestion.  I deleted a whitespace character and committed.
The file is still considered 'binary' by fossil.


Artifact 00f9289a16606ebfc37e5235facdba438716f9fa
File unit_test/unit_test.cpp
2012-06-24 17:25:21 - part of checkin [437384f3ee] on branch trunk - mod 
whitespace to force new commit of test_unit.cpp (user: James ) [annotate]
(file is 9580 bytes of binary data)




Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread James Bremner
Richard Hipp drh@... writes:

>> 1.  Why did this happen?
>
> Your file either contains a \000 character, or else it has a single line of
> text that is longer than 8191 characters.

The longest line is 96 characters.

I am not sure how to confirm there is no \000 character
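One way to check, sketched in C: read the file in binary mode in fixed-size chunks and count the zero bytes. The function name count_nul_bytes is made up for this example. Binary mode ("rb") matters on Windows, where text mode may translate line endings and stop at a Ctrl-Z byte.

```c
#include <stdio.h>

/*
 * Count NUL bytes in a file (illustrative helper, hypothetical name).
 * Returns the count, or -1 if the file cannot be opened.
 * A read error simply ends the scan early in this sketch.
 */
long count_nul_bytes(const char *path)
{
    unsigned char chunk[4096];
    size_t got, i;
    long count = 0;
    FILE *fp = fopen(path, "rb");

    if (!fp) return -1;
    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        for (i = 0; i < got; i++) {
            if (chunk[i] == 0) count++;
        }
    }
    fclose(fp);
    return count;
}
```

Calling count_nul_bytes("unit_test.cpp") and getting 0 back would rule out the \000 explanation for that file.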

 
 
>> 2.  Can I persuade fossil that this file is really text?
>
> The only part of Fossil that cares is the diff logic.  And, no, if your file
> contains a \000 character or a line longer than 8191 characters, then there is
> no way to convince diff to ignore that fact.  The diff logic is used for the
> diff command (obviously) but also for annotate and merge.
 
 

My immediate concern is that the web interface no longer displays the code.  
However, the diff and the annotate displays in the web interface are fine.

James




Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread James Bremner

 
> Your file either contains a \000 character, or else it has a single line of
> text that is longer than 8191 characters.
 

The longest line is less than 100 chars.

I wrote a quick program to check for null characters - there do not seem to be
any. ( code below )

It is definitely something about the file.  If I create a new repository and
check this file into it, it is considered binary.

Here's the null check code:


int buflen = 100;   // as posted; must exceed the file size for the scan to cover the whole file
char * buf = (char * )malloc( buflen );

FILE * fp = fopen("unit_test.cpp","r");
if( ! fp ) {
    printf("No file\n");
    return 1;
}

// read the file in one call; give up if the buffer is nearly full
int len = fread(buf,1,buflen,fp);
if( len > buflen - 10 ) {
    printf("longer buf please\n");
    return 1;
}
printf("File length %d\n",len);

// scan every byte that was read for a null character
bool found_null = false;
for( int k = 0; k < len; k++ ) {
    if( buf[k] == '\0' ) {
        printf("NULL character!\n");
        found_null = true;
    }
}
if( ! found_null ) {
    printf("OK no nulls found\n");
}




Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread Richard Hipp
On Sun, Jun 24, 2012 at 11:26 AM, James Bremner <ravenspo...@yahoo.com> wrote:

> One of my source code files is considered by fossil to be binary.
>
> Artifact 9c62b051e82f735a46a249047a2b02374247e991
>
> File unit_test/unit_test.cpp
> 2012-06-22 20:53:14 - part of checkin [a68d8947a6] on branch trunk - cVMG
> class gracefully handles no GPS data; cVMG unit test (user: James ) [annotate]
> (file is 9581 bytes of binary data)


Can you please show me the command that is failing for you?  If possible,
please send me a copy of the file that Fossil thinks is binary.



> This isn't a huge problem, but I would like to know:
>
> 1.  Why did this happen?
>
> 2.  Can I persuade fossil that this file is really text?
>
> James






-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread James Bremner
Richard Hipp <drh@...> writes:

> Can you please show me the command that is failing for you.  If possible,
> please send me a copy of the file that Fossil thinks is binary.
 
 
 

Thank you for your attention to this.

I have sent an email with the file attached to 

d...@sqlite.org

James




Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread James Bremner


You can get the file and see the command at

https://chiselapp.com/user/ravenspoint/repository/test_binary_text/dir?ci=tip

James



Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread Richard Hipp
On Sun, Jun 24, 2012 at 4:14 PM, James Bremner <ravenspo...@yahoo.com> wrote:

> Thank you for your attention to this.
>
> I have sent an email with the file attached to


Thanks for sending the file.

The diff logic decides whether a file is binary as I described previously:
a file is treated as text only if it contains no \000 and has no line longer
than 8191 characters.  However, the artifact webpage is a little more
restrictive.  For the artifact page, a binary file is one that contains any
of several control characters.
See http://www.fossil-scm.org/fossil/artifact/ac5a32d3?ln=40-43 for the
list of character codes that signal to Fossil that the file is binary.  In
your case there is a Ctrl-B (ascii 0x02) in the 2150th byte of the file,
which makes Fossil think it is a binary file.
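
To find such a byte in a file of your own, a scan along the following lines should do. It is only a sketch and not the code behind mimetype_from_content(): the helper name first_control_byte is made up, and the set of "harmless" control characters (tab, LF, VT, FF, CR) is an assumption, so the real list at the link above may differ.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not Fossil's mimetype_from_content().
// Returns the zero-based offset of the first byte that would be treated as
// "binary", or std::string::npos if there is none.
// Assumption: bytes below 0x20 are suspicious unless they are common
// whitespace (0x09 tab, 0x0a LF, 0x0b VT, 0x0c FF, 0x0d CR).
std::size_t first_control_byte(const std::string& data)
{
    for (std::size_t i = 0; i < data.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(data[i]);
        const bool is_whitespace = (c >= 0x09 && c <= 0x0d);
        if (c < 0x20 && !is_whitespace) {
            return i;
        }
    }
    return std::string::npos;
}
```

Reading the whole file into a std::string in binary mode and passing it to this function gives the offset to look at in an editor or hex dump.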

I'm open to tweaking the mimetype_from_content() function if anybody has
any better suggestions on how to determine the mimetype of a file based on
its content.  The function has worked reasonably well, historically, but
clearly falls down for unit_test.cpp.

-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread James Bremner
Richard Hipp <drh@...> writes:

> In your case there is a Ctrl-B (ascii 0x02) in the 2150th byte of the file,
> which makes Fossil think it is a binary file.

Thank you for clarifying this mystery.

FYI: ascii 0x02 is STX = Start of Text.  It is used by many devices that
communicate over RS232 to mark the beginning of a message.  This code is
used to parse messages from such devices, and so the character is sprinkled
all through it.

Other such codes frequently used are:

Dec Oct Hex Bin   Name  Meaning
1   001 01  0001  SOH   Start of Heading
2   002 02  0010  STX   Start of Text
3   003 03  0011  ETX   End of Text
4   004 04  0100  EOT   End of Transmission

James




Re: [fossil-users] source code file is considered by fossil to be binary.

2012-06-24 Thread Scott Robison
It might be better (more portable) to escape those as octal or hex
sequences (like '\002' or '\x02').
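
As a rough sketch of what that could look like (the helper extract_payload and the STX...ETX framing are only assumptions for the example, not taken from the file in question):

```cpp
#include <string>

// Control characters written as escapes, so the source file itself contains
// only printable ASCII.
const char STX = '\x02';  // Start of Text
const char ETX = '\x03';  // End of Text

// Hypothetical helper: returns the payload between the first STX and the
// next ETX, or an empty string if no complete frame is present.
std::string extract_payload(const std::string& raw)
{
    const std::string::size_type start = raw.find(STX);
    if (start == std::string::npos) {
        return std::string();
    }
    const std::string::size_type end = raw.find(ETX, start + 1);
    if (end == std::string::npos) {
        return std::string();
    }
    return raw.substr(start + 1, end - start - 1);
}
```

With the characters spelled this way, the source no longer contains a raw 0x02 byte for a content check to trip over.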
On Jun 24, 2012 3:11 PM, James Bremner <ravenspo...@yahoo.com> wrote:

> Richard Hipp <drh@...> writes:
>
> > In your case there is a Ctrl-B (ascii 0x02) in the 2150th byte of the
> > file, which makes Fossil think it is a binary file.
>
> Thank you for clarifying this mystery.
>
> FYI: ascii 0x02 is STX = Start of Text.  It is used by many devices that
> communicate over RS232 to mark the beginning of a message.  This code is
> used to parse messages from such devices, and so the character is sprinkled
> all through it.
>
> Other such codes frequently used are:
>
> Dec Oct Hex Bin   Name  Meaning
> 1   001 01  0001  SOH   Start of Heading
> 2   002 02  0010  STX   Start of Text
> 3   003 03  0011  ETX   End of Text
> 4   004 04  0100  EOT   End of Transmission
>
> James


