[fossil-users] how to report bug in fossil
Hi, I've found what seems to be a minor bug in fossil (details below), and I'm not sure what the procedure is for reporting it. Can someone enlighten me? The bug: In lookslike.c, invalid_utf8() returns 'invalid' for the input 0xE0, 0xB8, 0x94, which is the Thai character 'do dek' (U+0E14). This can be easily reproduced by trying to commit a file that contains those three bytes and nothing else - you will get the "this file contains invalid UTF-8..." warning. I replaced the code in invalid_utf8() with this UTF-8 validator: https://www.cl.cam.ac.uk/~mgk25/ucs/utf8_check.c and that worked OK, so I'm pretty sure invalid_utf8() is incorrect. Thanks, Ross ___ fossil-users mailing list fossil-users@lists.fossil-scm.org http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
[fossil-users] How to search for changes in checkins?
Hello, Is it possible to search for actual changes in checkins? For example, if I have a repository that has a macro: #define MACRO { if (x) y; } Can I search Fossil for all checkins that included replacing code with MACRO as a change? Thanks, Andy -- TAI64 timestamp: 400057597ff0
[fossil-users] Adding 'bz2' and 'bzip' to mimetypes
Hello, Any chance that 'bz2' and 'bzip' could be added to the 'official' fossil types? (http://fossil-scm.org/index.html/artifact/aad06cde815b7051?ln=71-288) Thanks, Tomek
Re: [fossil-users] How to search for changes in checkins?
On 6/9/16, Andy Bradford wrote:
> Hello,
>
> Is it possible to search for actual changes in checkins? For example, if
> I have a repository that has a macro:
>
> #define MACRO { if (x) y; }
>
> Can I search Fossil for all checkins that included replacing code with
> MACRO as a change?

Fossil does not support that feature at this time. The closest it comes is searching the text of the check-in comment.

I've long thought that this would be a very useful feature to add. But there are technical questions to be answered and issues to overcome:

(1) A full-text index uses roughly 30% as much disk space as the content that is being indexed. (This is a mathematical property of full-text search - not a deficiency in Fossil's implementation.) Doing full text search on all check-in diffs might result in a very large search index.

(2) The default tokenizer for SQLite's full-text search engine is designed for ordinary human-readable text, not for program code. I don't know how well the search would work when applied to C code.

(3) A diff normally includes several lines of unchanged context before and after the lines that were modified. How much of this context should be included in the search index? The current default for display is 6 lines. Do we want more or less than that in the search index?

(4) Initializing the search index means computing all historical check-in diffs. That might take a while. Figure about 10 check-ins per second. Fossil itself currently has 9506 check-ins, so total initialization time would be 15 minutes. (The "10 checkins per second" number is a guess. Processing might go significantly faster. We will need to experiment to know.)

-- 
D. Richard Hipp
d...@sqlite.org
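The estimates in points (1) and (4) above are simple arithmetic, and a small sketch makes it easy to plug in other repository sizes. The function names here are hypothetical, and both the 30% ratio and the 10 check-ins per second rate are the guesses quoted in the message, not measurements.

```c
/* Rough cost estimates for indexing check-in diffs.
 * All names are illustrative; the inputs are the guessed figures
 * from the message above, not measured values. */

/* Minutes needed to compute diffs for `checkins` check-ins
 * at `checkins_per_second`. */
double index_init_minutes(long checkins, double checkins_per_second) {
    return (double)checkins / checkins_per_second / 60.0;
}

/* Approximate index size for `content_bytes` of indexed text,
 * given an index-to-content ratio such as 0.30. */
double index_size_bytes(double content_bytes, double ratio) {
    return content_bytes * ratio;
}
```

With 9506 check-ins at 10 per second this comes to a little under 16 minutes, in line with the rough figure quoted above.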
Re: [fossil-users] how to report bug in fossil
On Jun 9, 2016, at 6:25 AM, rosscann...@fastmail.com wrote:
>
> The bug:
> In lookslike.c, invalid_utf8() returns 'invalid' for the input 0xE0,
> 0xB8, 0x94, which is the Thai character 'do dek' (U+0E14).

I took a look at that code, and there is no possibility for it to be correct. It doesn’t even try to consider 3- and 4-byte sequences.

May I suggest that whoever rewrites this use BOOST_BINARY?

http://www.boost.org/doc/libs/1_61_0/libs/utility/utility.htm#BOOST_BINARY

Despite being from Boost, it is implemented purely in C preprocessor code, so it should work within Fossil.

I make this suggestion because it seems to me that the key source of the error (errors?) in this code comes from trying to work at the hex level on a problem that is inherently about bitwise encoding.

There’s an interesting discussion of the rationale for C not having a binary literal syntax here:

http://stackoverflow.com/q/18244726

C++14 has one, though:

https://en.wikipedia.org/wiki/C%2B%2B14#Binary_literals

I don’t suppose Fossil could get away with using the nonstandard extensions supported by GCC and Clang? That won’t cover native Windows, but Visual C++ 2015 supports the C++14 syntax; I’ve tested it here, and it’s accepted in C code, too.
Re: [fossil-users] How to search for changes in checkins?
On 6/9/16, Marko Käning wrote:
>
>> (4) Initializing the search index means computing all historical
>> check-in diffs. That might take a while. Figure about 10 check-ins
>> per second. Fossil itself currently has 9506 check-ins, so total
>> initialization time would be 15 minutes. (The "10 checkins per
>> second" number is a guess. Processing might go significantly faster.
>> We will need to experiment to know.)
>
> One could make this indexing optional, perhaps, no? A bit like the
> indexing of Wiki/Docu/Tickets info in the UI…

Yes, certainly. It would be just another check-box option beside all of the other search categories.

Someone also suggested that there ought to be a search for Fossil's own internal help screens and documentation. I agree. That wouldn't be hard to do - I just haven't gotten around to doing it yet...

-- 
D. Richard Hipp
d...@sqlite.org
Re: [fossil-users] how to report bug in fossil
I'm new to this list, so take my opinion lightly, but I would hesitate to introduce Boost into the fossil build if it's not there already. Boost adds a long list of complications to any build it's part of:

- it's massive;
- it has dependencies between its modules, so even if you just want a tiny part of it, you might need other parts;
- it has a bewildering array of compiler options.

One of the things I like about fossil is how quickly and easily it builds! It would be a shame to change that.

I like the idea of bit-manipulation macros, but they would be quite easy to craft ad hoc.

Ross

On Fri, Jun 10, 2016, at 06:12 AM, Warren Young wrote:
> May I suggest that whoever rewrites this use BOOST_BINARY?
>
> http://www.boost.org/doc/libs/1_61_0/libs/utility/utility.htm#BOOST_BINARY
>
> Despite being from Boost, it is implemented purely in C preprocessor
> code, so it should work within Fossil.
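As an illustration of how small an ad hoc bit-manipulation macro could be, here is a minimal sketch in plain C. The names BIN8 and is_utf8_continuation are hypothetical and appear neither in Fossil nor in Boost; the macro simply assembles a byte from eight 0/1 arguments so that masks can be read as bit patterns.

```c
/* Hypothetical stand-in for a binary literal: build one byte from
 * eight bits, most significant first. Not taken from Fossil or Boost. */
#define BIN8(b7, b6, b5, b4, b3, b2, b1, b0)                  \
    ((unsigned char)(((b7) << 7) | ((b6) << 6) | ((b5) << 5) | \
                     ((b4) << 4) | ((b3) << 3) | ((b2) << 2) | \
                     ((b1) << 1) | (b0)))

/* True if b has the form 10xxxxxx, i.e. a UTF-8 continuation byte. */
int is_utf8_continuation(unsigned char b) {
    return (b & BIN8(1,1,0,0,0,0,0,0)) == BIN8(1,0,0,0,0,0,0,0);
}
```

For the bytes from the bug report, BIN8(1,1,1,0,0,0,0,0) is the lead byte 0xE0, and both 0xB8 and 0x94 satisfy is_utf8_continuation(). Because the arguments are constants, a compiler can fold each BIN8 use into a single constant.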
Re: [fossil-users] Suggestions for list of branches in web-UI
On 24 May 2016, at 23:42, Marko Käning wrote: > My initial point 1) is still open for discussion: > >> 1) It would be nice, if private branches were marked in /brlist somehow, >> preferably >> by introducing a new column entitled “Private”, or stg similar. > > Anyone out there having a comment for that? Greets, Marko
Re: [fossil-users] Adding 'bz2' and 'bzip' to mimetypes
On Jun 9, 2016, at 9:04 AM, Tomek Kott wrote: > > Any chance that 'bz2' and 'bzip' could be added to the 'official' fossil > types? (http://fossil-scm.org/index.html/artifact/aad06cde815b7051?ln=71-288) If so, then xz should be added, too. It’s becoming quite popular.
Re: [fossil-users] How to search for changes in checkins?
Hi Richard,

On 09 Jun 2016, at 17:16, Richard Hipp wrote:
> I've long thought that this would be a very useful feature to add.

+1

> (1) A full-text index uses roughly 30% as much disk space as the
> content that is being indexed. (This is a mathematical property of
> full-text search - not a deficiency in Fossil's implementation.)
> Doing full text search on all check-in diffs might result in a very
> large search index.

Well, 30% isn’t an issue in most cases, I figure.

> (4) Initializing the search index means computing all historical
> check-in diffs. That might take a while. Figure about 10 check-ins
> per second. Fossil itself currently has 9506 check-ins, so total
> initialization time would be 15 minutes. (The "10 checkins per
> second" number is a guess. Processing might go significantly faster.
> We will need to experiment to know.)

One could make this indexing optional, perhaps, no? A bit like the indexing of Wiki/Docu/Tickets info in the UI…

Greets,
Marko
Re: [fossil-users] How to search for changes in checkins?
On Jun 9, 2016, at 8:40 AM, Andy Bradford wrote:
>
> Hello,
>
> Is it possible to search for actual changes in checkins? For example, if
> I have a repository that has a macro:
>
> #define MACRO { if (x) y; }
>
> Can I search Fossil for all checkins that included replacing code with
> MACRO as a change?

While short of the FTS solution proposed by drh, annotate/blame may be sufficient for your purposes. Just grep its output for MACRO. That will give you the checkin ID of the most recent change to that line, which in all likelihood will be the one where it replaced the prior program text.
Re: [fossil-users] Adding 'bz2' and 'bzip' to mimetypes
Now that we’re at it… On 09 Jun 2016, at 21:38, Warren Young wrote: > On Jun 9, 2016, at 9:04 AM, Tomek Kott wrote: >> >> Any chance that 'bz2' and 'bzip' could be added to the 'official' fossil >> types? (http://fossil-scm.org/index.html/artifact/aad06cde815b7051?ln=71-288) > > If so, then xz should be added, too. It’s becoming quite popular. … what about 7z?
Re: [fossil-users] Pushing and pulling of ticket reports
Hi, On 30 May 2016, at 00:04, Marko Käning wrote: > My question was where I find documented which areas are automatically > pushed/pulled > and whether one can make it clearer perhaps even in the “fossil config pull > help” > or/and something more appropriate. Any feedback on this? Greets, Marko
Re: [fossil-users] When is 'tip' not 'trunk'
On Jun 9, 2016, at 12:23 PM, Clark Christensen wrote:
>
> > Is there a good reason you can’t at least upgrade to the latest point
> > release, 1.34?
>
> I'm not sure what "the latest" offers that I can't live without.

The brief changelog is here:

https://www.fossil-scm.org/index.html/doc/trunk/www/changes.wiki

Note that 1.35 changes appear there, but those are for the yet-to-be-released version.

> there's been a lot of discussion here about the behavior of 'mv' and 'rm'

The specific changelog entry related to that discussion reads:

• Add the --soft and --hard options to fossil rm and fossil mv. The default is still --soft, but that is now configurable at compile-time or by the mv-rm-files setting.

So, no change as far as you are concerned.

> Some want Fossil to modify the filesystem by default. I do not.

I do, and for now, I must build Fossil with a configure script option that is disabled by default in order to make it do what I want.

> I can't remember what the outcome is WRT to Fossil's behavior here.

The last I heard, the behavior of that option will not change until Fossil 2.0, which as far as I can tell is still vaporware.

And even when/if that default does change, I expect there will be a way to return Fossil to its prior behavior. In fact, this is signaled by the very name of the configure script option: --with-legacy-mv-rm.
Re: [fossil-users] how to report bug in fossil
On Thu, Jun 9, 2016 at 2:12 PM, Warren Young wrote:
> On Jun 9, 2016, at 6:25 AM, rosscann...@fastmail.com wrote:
> >
> > The bug:
> > In lookslike.c, invalid_utf8() returns 'invalid' for the input 0xE0,
> > 0xB8, 0x94, which is the Thai character 'do dek' (U+0E14).
>
> I took a look at that code, and there is no possibility for it to be
> correct. It doesn’t even try to consider 3- and 4- byte sequences.

It does consider 3 & 4 byte sequences in a roundabout way. I just committed a one line fix (with multiple lines of comments to clarify what the code is doing in the tricky part).

-- 
Scott Robison
Re: [fossil-users] When is 'tip' not 'trunk'
> When the most recently modified branch is trunk, then tip and trunk
> happen to be the same. Otherwise, they will point to different
> checkins.

Understood. Thanks for the quick answer.

> Is there a good reason you can’t at least upgrade to the latest point
> release, 1.34?

Valid point. I'm not sure what "the latest" offers that I can't live without. Plus, there's been a lot of discussion here about the behavior of 'mv' and 'rm'. Some want Fossil to modify the filesystem by default. I do not. I can't remember what the outcome is WRT to Fossil's behavior here. So I'm lazy about adopting 'the latest' without spending the time to understand what changed. Especially when what I have continues to work reliably.

-Clark

From: Warren Young
To: Fossil SCM user's discussion
Sent: Wednesday, June 8, 2016 3:41 AM
Subject: Re: [fossil-users] When is 'tip' not 'trunk'

On Jun 7, 2016, at 3:24 PM, Clark Christensen wrote:
>
> Hit an unexpected thing today cloning a just-committed repo to a new computer.

What do you get on the “tags:” line of the “fossil status” command’s output when run from the directory where you made that commit? If it is anything other than “trunk,” your results are exactly what you should expect.

> But I don't understand why trunk and tip are not the same.

If they were the same, we wouldn’t need two different terms, now would we?

trunk is the name of a special branch that exists from the creation of the repository. If you make no branches, you will only have the trunk.

tip is the most recent checkin on the most recently modified branch.

When the most recently modified branch is trunk, then tip and trunk happen to be the same. Otherwise, they will point to different checkins.

> Both clients (committer and cloner) have fossil version 1.32 [6c40678e91]
> 2015-03-14 13:20:34 UTC on Windows 7x64

That’s getting a bit old now. Is there a good reason you can’t at least upgrade to the latest point release, 1.34?

I often use old binary versions of Fossil myself…just long enough to clone the fossil-scm.org repo and build the current trunk version.
Re: [fossil-users] how to report bug in fossil
On Jun 9, 2016, at 6:01 PM, Scott Robison wrote:
>
> On Thu, Jun 9, 2016 at 2:12 PM, Warren Young wrote:
> On Jun 9, 2016, at 6:25 AM, rosscann...@fastmail.com wrote:
> >
> > The bug:
> > In lookslike.c, invalid_utf8() returns 'invalid' for the input 0xE0,
> > 0xB8, 0x94, which is the Thai character 'do dek' (U+0E14).
>
> I took a look at that code, and there is no possibility for it to be correct.
> It doesn’t even try to consider 3- and 4- byte sequences.
>
> It does consider 3 & 4 byte sequences in a round about way.

I don’t see that it is checking that the top 2 bits of bytes 3 and 4 are 10, the only legal value. Without that, your tests cannot rule out some illegal values.

That’s why I suggested that this be rewritten in binary. I’d be happier with something like this pseudocode:

    if ((c[0] & 0b11111000) == 0b11110000 && len(c) >= 4) {
        // check following 3 bytes for top bits == 10
        c += 3; // don’t recheck them
    }
    else if ((c[0] & 0b11110000) == 0b11100000 && len(c) >= 3) {
        // same as above, but “2” instead of “3”
    }
    // etc

The corner cases like the 0x10FFFF limit still need to be covered, of course, but only after the checker assures itself that it has a valid “raw UTF-8” value. (“Raw” meaning it passes the basic bit encoding patterns and is now being decoded to make sure it is also a legal Unicode value.)

> I just committed a one line fix (with multiple lines of comments to clarify
> what the code is doing in the tricky part).

Thank you!

Perhaps you will also extend Fossil’s test suite in this area. A bit of Googling turns up these UTF-8 test corpora:

https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
http://www.columbia.edu/~fdc/utf8/
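The bit-pattern approach described in the message above can be sketched in standard C, which lacks binary literals before C23, by writing the masks in hex and the patterns in comments. This is a strict validator written for illustration under those assumptions; the name utf8_is_valid is hypothetical, and the code is not Fossil's invalid_utf8(), which may apply a different policy.

```c
#include <stddef.h>

/* Returns 1 if s[0..n) is well-formed UTF-8, 0 otherwise.
 * Illustrative sketch only; not the code in Fossil's lookslike.c. */
int utf8_is_valid(const unsigned char *s, size_t n) {
    size_t i = 0;
    while (i < n) {
        unsigned char b = s[i];
        size_t len;         /* bytes in this sequence */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point allowed at this length */

        /* Step 1: classify the lead byte by its top bits. */
        if ((b & 0x80) == 0x00)      { len = 1; cp = b;        min = 0x0;     } /* 0xxxxxxx */
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; min = 0x80;    } /* 110xxxxx */
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; min = 0x800;   } /* 1110xxxx */
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; min = 0x10000; } /* 11110xxx */
        else return 0;               /* stray continuation or invalid lead */

        if (n - i < len) return 0;   /* sequence is truncated */

        /* Step 2: every following byte must look like 10xxxxxx. */
        for (size_t k = 1; k < len; k++) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (unsigned long)(c & 0x3F);
        }

        /* Step 3: corner cases, checked only on a well-formed raw value. */
        if (cp < min) return 0;                     /* overlong encoding */
        if (cp > 0x10FFFF) return 0;                /* beyond Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* UTF-16 surrogate */

        i += len;
    }
    return 1;
}
```

The three bytes from the bug report, 0xE0 0xB8 0x94, decode to U+0E14 and pass all three steps, while an overlong pair such as 0xC0 0x80 is rejected in step 3.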
Re: [fossil-users] how to report bug in fossil
Thanks! That patch works for me.

Ross

On Fri, Jun 10, 2016, at 10:01 AM, Scott Robison wrote:
> It does consider 3 & 4 byte sequences in a round about way. I just
> committed a one line fix (with multiple lines of comments to clarify
> what the code is doing in the tricky part).
Re: [fossil-users] how to report bug in fossil
On Thu, Jun 9, 2016 at 6:19 PM, Warren Young wrote:
> On Jun 9, 2016, at 6:01 PM, Scott Robison wrote:
> >
> > On Thu, Jun 9, 2016 at 2:12 PM, Warren Young wrote:
> > On Jun 9, 2016, at 6:25 AM, rosscann...@fastmail.com wrote:
> > >
> > > The bug:
> > > In lookslike.c, invalid_utf8() returns 'invalid' for the input 0xE0,
> > > 0xB8, 0x94, which is the Thai character 'do dek' (U+0E14).
> >
> > I took a look at that code, and there is no possibility for it to be
> > correct. It doesn’t even try to consider 3- and 4- byte sequences.
> >
> > It does consider 3 & 4 byte sequences in a round about way.
>
> I don’t see that it is checking that the top 2 bits of bytes 3 and 4 are
> 10, the only legal values.

Line 162 checks the current byte (c2) and the next byte (c) for validity. Assuming those checks pass, line 171 sets the next byte c to be a prefix byte for the next shorter sequence length ((c2 << 1) + 1) (or space if the valid two byte sequence passed). The next iteration of the loop uses the "faked" prefix byte and checks the next byte for validity.

In this case, the old code took:

    0xE0 0xB8 0x94

It confirmed that c2==0xE0 was a valid prefix byte and that c==0xB8 was a valid next byte. Then, instead of keeping the value of c, it reassigned c to ((c2<<1)+1), or ((0xE0<<1)+1), which truncated to 8 bits is (0xC0+1) == 0xC1.

The bug is that if a three byte sequence starts with 0xE0, the transformed byte becomes 0xC1, which is an invalid prefix for a two byte sequence (it could only begin an overlong encoding). So my code checks for that edge case and changes the value to 0xC2.

The next iteration of the loop checks the "forged" (but okay for our purposes) value of c2:

    0xC2 0x94

It confirms that c2==0xC2 is valid and c==0x94 is valid. Since this is a valid two byte sequence now, it sets the value of c to a space character, which is always valid UTF-8.

It's not intuitive, and I only discovered it after staring at the code for a while and playing computer with a pencil and paper.

I didn't test every possible byte sequence, of course, but the handful I tried manually now decode correctly. The only problem I found was with three byte sequences that start with 0xE0.

> Perhaps you will also extend Fossil’s test suite in this area. A bit of
> Googling turns up these UTF-8 test corpora:
>
> https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
> http://www.columbia.edu/~fdc/utf8/

I've never looked at the fossil test suite. I'll see what I can do.

-- 
Scott Robison
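The prefix-forging step described in the message above can be traced with a small sketch. This is a reconstruction from the prose only, not the code in lookslike.c: the function names are hypothetical, and the 0xE0 threshold for "two byte sequence finished, substitute a space" is an assumption inferred from the description.

```c
/* Reconstruction of the described "forged prefix" step; names and the
 * 0xE0 threshold are assumptions, not copied from Fossil. */

/* Old behaviour: after validating lead byte c2 and its first
 * continuation byte, produce the stand-in byte for the next iteration. */
unsigned char forged_prefix_old(unsigned char c2) {
    if (c2 < 0xE0) return ' ';              /* two byte sequence is done */
    return (unsigned char)((c2 << 1) + 1);  /* shift, truncate to 8 bits */
}

/* Fixed behaviour: 0xE0 forges 0xC1, which is never a valid lead byte,
 * so that one edge case is replaced with 0xC2. */
unsigned char forged_prefix_fixed(unsigned char c2) {
    unsigned char c = forged_prefix_old(c2);
    return (c == 0xC1) ? 0xC2 : c;
}
```

Tracing the reported input: a lead byte of 0xE0 forges 0xC1 under the old rule and 0xC2 under the fixed one, while a four byte lead such as 0xF0 forges 0xE1, a three byte prefix, in both versions.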
Re: [fossil-users] how to report bug in fossil
On Jun 9, 2016, at 3:21 PM, rosscann...@fastmail.com wrote:
>
> - it's massive;

It’s also open source under one of the most liberal licenses available. Fossil could just swipe the one header file it needs. It hasn’t changed since 2005, so one may presume that it is stable.

> - it has dependencies between its modules, so even if you just want a
> tiny part of it, you might need other parts;

That header does include others, so yes, someone would have to work out whether you run into an untenable dependency chain. I haven’t looked deeply into it, but I suspect it could be boiled down to a single reasonably-small header file.

> - it has a bewildering array of compiler options.

Much of Boost is preprocessor- or template-only code, not requiring that you build the Boost libraries at all.

I have personally never used any of the Boost compiled libraries, not wanting to distribute them as dependencies on systems that don’t include Boost in the OS’s package repo, some of which we still need to support.

> One of the things I like about fossil is how quickly and easily it
> builds! It would be a shame to change that.

Agreed. But I think we’re only talking about adding one smallish header file here, not making all of Boost a prerequisite.

> I like the idea of bit-manipulation macros, but they would be quite easy
> to craft ad hoc.

Patches thoughtfully considered. :)
Re: [fossil-users] how to report bug in fossil
rosscann...@fastmail.com wrote:
>
> The bug:
> In lookslike.c, invalid_utf8() returns 'invalid' for the input 0xE0,
> 0xB8, 0x94, which is the Thai character 'do dek' (U+0E14). This can be
> easily reproduced by trying to commit a file that contains those three
> bytes and nothing else - you will get the "this file contains invalid
> UTF-8..." warning.

Thanks for the report. Jan, any hints on this one?

-- 
Joe Mistachkin