Re: [fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2012-02-02 Thread Julian Fagir
Hi,

 Compiling went fine, and after updating sqlite, it didn't even throw SQL
 errors anymore. ;-)
 So far, I can initialise a repository, add files, start a server, but
 whenever I want to commit, I get the message:
 fossil: manifest file (3) is malformed
I'm sorry I waited until now, but the problem persists, though there was a
change in 1.21 (at least there was an entry in the changelog).

 The second issue is about fossil 1.20: I configured it manually and built
 it, but whenever I want to commit with that version, it simply coredumps.
 The pkgsrc-1.18 does not patch anything, but just compile fossil, so I
 think this is rather a fossil issue.
1.21 compiled btw fine on my machine. :-)

Regards, Julian
___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


Re: [fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2011-11-14 Thread Stephan Beal
2011/11/13 Lluís Batlle i Rossell <vi...@viric.name>

 I agree with Julian. There should be an answer, if the letter does not
 reach the
 list. I also like when it is not required to subscribe to send mails.


In my experience, requiring subscription cuts down greatly on the amount of
noise and essentially blocks all (or 99% of) spam from mailing lists.


 I don't think the write is related to the SIGBUS.


i don't think it has anything directly to do with it, either. i suspect
it's just bad timing or possibly memory corruption caused by stack abuse at
some other point. That said, i think that any such bug is probably
compiler-specific, since none of us are seeing it on non-sparc platforms. i
will run push/pull through valgrind this evening, but i don't expect to see
anything more drastic than a couple of standard leaks we have in (e.g.)
the argument/parameter handling.


 If it is similar to 'strace',
 the '= 512' means that the syscall succeeded and returned 512.


correct: 512 is the return value of the write() call, == the number of
bytes it was asked to write (512).
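
To illustrate how that return value is usually consumed, here is a minimal
sketch in C. The helper name write_all is hypothetical and is not taken from
fossil or SQLite; it only shows that a successful write() reports the number
of bytes written, which is the value a tracer prints after the "=".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes to fd, retrying on
   short writes. Returns 0 on success, -1 if write() reports an error. */
int write_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    assert(buf != NULL || len == 0);
    while (len > 0) {
        /* n is the value a syscall tracer shows after the "=" sign. */
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;  /* interrupted, try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A return of 512 for a 512-byte request, as in the trace, means the loop above
would finish after a single call.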

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/


[fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2011-11-13 Thread Julian Fagir
Hi,

so I'm sending this mail the fourth time... are mails from outside the
mailing list dropped? So far I have received no notice that my mail is
awaiting approval, and it was sent three days ago (g...@komkon2.de).

On my NetBSD sparc64 I installed fossil 1.18 (the most recent version on
pkgsrc):
NetBSD $HOSTNAME 5.99.55 NetBSD 5.99.55 (NETRAT1) #6: Thu Jul 28
15:14:56 CEST 2011 $BUILDCONFIG sparc64

Compiling went fine, and after updating sqlite, it didn't even throw SQL
errors anymore. ;-)
So far, I can initialise a repository, add files, start a server, but
whenever I want to commit, I get the message:
fossil: manifest file (3) is malformed

No matter how much I add or what it is. A typical log:
$ fossil init blakopf
project-id: 52d8d1d05ce6b474e88e19a19b8a9a54a800589a
server-id:  b374ce2c60b1f7849e050f724f7f37dbd046f98e
admin-user: gnrp (initial password is 115ecd)
$ fossil open blakopf
$ touch distinfo
$ fossil add distinfo
ADDED  distinfo
$ fossil commit -m toll
New_Version: 248065b1c79f682e5138ebccdaf2c6da0d4e9070
fossil: manifest file (3) is malformed
$

Then I installed the binary file for sparc (32-bit), and tested it as well on
my old SparcStation LX, and there it ran fine.
This works for now, but I'd rather like to have the right and self-compiled
version for my architecture.


The second issue is about fossil 1.20: I configured it manually and built it,
but whenever I want to commit with that version, it simply coredumps. The
pkgsrc 1.18 package does not patch anything, it just compiles fossil, so I
think this is rather a fossil issue.


I can of course provide traces of the calls, if you want.


Regards, Julian


Re: [fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2011-11-13 Thread Stephan Beal
On Sun, Nov 13, 2011 at 8:53 PM, Julian Fagir <listensamm...@komkon2.de> wrote:

 so I'm sending this mail the fourth time... are mails from outside the
 mailing list dropped?


Your prior mails never made it to the list - this is the first one we've
gotten.



 Compiling went fine, and after updating sqlite, it didn't even throw SQL
 errors anymore. ;-)


Updating sqlite3? Are you linking it with a custom version instead of the
one which is part of the fossil distro?


 $ touch distinfo
 $ fossil add distinfo
 ADDED  distinfo
 $ fossil commit -m toll


is this also happening with non-empty files?


 The second issue is about fossil 1.20: I configured it manually and built
 it,
 but whenever I want to commit with that version, it simply coredumps. The
 ...
 I can of course provide traces of the calls, if you want.


Yes, please - without those we can't do more than speculate about what the
problem might be.

AFAIK none of the regulars on this list use fossil on sparc platforms
(especially BSD on sparc), so any details you can provide would be helpful.

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/


Re: [fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2011-11-13 Thread Lluís Batlle i Rossell
On Sun, Nov 13, 2011 at 08:53:40PM +0100, Julian Fagir wrote:
 Then I installed the binary file for sparc (32-bit), and tested it as well on
 my old SparcStation LX, and there it ran fine.
 This works for now, but I'd rather like to have the right and self-compiled
 version for my architecture.

Sure. Maybe you can debug this further?

For what it is worth, fossil runs quite well for me on linux-armv5tel. I only
noticed some minor trouble with timestamps handled as floating-point values;
maybe the FPU emulation there gives slightly different values than other FPUs.

I could also try on mips-n32; I have not tried yet. Do you think it might
behave similarly?

Regards,
Lluís.


Re: [fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2011-11-13 Thread Stephan Beal
On Sun, Nov 13, 2011 at 10:10 PM, Julian Fagir <listensamm...@komkon2.de> wrote:

 Having to register to send mails to the list (and especially getting no
 reports) is somewhat... unintuitive.


i can't even guess as to why the mailing list hasn't been working for you.


  Updating sqlite3? Are you linking it with a custom version instead of the
  one which is part of the fossil distro?
 I'm sorry I don't remember that anymore. It didn't work, then I updated
 sqlite3, then recompiled it, then it worked.


When you say "updated sqlite3", what exactly do you mean? Fossil has its
own embedded copy. Did you replace that one (under src/sqlite3.*) with one
you got from somewhere else?

There are three files attached:
  * configure log of 1.20
  * ktruss of 1.20 coredumping


This part:

11868  1 fossil   write(0x5, 0x40b16808, 0x200) = 512, 1085368328

 
\0\0\0\0\0\0\0\0\0\0\0\0\M-/\M^H\M-66\0\0\0\0\0\0\^B\0\0\0\^D\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0
 11868  1 fossil   SIGBUS SIG_DFL

As best as i can tell that's your OS segfaulting, not fossil. It returns
from a write() and then immediately throws a sigbus? sigbus is something
i've only seen on sparc, and i've personally only seen it when linking to
invalid system libs (e.g. those compiled for other platforms). A bit of
googling shows sigbus to sometimes be alignment-related. The address being
passed to write() is 0x40b16808 (dec=1085368328), which should be
properly aligned for 32/64-bit.
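
The alignment claim can be checked mechanically. A minimal sketch in C,
where is_aligned is a hypothetical helper and not part of fossil or SQLite:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of align, 0 otherwise.
   align must be a non-zero power of two. */
int is_aligned(uintptr_t addr, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    /* For a power of two, the low bits must all be zero. */
    return (addr & (align - 1)) == 0;
}
```

With the address from the trace, is_aligned(0x40b16808, 4) and
is_aligned(0x40b16808, 8) both hold, while the same check with 16 fails. So
the buffer is suitably aligned for 4-byte and 8-byte accesses.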

Can you run it through gdb (or the platform's equivalent) and give us a
backtrace after the crash?

Other googling suggests that a cast in one part of the code can cause
corruption which first shows up as a sigbus further downstream. Could you
run it through valgrind? That would help us rule out memory corruption.

 * ktruss of 1.18 complaining about manifest file


1.18 is ancient and won't tell us much - a lot has happened since then.

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/


Re: [fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2011-11-13 Thread Lluís Batlle i Rossell
On Sun, Nov 13, 2011 at 11:02:06PM +0100, Stephan Beal wrote:
 On Sun, Nov 13, 2011 at 10:10 PM, Julian Fagir <listensamm...@komkon2.de> wrote:
  Having to register to send mails to the list (and especially getting no
  reports) is somewhat... unintuitive.
 
 
 i can't even guess as to why the mailing list hasn't been working for you.

I agree with Julian. There should be an answer, if the letter does not reach the
list. I also like when it is not required to subscribe to send mails.
 
 This part:
 
 11868  1 fossil   write(0x5, 0x40b16808, 0x200) = 512, 1085368328
 
  
 \0\0\0\0\0\0\0\0\0\0\0\0\M-/\M^H\M-66\0\0\0\0\0\0\^B\0\0\0\^D\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0
  11868  1 fossil   SIGBUS SIG_DFL
 
 As best as i can tell that's your OS segfaulting, not fossil. It returns
 from a write() and then immediately throws a sigbus? sigbus is something
 i've only seen on sparc, and i've personally only seen it when linking to
 invalid system libs (e.g. those compiled for other platforms). A bit of
 googling shows sigbus to sometimes be alignment-related. The address being
 passed to write() is 0x40b16808 (dec=1085368328), which should be
 properly aligned for 32/64-bit.

I don't think the write is related to the SIGBUS. If it is similar to 'strace',
the '= 512' means that the syscall succeeded and returned 512.

 Other googling suggests that a cast in one part of the code can cause
 corruption which first shows up as a sigbus further downstream. Could you
 run it through valgrind? That would help us rule out memory corruption.

I don't think valgrind runs on sparc.

gdb will catch the sigbus perfectly. It should be quite tricky C code for a
C compiler to generate bad-aligned accesses for a given platform. I'd like to
know where is that bad access; I've not checked, but I'd imagine fossil has no
tricky C code.

Regards,
Lluís.


Re: [fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2011-11-13 Thread Richard Hipp
2011/11/13 Lluís Batlle i Rossell <vi...@viric.name>

  It should be quite tricky C code for a
 C compiler to generate bad-aligned accesses for a given platform. I'd like
 to
 know where is that bad access; I've not checked, but I'd imagine fossil
 has no
 tricky C code.


Fossil doesn't have any tricky C code (or at least it shouldn't - if you
find some we'll call it a bug.)  But SQLite does definitely have tricky C
code.  We've had SIGBUS problems running SQLite on sparc before, but I
thought we had fixed all those.  On the other hand, I don't have sparc on
which to test SQLite so I'm not really sure.

I know we don't have alignment problems on PPC, but I don't know if PPC is
as fussy about alignment as Sparc is







-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2011-11-13 Thread Julian Fagir
Hi,

please excuse my overlengthy lines, it was easier to not-wrap this content.

 As best as i can tell that's your OS segfaulting, not fossil. It returns
 from a write() and then immediately throws a sigbus? sigbus is something
 i've only seen on sparc, and i've personally only seen it when linking to
 invalid system libs (e.g. those compiled for other platforms). A bit of
 googling shows sigbus to sometimes be alignment-related. The address being
 passed to write() is 0x40b16808 (dec=1085368328), which should be
 properly aligned for 32/64-bit.

Though I wouldn't bet on it, I think I've already seen SIGBUS on x86 with
FreeBSD.

   It should be quite tricky C code for a
  C compiler to generate bad-aligned accesses for a given platform. I'd like
  to
  know where is that bad access; I've not checked, but I'd imagine fossil
  has no
  tricky C code.
 
 
 Fossil doesn't have any tricky C code (or at least it shouldn't - if you
 find some we'll call it a bug.)  But SQLite does definitely have tricky C
 code.  We've had SIGBUS problems running SQLite on sparc before, but I
 thought we had fixed all those.  On the other hand, I don't have sparc on
 which to test SQLite so I'm not really sure.
You're right, seems to be sqlite.

$ gdb fossil-src-20111021125253/fossil 
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc64--netbsd"...
(gdb) run init blakopf
Starting program: /home/gnrp/temp/fossil-src-20111021125253/fossil init blakopf

Program received signal SIGBUS, Bus error.
0x001b7560 in sqlite3CreateIndex (pParse=0x40b03008, pName1=<value 
optimized out>, pName2=<value optimized out>, pTblName=0x0, pList=0x40b10288, 
onError=<value optimized out>, pStart=0x0, pEnd=0x0, sortOrder=0, 
ifNotExist=0) at ./src/sqlite3.c:82174
82174   pIndex->azColl[i] = zColl;
(gdb)
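
The faulting line stores a pointer into an array. On sparc64 pointers are
8 bytes wide and such a store must hit an 8-byte-aligned address. One way
this can go wrong, offered here purely as an assumption and not as a
diagnosis of sqlite3.c, is when several arrays are carved out of a single
allocation and a pointer array follows a block whose size is not a multiple
of 8. The names round_up8 and ptr_array_offset below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of 8. */
size_t round_up8(size_t n) {
    return (n + 7) & ~(size_t)7;
}

/* Hypothetical layout: n_cols 4-byte counters sit at the start of an
   8-byte-aligned block, followed by an array of pointers. Returns the
   byte offset at which the pointer array can start safely. */
size_t ptr_array_offset(size_t n_cols) {
    size_t counters = n_cols * 4;      /* 3 columns -> 12 bytes */
    size_t off = round_up8(counters);  /* 12 -> 16, a multiple of 8 */
    assert(off % 8 == 0);
    return off;
}
```

Without the rounding step, three counters would put the pointer array at
offset 12, which x86 tolerates and a strict-alignment CPU does not.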


I used svn before, which uses sqlite too (at least it is listed as a
dependency), and it worked flawlessly.

Regards, Julian


Re: [fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2011-11-13 Thread Joerg Sonnenberger
On Sun, Nov 13, 2011 at 05:15:59PM -0500, Richard Hipp wrote:
 2011/11/13 Lluís Batlle i Rossell <vi...@viric.name>
 
   It should be quite tricky C code for a
  C compiler to generate bad-aligned accesses for a given platform. I'd like
  to
  know where is that bad access; I've not checked, but I'd imagine fossil
  has no
  tricky C code.
 
 
 Fossil doesn't have any tricky C code (or at least it shouldn't - if you
 find some we'll call it a bug.)  But SQLite does definitely have tricky C
 code.  We've had SIGBUS problems running SQLite on sparc before, but I
 thought we had fixed all those.  On the other hand, I don't have sparc on
 which to test SQLite so I'm not really sure.

There are some candidates I can think of. The easiest is definitely to
just run gdb on a debug build and report the back trace of the crash.

 I know we don't have alignment problems on PPC, but I don't know if PPC is
 as fussy about alignment as Sparc is

It isn't. (Older) ARM and SPARC are the most interesting platforms when
it comes to strict alignment. PPC is normally configured to not trap,
just like x86.
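
For code that has to read a value from an address that may not be aligned,
the usual portable approach is to copy the bytes instead of dereferencing a
cast pointer. A minimal sketch, with load_u32 as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from a possibly misaligned address.
   memcpy has no alignment requirement, so this does not trap on
   strict-alignment CPUs, unlike *(const uint32_t *)p. */
uint32_t load_u32(const void *p) {
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

On platforms that do not trap, compilers typically reduce this to a plain
load, so the portable form should not cost much there.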

Joerg


Re: [fossil-users] fossil coredumping and reporting malformed manifest on sparc64

2011-11-13 Thread Lluís Batlle i Rossell
On Sun, Nov 13, 2011 at 11:21:43PM +0100, Joerg Sonnenberger wrote:
 It isn't. (Older) ARM and SPARC are the most interesting platforms when
 it comes to strict alignment. PPC is normally configured to not trap,
 just like x86.

I've only had sigbus troubles on mips, but I think I could have had them on arm.
But iirc, linux on arm has a sigbus handler enabled by default that, although
slow, fixes up badly aligned accesses. I imagine most picky architectures
have similar code, but not all of them have it enabled by default.

Does the sparc kernel you use have that option, Julian? It can help at least to
allow you running fossil.

Regards,
Lluís.