Re: fsid change of ZFS?
On Wed, Aug 24, 2011 at 04:41:25PM +0300, Kostik Belousov wrote:
> On Wed, Aug 24, 2011 at 09:36:37AM -0400, Rick Macklem wrote:
> > Well, doesn't this result in the same issue as the fixed table? In other words, the developer has to supply the suggested byte for fsid and make sure that it doesn't conflict with other suggested byte values or suffer the same consequence as forgetting to update the fixed table. (ie. It just puts the fixed value in a different place, from what I see, for in-tree modules. Also, with a fixed table, they are all in one place, so it's easy to choose a non-colliding value?)
>
> The reason for my proposal was Pawel's note that a porter of the filesystem should be aware of some place in kern/ where to register, besides writing the module.

Well, he has to be aware, but we should do all we can to minimize the number of places he needs to update, as it is easy to forget some. I agree with Rick that what you proposed is similar to a fixed table of file system names and I'd prefer to avoid that. If we can have a name-based hash that produces no collisions for in-tree file systems and known current 3rd party file systems, plus collision detection for the future, then it is good enough, IMHO. And this is what Rick proposed with his patch.

-- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com
Re: fsid change of ZFS?
On Sat, Aug 20, 2011 at 08:15:34PM -0400, Rick Macklem wrote:
> Hiroki, could you please test the attached patch. One problem with this patch is that I don't know how to create a fixed table that matches what systems would already have been getting. (I got the first 6 entries by booting a GENERIC i386 kernel with a printf in vfs_init(), so I suspect those don't change much, although I'm not sure if ZFS will usually end up before or after them?) Do you guys know what ZFS gets assigned typically? (I realize that changes w.r.t. when it gets loaded, so the question also becomes do you know how it typically gets loaded so the table can have that vfc_typenum value assigned to it?) Maybe you could boot a system with a printf like:
>
>     printf("%s, %d\n", vfc->vfc_name, vfc->vfc_typenum);
>
> just after
>
>     vfc->vfc_typenum = maxvfsconf++;
>
> in vfs_init() and then look in dmesg after booting, to see what your tables look like? (Without the attached patch installed.)

Rick, I'm sorry to arrive so late, but in my opinion hardcoding a list of file systems in the kernel is a step in the wrong direction, really. We are trying to keep things modularized, so that there are no such things lying around that have to be cleaned up when a file system goes away or updated when a new file system arrives. I remember for example the fts code, where I found that it keeps a list of file systems that can be handled faster. ZFS could have been handled faster, but I found this only after a few years. For this case there should be a VFCF_* flag that fts should recognize, instead of hardcoded file system names. This was also the reason that when I added support for jail-friendly file systems and support for file systems with delegated administration I didn't create a list of file system types that support it, but added the VFCF_JAIL and VFCF_DELEGADMIN flags. Here you cannot use those flags to solve the problem, but hardcoding file system types in an array is really not the way to go.

I much prefer Ben's idea of calculating a hash from the file system name and detecting collisions.

-- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com
Re: fsid change of ZFS?
On Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem wrote: Ok, I'll admit I wasn't very fond of a fixed table that would inevitably get out of date someday, either. I didn't think hashing for the cases not in the table was worth the effort, but doing a hash instead of a table seems reasonable. I see that ZFS only uses the low order 8 bits, so I'll try and come up with an 8bit hash solution and will post a patch for testing/review soon. I don't think the vfs_sysctl() is that great a concern, given that it appears to be deprecated already anyhow. (With an 8bit hash, vfs_typenum won't be that sparse.) I'll also make sure that whatever hash I use doesn't collide for the current list of file names (although I will include code that handles a collision in the patch). Sounds great. Thanks! -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com pgpDcE0skKKef.pgp Description: PGP signature
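For readers following the thread, here is a minimal sketch of the idea being agreed on: derive an 8-bit type number from the file system name and fall back to another slot when two names collide. This is not Rick's patch. The function names (name_hash8, assign_typenum), the choice of FNV-1a folded to 8 bits, and the linear probing on collision are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TYPENUM_SLOTS 256

/* One flag per 8-bit type number; true means the slot is already taken. */
bool typenum_used[TYPENUM_SLOTS];

/* 32-bit FNV-1a over a NUL-terminated name, folded down to 8 bits. */
uint8_t
name_hash8(const char *name)
{
	uint32_t h = 2166136261u;		/* FNV offset basis */

	for (; *name != '\0'; name++) {
		h ^= (uint8_t)*name;
		h *= 16777619u;			/* FNV prime */
	}
	return ((uint8_t)(h ^ (h >> 8) ^ (h >> 16) ^ (h >> 24)));
}

/*
 * Assign a type number derived from the name.  If the hashed slot is
 * already in use (a collision), probe linearly for the next free one.
 * Returns -1 when every slot is taken.
 */
int
assign_typenum(const char *name)
{
	unsigned int start = name_hash8(name);
	unsigned int i, slot;

	for (i = 0; i < TYPENUM_SLOTS; i++) {
		slot = (start + i) % TYPENUM_SLOTS;
		if (!typenum_used[slot]) {
			typenum_used[slot] = true;
			return ((int)slot);
		}
	}
	return (-1);
}
```

With a scheme like this, a name that does not collide gets the same number on every boot regardless of load order, which is the property the fixed table was meant to provide. A name that does collide still depends on registration order, so the hash would need to be checked against the current list of file system names, as Rick says he will do.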
Re: fsid change of ZFS?
On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote: Pawel Jakub Dawidek wrote: On Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem wrote: Ok, I'll admit I wasn't very fond of a fixed table that would inevitably get out of date someday, either. I didn't think hashing for the cases not in the table was worth the effort, but doing a hash instead of a table seems reasonable. I see that ZFS only uses the low order 8 bits, so I'll try and come up with an 8bit hash solution and will post a patch for testing/review soon. I don't think the vfs_sysctl() is that great a concern, given that it appears to be deprecated already anyhow. (With an 8bit hash, vfs_typenum won't be that sparse.) I'll also make sure that whatever hash I use doesn't collide for the current list of file names (although I will include code that handles a collision in the patch). Sounds great. Thanks! Here's the patch. (Hiroki could you please test this, thanks, rick.) ps: If the white space gets trashed, the same patch is at: http://people.freebsd.org/~rmacklem/fsid.patch The patch is fine by me. Thanks, Rick! -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com pgpsuJPptDK8Q.pgp Description: PGP signature
Re: Weird issue with hastd(8)
On Sat, Jun 25, 2011 at 05:54:13PM +0300, Mikolaj Golub wrote:
> For me the idea to send updates to secondary only via synchronization thread, starting it periodically looks interesting. Sure it should not be the replacement for real async mode, but having something like this in hast apart other synchronization modes might be useful. Comparing it with real async that is described in manual it has the following advantages:
>
> 1) It is much easier to implement.
> 2) If you have frequent updates of the same blocks, real async will send them all, while with sync thread approach we will skip many intermediate updates.

I must say I don't agree with your points here.

We should not implement one more replication mode just because it is easier to implement. Imagine the situation when we finally get a proper 'async' mode and need to explain to the users the difference between 'async' and 'async2': async2 was easier to implement back when we had no async yet, but for you it does more or less the same. And we will need to keep support for both of them. If anything, I'd prefer to call it 'async' and then change the underlying algorithm entirely. This will handle users' confusion, but still leaves the need for protocol compatibility between hastds implementing the older and the newer 'async'.

The second argument reveals the weakness of this approach. The very important thing is to keep data consistent when nodes are connected. By 'consistent' I mean that at every point in time, if the primary dies, the secondary can start operating - it may have a bit older data in async mode, but the data will be consistent - you can fsck the file system and start your services. In the way you described, no care is taken to move the data to the secondary node in the proper order, ie. some later writes can be sent before earlier writes, because eg. they are placed in a lower extent, and if you have a primary failure right there, the secondary's view of the data won't be consistent and your file system will most likely be corrupt.

In async mode you can skip and combine only consecutive writes. For example if your queue contains the following writes (number. offset size):

    1.    0 1024
    2.  512 1024
    3.    0 1024
    4. 4096 1024
    5.    0 1536

You can compress it to:

    2+3.    0 1536
    4.   4096 1024
    5.      0 1536

Here we ignore the first write entirely and combine writes 2 and 3, but we cannot simply skip the first three writes only because we have the fifth write that covers them, as there is the 4096,1024 request in between.

-- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com
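The rule from the message above (combine only writes that are neighbours in the queue, never reorder across an unrelated write) can be sketched as follows. This is not hastd code: struct wr, wr_mergeable() and wr_coalesce() are hypothetical names, and the sketch tracks only offsets and sizes, assuming that a real implementation would also overlay the data of the later write on top of the earlier one.

```c
#include <stddef.h>
#include <stdint.h>

struct wr {
	uint64_t off;	/* byte offset of the write */
	uint64_t len;	/* length of the write in bytes */
};

/* True if the two extents overlap or touch, so their union is one extent. */
int
wr_mergeable(const struct wr *a, const struct wr *b)
{
	return (a->off <= b->off + b->len && b->off <= a->off + a->len);
}

/*
 * Coalesce a queue of writes in place.  A write is merged only into the
 * entry that immediately precedes it in queue order, so the relative
 * order of unrelated writes is preserved.  Returns the number of
 * entries left in q[].
 */
size_t
wr_coalesce(struct wr *q, size_t n)
{
	size_t in, out;
	uint64_t end, end2;

	if (n == 0)
		return (0);
	out = 0;
	for (in = 1; in < n; in++) {
		if (wr_mergeable(&q[out], &q[in])) {
			/* Grow q[out] to the union of both extents. */
			end = q[out].off + q[out].len;
			end2 = q[in].off + q[in].len;
			if (q[in].off < q[out].off)
				q[out].off = q[in].off;
			q[out].len = (end > end2 ? end : end2) - q[out].off;
		} else {
			/* Unrelated write: it becomes the next output entry. */
			q[++out] = q[in];
		}
	}
	return (out + 1);
}
```

Running wr_coalesce() on the five writes from the example leaves three entries: (0, 1536), (4096, 1024) and (0, 1536). The fifth write is not folded into the first three because the write at offset 4096 sits between them in the queue.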
Re: Randomization in hastd(8) synchronization thread
On Tue, May 17, 2011 at 12:39:19PM -0700, Maxim Sobolev wrote:
> Hi Pawel,
>
> I am trying to use hastd(8) over slow links and one problem is apparent right now - current approach with synchronizing content sequentially is not working in this case. What happens is that hastd hits the first frequently updated block and cannot make any progress anymore. In my case I have 30GB of dirty space to be synchronized over just 1mbps uplink. The quick fix that I've applied is randomization in the block selection code. This way eventually all least used blocks will be synchronized, leaving only hot ones dirty. More effective approach would be to use some kind of LRU selection algorithm, but statistical approach would work just as good in this case. Please review the patch below: http://sobomax.sippysoft.com/activemap.c.diff

Hmm, hastd keeps a separate bitmap for synchronization. It is stored in the am_syncmap field. Blocks that are dirtied during regular writes should not affect the synchronization bitmap or synchronization progress.

-- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com
Re: geli on r221012
On Mon, Apr 25, 2011 at 01:31:55PM +, Anton Yuzhaninov wrote: Geli no longer works for me after upgrade to r221012. # geli attach -k ~citrin/private.key /dev/label/spool2 Enter passphrase: # from dmesg: GEOM_ELI: Device label/spool2.eli created. GEOM_ELI: Encryption: Blowfish-CBC 128 GEOM_ELI: Integrity: HMAC/MD5 GEOM_ELI: Crypto: software # dd if=/dev/label/spool2.eli of=/dev/null dd: /dev/label/spool2.eli: Invalid argument 0+0 records in 0+0 records out 0 bytes transferred in 0.000669 secs (0 bytes/sec) Thanks for the report! It should be fixed in r221628. -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com pgpUlmhHjBPXE.pgp Description: PGP signature
Re: panic: g_eli_key_hold: sc_ekeys_total=1
On Sun, Apr 24, 2011 at 11:12:03AM +0200, Fabian Keil wrote:
> The panic can be reproduced with:
>
>     /sbin/geli onetime -l 256 -s 4096 /dev/ada0s1b

That's why I asked for the ada0s1b size. It should be fixed in HEAD (r220984).

-- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com
Re: panic: g_eli_key_hold: sc_ekeys_total=1
On Fri, Apr 22, 2011 at 05:04:01PM +0200, Fabian Keil wrote: With sources from today my system panics at boot time after attaching the swap device: GEOM_ELI: Device ada0s1b.eli created. GEOM_ELI: Encryption: AES-XTS 256 GEOM_ELI: Crypto: software panic: g_eli_key_hold: sc_ekeys_total=1 cpuid = 0 KDB: enter: panic Uptime: 2m16s Physical memory: 1974 MB Dumping 213 MB: 198 182 166 150 134 118 102 86 70 54 38 22 6 [...] Could you provide the output of: # diskinfo -v /dev/ada0s1b And could you try: # /sbin/geli onetime -l 256 -s 4096 /dev/ada0s1b -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com pgp1PdPS9g7QC.pgp Description: PGP signature
Re: Any success stories for HAST + ZFS?
On Thu, Mar 24, 2011 at 01:36:32PM -0700, Freddie Cash wrote: [Not sure which list is most appropriate since it's using HAST + ZFS on -RELEASE, -STABLE, and -CURRENT. Feel free to trim the CC: on replies.] I'm having a hell of a time making this work on real hardware, and am not ruling out hardware issues as yet, but wanted to get some reassurance that someone out there is using this combination (FreeBSD + HAST + ZFS) successfully, without kernel panics, without core dumps, without deadlocks, without issues, etc. I need to know I'm not chasing a dead rabbit. I just committed a fix for a problem that might look like a deadlock. With trociny@ patch and my last fix (to GEOM GATE and hastd) do you still have any issues? -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com pgpfaqPYEbyOO.pgp Description: PGP signature
Re: Any success stories for HAST + ZFS?
On Thu, Mar 24, 2011 at 01:36:32PM -0700, Freddie Cash wrote: I've tried with FreeBSD 8.2-RELEASE, 8-STABLE, 8-STABLE w/ZFSv28 patches, and 9-CURRENT (after the ZFSv28 commit). Things work well until I start hastd. Then either the system locks up, or hastd causes a kernel panic, or hastd dumps core. The minimum amount of information (as always) would be backtrace from the kernel and also hastd backtrace when it coredumps. There is really decent logging in hast, so I'm also sure it does log something interesting on primary or secondary. Another useful thing would be to turn on debugging in hast (single -d option for hastd). The best you can do is to give me the simplest and quickest procedure to reproduce the issue, eg. configure two hast resources, put ZFS mirror on top, start rsync /usr/src to the file system on top of hast and switch roles. The simpler the better. -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com pgpYcvgL105vI.pgp Description: PGP signature
Re: missing files in readdir(3) on NFS export of ZFS volume (since v28?)
On Mon, Mar 07, 2011 at 01:08:46AM +0100, Pierre Beyssac wrote: Hello, I'm running a 9-current server as compiled on Sat Mar 5 02:17:14 CET 2011. Since I upgraded to ZFS v28 I noticed missing files from NFS. The files are still accessible through NFS but they don't show up on a readdir(3). [...] Could you try r219404? -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com pgpeiqDGOvkQL.pgp Description: PGP signature
Re: [head tinderbox] failure on ia64/ia64
On Mon, Mar 07, 2011 at 01:06:11AM +, FreeBSD Tinderbox wrote: TB --- 2011-03-07 00:25:55 - tinderbox 2.6 running on freebsd-current.sentex.ca TB --- 2011-03-07 00:25:55 - starting HEAD tinderbox run for ia64/ia64 TB --- 2011-03-07 00:25:55 - cleaning the object tree TB --- 2011-03-07 00:26:06 - cvsupping the source tree TB --- 2011-03-07 00:26:06 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/HEAD/ia64/ia64/supfile TB --- 2011-03-07 00:26:19 - building world TB --- 2011-03-07 00:26:19 - MAKEOBJDIRPREFIX=/obj TB --- 2011-03-07 00:26:19 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2011-03-07 00:26:19 - TARGET=ia64 TB --- 2011-03-07 00:26:19 - TARGET_ARCH=ia64 TB --- 2011-03-07 00:26:19 - TZ=UTC TB --- 2011-03-07 00:26:19 - __MAKE_CONF=/dev/null TB --- 2011-03-07 00:26:19 - cd /src TB --- 2011-03-07 00:26:19 - /usr/bin/make -B buildworld World build started on Mon Mar 7 00:26:20 UTC 2011 Rebuilding the temporary build tree stage 1.1: legacy release compatibility shims stage 1.2: bootstrap tools stage 2.1: cleaning up the object tree stage 2.2: rebuilding the object tree stage 2.3: build tools stage 3: cross tools stage 4.1: building includes stage 4.2: building libraries stage 4.3: make dependencies [...] mkdep -f .depend -a /src/sbin/growfs/growfs.c echo growfs: /obj/ia64.ia64/src/tmp/usr/lib/libc.a .depend === sbin/gvinum (depend) rm -f .depend mkdep -f .depend -a-I/src/sbin/gvinum/../../sys /src/sbin/gvinum/gvinum.c /src/sbin/gvinum/../../sys/geom/vinum/geom_vinum_share.c echo gvinum: /obj/ia64.ia64/src/tmp/usr/lib/libc.a /obj/ia64.ia64/src/tmp/usr/lib/libreadline.a /obj/ia64.ia64/src/tmp/usr/lib/libtermcap.a /obj/ia64.ia64/src/tmp/usr/lib/libdevstat.a /obj/ia64.ia64/src/tmp/usr/lib/libkvm.a /obj/ia64.ia64/src/tmp/usr/lib/libgeom.a .depend === sbin/hastctl (depend) make: don't know how to make hast_compression.c. Stop *** Error code 2 Interesting race. hast_compression.c was added in the same commit it was added to hastctl Makefile. 
-- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com
Re: HEADS UP: ZFSv28 is in!
On Mon, Feb 28, 2011 at 08:34:08AM +0100, Martin Sugioarto wrote: PS. If you like my work, you help me to promote yomoli.com:) http://yomoli.com http://www.facebook.com/pages/Yomolicom/178311095544155 I would like, but you should at least tell me what it is (what will be sold there). I don't like to advertise things I don't know or even things that seem evil to me. I'll post your answer to a well-known German *BSD forum, if you want. Well, I didn't want to say too much about it here, as it isn't really related to FreeBSD. This is a startup I'm working on which is location-based chat, which allows users to communicate with their neighborhood. -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com pgpe1gJOLMeSe.pgp Description: PGP signature
Re: HEADS UP: ZFSv28 is in!
On Sun, Feb 27, 2011 at 04:03:01PM -0700, Shawn Webb wrote: I'm so excited for your work. Thanks so much for bringing zpool v28 to FreeBSD. Will v28 come to 8-stable? Yes, hopefully in 1-2 month(s). -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com pgp1UOEA9rzOR.pgp Description: PGP signature
Re: HEADS UP: ZFSv28 is in!
On Mon, Feb 28, 2011 at 10:37:25AM +, krad wrote: On 28 February 2011 08:47, Pawel Jakub Dawidek p...@freebsd.org wrote: On Sun, Feb 27, 2011 at 04:03:01PM -0700, Shawn Webb wrote: I'm so excited for your work. Thanks so much for bringing zpool v28 to FreeBSD. Will v28 come to 8-stable? Yes, hopefully in 1-2 month(s). -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com ive never managed to be able to boot off my 4k aligned pool (ashift=12) on stable, does the import to head provide all the patches for this or is it a case of using the latest zfs v28 patch set for stable? I have no dying need for v28 yet, it just want to be able to boot onto the 4k drive and tidy things up. Support for this is included in what I committed to HEAD. Even HEAD couldn't boot off of pools with ashift != 9 until now. -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com pgpoBcg2ska7K.pgp Description: PGP signature
HEADS UP: ZFSv28 is in!
Hi.

I just committed ZFSv28 to HEAD.

New major features:

- Data deduplication.
- Triple parity RAIDZ (RAIDZ3).
- zfs diff.
- zpool split.
- Snapshot holds.
- zpool import -F. Allows rewinding a corrupted pool to an earlier transaction group.
- Possibility to import a pool in read-only mode.

PS. If you like my work, you can help me promote yomoli.com:) http://yomoli.com http://www.facebook.com/pages/Yomolicom/178311095544155

-- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com
Re: [PATCH] OpenSolaris/ZFS: C++ compatibility
On Fri, Feb 04, 2011 at 11:03:53AM -0700, Justin T. Gibbs wrote:
> The attached patch is sufficient to allow a C++ program to use libzfs. The motivation for these changes is work I'm doing on a ZFS fault handling daemon that is written in C++. SpectraLogic's intention is to return this work to the FreeBSD project once it is a bit more complete. Since these changes modify files that come from OpenSolaris, I want to be sure I understand the project's policies regarding divergence from the vendor before I check them in. All of the changes save one should be trivial to merge with vendor changes and I will do that work for the v28 import. Is there any reason I should not commit these changes?

Now that OpenSolaris is dead we don't have to be so strict about keeping the diff against the vendor small at all costs. I'd still prefer not to modify vendor code whenever possible, so it is easier for us to cooperate with IllumOS (we already took some code from them).

My company and I are also interested in a fault management daemon (although not restricted to ZFS, but a more general purpose mechanism like FMA in Solaris). My question would be: is there any chance you could be convinced to use plain C? With C we might be able to help, but not with C++.

-- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
Re: [PATCH] OpenSolaris/ZFS: C++ compatibility
On Sat, Feb 05, 2011 at 02:36:40PM -0700, Justin T. Gibbs wrote:
> On 2/5/2011 8:39 AM, Pawel Jakub Dawidek wrote:
> > On Fri, Feb 04, 2011 at 11:03:53AM -0700, Justin T. Gibbs wrote:
> > > The attached patch is sufficient to allow a C++ program to use libzfs. The motivation for these changes is work I'm doing on a ZFS fault handling daemon that is written in C++. SpectraLogic's intention is to return this work to the FreeBSD project once it is a bit more complete. Since these changes modify files that come from OpenSolaris, I want to be sure I understand the project's policies regarding divergence from the vendor before I check them in. All of the changes save one should be trivial to merge with vendor changes and I will do that work for the v28 import. Is there any reason I should not commit these changes?
> >
> > Now that OpenSolaris is dead we don't have to be so strict with keeping the diff against vendor small at all cost. I'd prefer not to modify vendor code whenever possible so it is easier for us to cooperate with IllumOS (we already took some code from them).
>
> Perhaps IllumOS will accept these changes back? As I mentioned in the change descriptions included with the patch, the header files already show the intention of providing C++ support (extern "C" blocks), they just don't quite deliver. The changes shouldn't be controversial.

Sure. To be clear: I'm not against those changes, I think they are worth it. And getting IllumOS to accept them back is definitely a good idea.

> > Me and my company are also interested in fault management daemon (although not restricted to ZFS, but a more general purpose mechanism like FMA in Solaris).
>
> We have talked internally about this at Spectra too. Since we don't have BSD licensed nvpair code, we've thought of using Google protocol buffers to allow extensible encoding of fault data. The GP implementation is MIT licensed and looks like it might be less cumbersome to use than nvpairs. For the first release of our product, however, we are just making do with the string data that devctl provides.

I developed a similar API during the HAST work, maybe it is a good starting point? src/sbin/hastd/nv.{c,h}.

> > My question would be are there any chances you may be convinced to use plain C? With C we might be able to help, but not with C++.
>
> The core FMA support needs to be reasonably accessible from C code of course (fully functional and not cumbersome to use). But we should allow FMA agents to be coded in whatever language is convenient to the developer. The project may only be able to accept agents in C (and I'm voting for C++ too) into its distribution, but that policy should not drive us to make the FMA architecture hard to access from shell, python, ruby, or some other language.

Yes, agents should not be limited to one language. I wouldn't be surprised if the majority of agents turn out to be shell scripts.

> The reason I chose C++ for this task is that devd, the source of the events I process, already requires C++ so using C++ in zfsd doesn't impose any new requirements on the system. Zfsd, like even the C kernel of FreeBSD, is coded in an object oriented fashion, but it's much cleaner to implement this type of design in a language that inherently supports object oriented concepts. Could I rewrite all that I have in C? Sure, but there would have to be some compelling reasons to offset the reduction in clarity and maintainability such a change would cause.

Hmm, so zfsd will receive events from devd? I'm of the opinion that we should leave devd alone. In my initial port I used devd, because it was the closest match, but if we want to clean it up, we shouldn't go through devd. For example ZFS v28 can report whole binary blocks where the checksum doesn't match, and passing those through devd would be cumbersome.

> Is your inability to help on a C++ version of this code due to distaste for C++ or just a lack of experience with it?

The latter. I'm sure there are many committers that are fluent in C++, but all of them know C. I was under the impression that Warner implemented devd in C++ also as a kind of experiment, which nobody really followed.

-- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
Re: Replacing a failed disk in raidz2 zfs (and gpt)
On Thu, Feb 03, 2011 at 06:11:34AM +, Philip M. Gollucci wrote: All, I have a zroot(mirror)+zmysql(raidz2) setup on a MySQL db box. One drive failed (mfid3). We've since replaced it. I can't for the life of me get zpool to replace it. I can't remember why I used gpt instead of direct disks for the zmysql pool (but thats how it is). I've tried all of the following commands with different errors, and I must say I'm stumped. I've done this several times before for the ASF (but no gpt at play there). $ zpool scrub zmysql just runs, and completes, no error $ zpool replace zmysql gpt/disk3 cannot replace gpt/disk3 with gpt/disk3: one or more devices is currently unavailable [...] $ zpool offline zmysql gpt/disk3 cannot offline gpt/disk3: no valid replicas I'm afraid this is ZFS bug that is fixed in v28 for sure, not sure about v14/v15. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgpvKfbSsGxHk.pgp Description: PGP signature
Re: Replacing a failed disk in raidz2 zfs (and gpt)
On Thu, Feb 03, 2011 at 07:52:52PM +, Philip M. Gollucci wrote: Do you have a bug ID ? I think it is 6328632. Change 5a60f16123ba. Note, there are many, many other unrelated changes. Do you have any work arounds? From what I can see, this change is in HEAD already, so I'll try that. Will a reboot help ? No idea, sorry. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgpEXAC6VatmN.pgp Description: PGP signature
Re: Replacing a failed disk in raidz2 zfs (and gpt)
On Thu, Feb 03, 2011 at 08:08:15PM +, Philip M. Gollucci wrote: On 02/03/11 20:02, Pawel Jakub Dawidek wrote: On Thu, Feb 03, 2011 at 07:52:52PM +, Philip M. Gollucci wrote: Do you have a bug ID ? I think it is 6328632. Change 5a60f16123ba. Note, there are many, many other unrelated changes. Do you have any work arounds? From what I can see, this change is in HEAD already, so I'll try that. Do you have a pointer to how to get the hg repo handy. There's no diff there. The repo is still online: ssh://a...@hg.opensolaris.org/hg/onnv/onnv-gate But if you are thinking about extracting only part of the change responsible for your problem that might not be easy. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgpmkyX9M3bLW.pgp Description: PGP signature
Re: [head tinderbox] failure on ia64/ia64
On Mon, Jan 31, 2011 at 04:56:06PM -0800, Marcel Moolenaar wrote:
> On Jan 31, 2011, at 3:51 PM, Pawel Jakub Dawidek wrote:
> > On Mon, Jan 31, 2011 at 10:56:18PM +, FreeBSD Tinderbox wrote:
> > > [...]
> > > cc -O2 -pipe -I/src/sbin/hastctl/../hastd -DINET -DINET6 -DYY_NO_UNPUT -DYY_NO_INPUT -DHAVE_CRYPTO -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wunused-parameter -Wcast-align -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wold-style-definition -Wno-pointer-sign -c /src/sbin/hastctl/../hastd/proto_common.c
> > > cc1: warnings being treated as errors
> > > /src/sbin/hastctl/../hastd/proto_common.c: In function 'proto_common_descriptor_send':
> > > /src/sbin/hastctl/../hastd/proto_common.c:116: warning: cast increases required alignment of target type
> > > /src/sbin/hastctl/../hastd/proto_common.c: In function 'proto_common_descriptor_recv':
> > > /src/sbin/hastctl/../hastd/proto_common.c:146: warning: cast increases required alignment of target type
> > > /src/sbin/hastctl/../hastd/proto_common.c:149: warning: cast increases required alignment of target type
> > > *** Error code 1
> >
> > Marcel, do you have an idea how one can use CMSG_NXTHDR() on ia64 with high WARNS? With WARNS=6 I get those errors and I've no idea how to fix it properly. If there is a fix, CMSG_NXTHDR() should probably be fixed, but maybe I'm wrong?
>
> this warning indicates that you're casting from a pointer to type P (P having alignment constraints Ap) to a pointer to type Q (Q having alignment constraints Aq), and Aq > Ap. The compiler tells you that you may end up with misaligned accesses.
>
> If you know that the pointer satisfies Aq, you can cast through (void *) to silence the compiler. If you cannot guarantee that, you have a bigger problem. Solutions include packing type Q to reduce Aq or to copy the data to a local variable.
>
> Take the statement at line 116 for example:
>
>     *((int *)CMSG_DATA(cmsg)) = fd;
>
> We're effectively casting from a (char *) to a (int *) and then doing a 32-bit access (write). The easy fix (casting through (void *)) is not possible, because you cannot guarantee that the address is properly aligned. cmsg points to memory set aside by the following local variable:
>
>     unsigned char ctrl[CMSG_SPACE(sizeof(fd))];
>
> There's no guarantee that the compiler will align the character array at a 32-bit boundary (though in practice it seems to be). I have seen this kind of construct fail on ARM and PowerPC for example.
>
> In any case: The safest approach here is to use le32enc or be32enc rather than casting through (void *). Obviously these functions encode using a fixed byte order when the original code is using the native byte order of the CPU. Having native encoding functions would help.
>
> You could use bcopy as well, but the compiler is typically too smart for its own good and it will try to optimize the call away. This leaves you with the same misaligned access that you tried to avoid by using bcopy(). You need to trick the compiler so that it won't optimize the bcopy away, like:
>
>     bcopy((void *)&fd, CMSG_DATA(cmsg), sizeof(fd));

Interesting. I did use bcopy() to silence the warning, but the need to cast to (void *) is surprising. Still, I'm more concerned with the CMSG_NXTHDR() macro, which from what I see might not be fixed by casting arguments.

-- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
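As an illustration of the "copy instead of cast" approach discussed above, here is a minimal sketch of passing a descriptor over a UNIX-domain socket without any (int *) cast on CMSG_DATA(). It is not the proto_common.c code: fd_send() and fd_recv() are hypothetical names, and two choices here are assumptions of this sketch rather than anything from the thread, namely using memcpy() in place of bcopy() and wrapping the control buffer in a union with struct cmsghdr to force its alignment. It does not address the CMSG_NXTHDR() question.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Control buffer; the union gives it the alignment of struct cmsghdr. */
union fd_ctrl {
	struct cmsghdr hdr;
	unsigned char buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over sock.  Returns 0 on success, -1 on error. */
int
fd_send(int sock, int fd)
{
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	char dummy = 0;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &dummy;		/* one byte of ordinary data */
	iov.iov_len = sizeof(dummy);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(fd));
	/* Byte-wise copy: no int-sized store through a cast pointer. */
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return (sendmsg(sock, &msg, 0) == -1 ? -1 : 0);
}

/* Receive a descriptor from sock into *fdp.  Returns 0 or -1. */
int
fd_recv(int sock, int *fdp)
{
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	char dummy;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &dummy;
	iov.iov_len = sizeof(dummy);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) == -1)
		return (-1);
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return (-1);
	/* Copy out byte-wise for the same alignment reason as above. */
	memcpy(fdp, CMSG_DATA(cmsg), sizeof(*fdp));
	return (0);
}
```

Because CMSG_DATA() yields an unsigned char pointer, the copy makes no promise about int alignment, which is the property the cast in the original statement at line 116 could not guarantee. Whether a given compiler at -Wcast-align stays quiet for the macros themselves is platform-dependent and is not something this sketch settles.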
Re: [head tinderbox] failure on ia64/ia64
On Mon, Jan 31, 2011 at 10:56:18PM +, FreeBSD Tinderbox wrote: [...] cc -O2 -pipe -I/src/sbin/hastctl/../hastd -DINET -DINET6 -DYY_NO_UNPUT -DYY_NO_INPUT -DHAVE_CRYPTO -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wunused-parameter -Wcast-align -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wold-style-definition -Wno-pointer-sign -c /src/sbin/hastctl/../hastd/proto_common.c cc1: warnings being treated as errors /src/sbin/hastctl/../hastd/proto_common.c: In function 'proto_common_descriptor_send': /src/sbin/hastctl/../hastd/proto_common.c:116: warning: cast increases required alignment of target type /src/sbin/hastctl/../hastd/proto_common.c: In function 'proto_common_descriptor_recv': /src/sbin/hastctl/../hastd/proto_common.c:146: warning: cast increases required alignment of target type /src/sbin/hastctl/../hastd/proto_common.c:149: warning: cast increases required alignment of target type *** Error code 1 Marcel, do you have an idea how one can use CMSG_NXTHDR() on ia64 with high WARNS? With WARNS=6 I get those errors and I've no idea how to fix it properly. If there is a fix, CMSG_NXTHDR() should probably be fixed, but maybe I'm wrong? -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgphFUx7Q3q4K.pgp Description: PGP signature
Re: My ZFS v28 Testing Experience
On Wed, Jan 12, 2011 at 11:03:19PM -0400, Chris Forgeron wrote: I've been testing out the v28 patch code for a month now, and I've yet to report any real issues other than what is mentioned below. I'll detail some of the things I've tested, hopefully the stability of v28 in FreeBSD will convince others to give it a try so the final release of v28 will be as solid as possible. I've been using FreeBSD 9.0-CURRENT as of Dec 12th, and 8.2PRE as of Dec 16th What's worked well: - I've made and destroyed small raidz's (3-5 disks), large 26 disk raid-10's, and a large 20 disk raid-50. - I've upgraded from v15, zfs 4, no issues on the different arrays noted above - I've confirmed that a v15 or v28 pool will import into Solaris 11 Express, and vice versa, with the exception about dual log or cache devices noted below. - I've run many TB of data through the ZFS storage via benchmarks from my VM's connected via NFS, to simple copies inside the same pool, or copies from one pool to another. - I've tested pretty much every compression level, and changing them as I tweak my setup and try to find the best blend. - I've added and subtracted many a log and cache device, some in failed states from hot-removals, and the pools always stayed intact. Thank you very much for all your testing, that's really a valuable contribution. I'll be happy to work with you on tracking down the bottleneck in ZFSv28. Issues: - Import of pools with multiple cache or log devices. (May be a very minor point) A v28 pool created in Solaris 11 Express with 2 or more log devices, or 2 or more cache devices won't import in FreeBSD 9. This also applies to a pool that is created in FreeBSD, is imported in Solaris to have the 2 log devices added there, then exported and attempted to be imported back in FreeBSD. No errors, zpool import just hangs forever. If I reboot into Solaris, import the pool, remove the dual devices, then reboot into FreeBSD, I can then import the pool without issue. 
A single cache, or log device will import just fine. Unfortunately I deleted my witness-enabled FreeBSD-9 drive, so I can't easily fire it back up to give more debug info. I'm hoping some kind soul will attempt this type of transaction and report more detail to the list. Note - I just decided to try adding 2 cache devices to a raidz pool in FreeBSD, export, and then importing, all without rebooting. That seems to work. BUT - As soon as you try to reboot FreeBSD with this pool staying active, it hangs on boot. Booting into Solaris, removing the 2 cache devices, then booting back into FreeBSD then works. Something is kept in memory between exporting then importing that allows this to work. Unfortunately I'm unable to reproduce this. It works for me with 2 cache and 2 log vdevs. I tried to reboot, etc. My test exactly looks like this: # zpool create tank raidz ada0 ada1 # zpool add tank cache ada0 ada1 # zpool export tank # kldunload zfs # zpool import tank works # reboot works - Speed. (More of an issue, but what do we do?) Wow, it's much slower than Solaris 11 Express for transactions. I do understand that Solaris will have a slight advantage over any port of ZFS. All of my speed tests are made with a kernel without debug, and yes, these are -CURRENT and -PRE releases, but the speed difference is very large. Before we go any further could you please confirm that you commented out this line in sys/modules/zfs/Makefile: CFLAGS+=-DDEBUG=1 This turns all kind of ZFS debugging and slows it down a lot, but for the correctness testing is invaluable. This will be turned off once we import ZFS into FreeBSD-CURRENT. BTW. In my testing Solaris 11 Express is much, much slower than FreeBSD/ZFSv28. And by much I mean two or more times in some tests. I was wondering if they have some debug turned on in Express. 
At first, I thought it may be more of an issue with the ix0/Intel X520DA2 10Gbe drivers that I'm using, since the bulk of my tests are over NFS (I'm going to use this as a SAN via NFS, so I test in that environment). But - I did a raw cp command from one pool to another of several TB. I executed the same command under FreeBSD as I did under Solaris 11 Express. When executed in FreeBSD, the copy took 36 hours. With a fresh destination pool of the same settings/compression/etc under Solaris, the copy took 7.5 hours. When you turn off compression (because it turns all-zero blocks into holes) you can test it by simply: # dd if=/dev/zero of=/zfs_fs/zero bs=1m -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgprFLLYTe9F4.pgp Description: PGP signature
Re: Next ZFSv28 patchset ready for testing.
On Wed, Dec 15, 2010 at 10:15:40AM +0200, Andrei Kolu wrote: 2010/12/14 Pawel Jakub Dawidek p...@freebsd.org On Mon, Dec 13, 2010 at 10:45:56PM +0100, Pawel Jakub Dawidek wrote: Hi. The new patchset is ready for testing: http://people.freebsd.org/~pjd/patches/zfs_20101212.patch.bz2 You can also download the whole source tree already patched from here: http://people.freebsd.org/~pjd/zfs_20101212.tbz # uname -a FreeBSD freebsd9.raidon.eu 9.0-CURRENT FreeBSD 9.0-CURRENT #0: Tue Dec 14 14:37:01 EET 2010 r...@freebsd9.raidon.eu:/usr/obj/usr/src/sys/GENERIC amd64 Create files filled with zeroes: # mkfile 512m disk1 disk2 disk3 disk4 # zpool create andmed raidz /home/antik/disk{1,2,3,4} # zpool status andmed pool: andmed state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM andmed ONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 /home/antik/disk1 ONLINE 0 0 0 /home/antik/disk2 ONLINE 0 0 0 /home/antik/disk3 ONLINE 0 0 0 /home/antik/disk4 ONLINE 0 0 0 errors: No known data errors Now let's try to scrub: # zpool scrub andmed Fatal trap 12: page fault while in kernel mode cpuid = 1; apic id = 01 fault virtual address = 0x1fb8007b fault code = supervisor read data, page not present instruction pointer = 0x20:0x812967d2 stack pointer = 0x20:0xff80ee605548 frame pointer = 0x28:0xff80ee605730 code segment = base 0x0, limit 0xf, type 0x1b = DPL 0, pres1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 2081 (initial thread) [ thread pid 2081 tid 100121 ] Stopped at vdev_file_open+0x92: testb $0x20,0x7b(%rax) Could you verify if this patch fixes the problem for you? http://people.freebsd.org/~pjd/patches/vdev_file.c.2.patch -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgplp1JmNuuvJ.pgp Description: PGP signature
Re: Next ZFSv28 patchset ready for testing.
On Fri, Dec 17, 2010 at 12:54:36AM +0300, Rechistov Grigory (Речистов Григорий) wrote: I started to check the new ZFS version inside a VirtualBox machine. So far it works for me without crashes, but I got some observations worth mentioning. Here are the steps I made: 1. Installed 8.1-RELEASE (from minimal install CD) 2. Csup'ped sources to CURRENT (as of 14/12/2010) [note that I haven't used SVN repository] 3. Applied the patch in question. 4. Created a zpool raidz of two disks of old version 15. Also some usual tuning of ZFS in loader.conf was done as I am running 32 bit version with low amount of memory. zfs_enable=YES in rc.conf was added too. 4.1 Moved /usr/ports to ZFS to have some files on it. 5. Make buildworld, buildkernel, installkernel, installworld - all the canonical steps from the Handbook. 6. After reboot to final 9.0-CURRENT world I got a dmesg with some trace stack related to ZFS and also a rc.d script message about unrecognized command 'volinit' (see the text of it in attachment). This one is because mergemaster(8) skips files with the same $FreeBSD$ value, so you need to copy /usr/src/etc/rc.d/zvol to /etc/rc.d/ by hand. 7. Nevertheless the system booted. Files 8. `zpool upgrade -a` worked all right and reported that now I have ZFS version 28 Overall I am pleasantly surprised how streamlined the whole process was. That's good to hear, thanks. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgp5O7SANNIX6.pgp Description: PGP signature
Re: Next ZFSv28 patchset ready for testing.
On Wed, Dec 15, 2010 at 10:15:00PM -0500, ben wilber wrote: On Mon, Dec 13, 2010 at 10:45:56PM +0100, Pawel Jakub Dawidek wrote: Hi. The new patchset is ready for testing: Running fine for 24 hours now under load with a ~50 disk v15 (not upgraded) pool from -CURRENT. Thanks! Only strange thing is the rc script complains: /etc/rc: DEBUG: run_rc_command: doit: zvol_start unrecognized command 'volinit' usage: zfs command args ... Did you run mergemaster(8) after the upgrade? The patch includes change to etc/rc.d/zvol to remove 'zfs volinit'/'zfs volfini' which are no longer available. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgp7c4gzudIbP.pgp Description: PGP signature
Re: Next ZFSv28 patchset ready for testing.
On Tue, Dec 14, 2010 at 03:20:05PM +0100, Olivier Smedts wrote: make installworld That's what I wanted to do, and why I rebooted single-user on the new kernel. But isn't the v13-v15 userland supposed to work with the v28 kernel ? Yes, it is supposed to work, precisely so that the common FreeBSD upgrade path can be followed. Martin was working on this (CCed). -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
Next ZFSv28 patchset ready for testing.
Hi. The new patchset is ready for testing: http://people.freebsd.org/~pjd/patches/zfs_20101212.patch.bz2
When applying the patch be sure to use the correct options for patch(1)!:
	# cd /usr/src
	# fetch http://people.freebsd.org/~pjd/patches/zfs_20101212.patch.bz2
	# bzip2 -d zfs_20101212.patch.bz2
	# patch -E -p0 < zfs_20101212.patch
The patch is against FreeBSD HEAD as of 2010-12-12. Some of the changes since the last patchset (zfs_20100831.patch):
- Boot support for ZFS v28 (only RAIDZ3 is not yet supported).
- Various fixes for the existing ZFS boot code.
- Support for sendfile(2) (by avg@).
- Userland-kernel compatibility with v13-v15 (by mm@).
- ACL fixes (by trasz@).
- Various bug fixes.
Please test, test, test. Chances are this is the last patchset before v28 goes to HEAD (finally). Especially test the new changes, like boot support and sendfile(2) support. Also be sure to verify that you can import your existing ZFS pools (v13-v15) when running v28, and that you can boot from your existing pools. Enjoy! PS. Martin (mm@) will be providing a patch against 8-STABLE soon. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
Re: Next ZFSv28 patchset ready for testing.
On Mon, Dec 13, 2010 at 10:45:56PM +0100, Pawel Jakub Dawidek wrote: Hi. The new patchset is ready for testing: http://people.freebsd.org/~pjd/patches/zfs_20101212.patch.bz2 When applying the patch be sure to use correct options for patch(1)!:
	# cd /usr/src
	# fetch http://people.freebsd.org/~pjd/patches/zfs_20101212.patch.bz2
	# bzip2 -d zfs_20101212.patch.bz2
	# patch -E -p0 < zfs_20101212.patch
[...]
If patch(1) reports a reject of the sys/cddl/compat/opensolaris/sys/sysmacros.h file, or you see the following error while compiling world:
	/usr/src/cddl/usr.bin/ctfconvert/../../../cddl/contrib/opensolaris/tools/ctf/cvt/strtab.c:249: undefined reference to `MIN'
	strtab.o(.text+0x28d): In function `strtab_insert':
	/usr/src/cddl/usr.bin/ctfconvert/../../../cddl/contrib/opensolaris/tools/ctf/cvt/strtab.c:119: undefined reference to `MIN'
	strtab.o(.text+0x3a1):/usr/src/cddl/usr.bin/ctfconvert/../../../cddl/contrib/opensolaris/tools/ctf/cvt/strtab.c:145: undefined reference to `MIN'
	*** Error code 1
Simply remove the sys/cddl/compat/opensolaris/sys/sysmacros.h file from the tree. Unfortunately the patch can work either on source downloaded via cvsup or on source downloaded via subversion, but not both, as those two have different $FreeBSD$ id strings (at least in the case of this file). The patch is generated from subversion source, so if you use cvsup, you will most likely see the reject and the error. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
Re: Next ZFSv28 patchset ready for testing.
On Mon, Dec 13, 2010 at 10:45:56PM +0100, Pawel Jakub Dawidek wrote: Hi. The new patchset is ready for testing: http://people.freebsd.org/~pjd/patches/zfs_20101212.patch.bz2 You can also download the whole source tree already patched from here: http://people.freebsd.org/~pjd/zfs_20101212.tbz -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgpJ41aQDwAYd.pgp Description: PGP signature
Re: Next ZFSv28 patchset ready for testing.
On Mon, Dec 13, 2010 at 11:00:31PM -, Steven Hartland wrote: What's the expected behaviour for the sendfile changes as sendfile is one of the problems we have here with the double memory allocation required for it under ZFS compared to UFS. Does this patch address that? No. The patch doesn't address that. It only adds support for sendfile(2), as it was commented out in the previous patchset. Inspecting the patch the following segment looks odd:-
	--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c.orig
	+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
	...
	 	while (n > 0) {
	 		nbytes = MIN(n, zfs_read_chunk_size -
	 		    P2PHASE(uio->uio_loffset, zfs_read_chunk_size));
	+#ifdef __FreeBSD__
	+		if (uio->uio_segflg == UIO_NOCOPY)
	+			error = mappedread_sf(vp, nbytes, uio);
	+		else
	+#endif /* __FreeBSD__ */
	 		if (vn_has_cached_data(vp))
	 			error = mappedread(vp, nbytes, uio);
	 		else
Is there an extra else in there which will break things or should the __FreeBSD__ mappedread_sf block replace the standard mappedread call or is the indentation just a bit weird? The code is correct. It is just hard to split 'else' and 'if' with a '#endif' and keep the indentation pretty. Depending on the conditions, we use one of three methods to read the data. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
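To see why the 'else' followed by '#endif' and 'if' still forms a single three-way chain, here is a small stand-alone sketch. The macro SKETCH_FREEBSD stands in for __FreeBSD__ so it builds anywhere, and pick_read_path() with its enum values are invented names that only label the branches; the third branch is called READ_DEFAULT because the quoted hunk stops at the final 'else'.

```c
#include <assert.h>

#define SKETCH_FREEBSD 1	/* stand-in for __FreeBSD__ (assumption for the demo) */

enum read_path { READ_SENDFILE, READ_MAPPED, READ_DEFAULT };

/*
 * Mirrors the shape of the quoted hunk: when the macro is defined the
 * preprocessor leaves "if ... else if ... else"; when it is not, only
 * "if ... else" remains.  Either way the chain is well-formed.
 */
enum read_path
pick_read_path(int nocopy, int has_cached_data)
{
	enum read_path path;

	(void)nocopy;	/* unused when SKETCH_FREEBSD is not defined */
#ifdef SKETCH_FREEBSD
	if (nocopy)
		path = READ_SENDFILE;	/* mappedread_sf() in the patch */
	else
#endif /* SKETCH_FREEBSD */
	if (has_cached_data)
		path = READ_MAPPED;	/* mappedread() in the patch */
	else
		path = READ_DEFAULT;	/* the branch after the final else */
	return (path);
}

/* Small self-check covering all three branches. */
int
pick_read_path_selfcheck(void)
{
	assert(pick_read_path(1, 1) == READ_SENDFILE);
	assert(pick_read_path(0, 1) == READ_MAPPED);
	assert(pick_read_path(0, 0) == READ_DEFAULT);
	return (1);
}
```

The odd-looking indentation in the patch comes from the same layout: the second 'if' is the body of the first 'else', but it has to sit outside the conditional block so it survives when the macro is undefined.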
Re: taskqueue_create() name parameter lieftime
On Tue, Nov 16, 2010 at 08:27:11AM -0500, John Baldwin wrote: On Tuesday, November 16, 2010 7:20:47 am Andriy Gapon wrote: taskqueue_create() documentation never explicitly says this, but current taskqueue_create() implementation just stores a 'name' pointer parameter internally. Thus it depends on the 'name' having a life time encompassing that of the taskqueue. I think that alternatively we could have copied the name (or a portion of it) into an internal buffer. I don't have any argument for either approach, just curious which one looks more preferable from general (FreeBSD, kernel) programming practices point of view. Hmm, in many other places we store a separate copy (e.g. all the interrupt code uses separate MAXCOMLEN char arrays to hold names). If that is easy to do, that is probably the best approach. The most friendly API would keep the name internally, but would also allow me to provide the name in printf-like format, so I don't have to use sprintf()/snprintf() before calling it. Unfortunately this would change the taskqueue API, as name is the first argument, which makes it not worth the pain. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
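A userland sketch of the "keep a private copy, accept a printf-like format" idea from this exchange. Everything here is hypothetical: struct tq_sketch, tq_sketch_setname() and the TQ_NAMELEN size are illustrative names, not the taskqueue(9) API, and the kernel would use its own formatting routines.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define TQ_NAMELEN 32	/* hypothetical size; the thread mentions MAXCOMLEN arrays */

struct tq_sketch {
	char name[TQ_NAMELEN];	/* private copy, independent of the caller's buffer */
};

/* Format the name straight into the queue's own storage, truncating if needed. */
void
tq_sketch_setname(struct tq_sketch *tq, const char *fmt, ...)
{
	va_list ap;

	assert(tq != NULL && fmt != NULL);
	va_start(ap, fmt);
	(void)vsnprintf(tq->name, sizeof(tq->name), fmt, ap);
	va_end(ap);
}

/* Shows that the caller's buffer can be reused once the name is stored. */
int
tq_sketch_demo(void)
{
	struct tq_sketch tq;
	char scratch[16];

	snprintf(scratch, sizeof(scratch), "zfs");
	tq_sketch_setname(&tq, "%s_taskq_%d", scratch, 3);
	memset(scratch, 0, sizeof(scratch));	/* caller's copy is gone */
	return (strcmp(tq.name, "zfs_taskq_3") == 0);
}
```

With a stored pointer, the memset() above would have wiped the queue's name as well; with the copy, the lifetime of the caller's buffer no longer matters.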
Re: ZFS v28 is ready for wider testing.
On Wed, Nov 03, 2010 at 07:28:15PM +0100, Olivier Smedts wrote: http://people.freebsd.org/~pjd/patches/zfs_20100831.patch.bz2 Hello, Any status update on this ? I regularly check http://people.freebsd.org/~pjd/patches/ to see if there's an updated version of your patch. 2 months old is quite a bit for -CURRENT, which often receives commits on zfsco parts. Thanks for all your work on FreeBSD (not only ZFS). It took a while, but I should have something new shortly. I recently finished boot support for v28 (the most missing feature in the previous patch?) and will work on new patch soon. I'm heading to meetBSD California tomorrow and I'll be back in a week, so nothing will happen till then for sure. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgpPnD9csrFCZ.pgp Description: PGP signature
Re: letting glabel recognise a media change
On Mon, Oct 11, 2010 at 11:03:26AM -0400, John Baldwin wrote: With CD drives you are also rather stuck in that the existing ABI for controlling CD drives (e.g. ioctls in 3rd party software to eject a CD) are done on the /dev/cdX device. Ideally enclosures for removable media would be separate devices from the removable media itself, but a lot of existing software for CD's would break if this changes now. Right, but I still wonder if we could execute provider orphan and retaste on various events like media insertion or removal. If media is removed we orphan provider and recreate it, which will trigger retaste, and this is fine there will be nothing to read from or write to (we will simply return errors as we do now, I think). This way we nicely co-operate with GEOM, but also with other tools that don't require media to be present (if there is no media devfs entry still exists and handles ioctls, it just return errors on read requests). -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgp57kBd4EwFu.pgp Description: PGP signature
Re: letting glabel recognise a media change
On Thu, Sep 30, 2010 at 08:46:11PM +0300, Alexander Motin wrote: Andriy Gapon wrote: on 30/09/2010 01:28 Matthew Jacob said the following: If something like that was in place, I assure you that things would start to use it very quickly. I am not sure about this. Because, e.g. I don't see an easy way to know that media is changed in scsi_cd driver. That is, without polling. I don't consider polling to be an easy way for a number of reasons. SATA specification defines concept of Asynchronous Notification. It is already used by port multipliers to report about PHY events. It is also supposed to be used by CD drives to report media change. I haven't seen such devices yet, but hope they may appear sometimes. And even without AN support it would be nice to implement proper handling for SCSI UA - media changed errors within CAM. It still won't be perfect without using polling, but probably still something. I'd like to know the original reason why CD device is represented by GEOM provider and not CD media. For my naive thinking CD media should be GEOM provider that we taste once the media is inserted and orphan once the media is removed. I don't see any reasons for CD device to be useful GEOM provider, but maybe I'm overlooking something. Poul-Henning or Soren, do you remember who made and why this design choice? -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgpbCmI9YvYaB.pgp Description: PGP signature
Recent GELI additions.
Hi. I'd like to inform about three new features in GELI available in HEAD: 1. AES-XTS encryption. XTS mode is a standard that is recommended these days for storage encryption. This is the default now. AES-XTS support was also added to opencrypto framework and aesni(4) driver. 2. Multiple encryption keys. GELI will use one encryption key for at most 2^20 blocks (sectors), as it is not recommended to use the same encryption key for too much data. It generates keys array from the master key on attach and uses it accordingly. This is the default now. 3. Passphrase can now be loaded from a file (-J and -j options). -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgpKbX8P352EG.pgp Description: PGP signature
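As a rough illustration of item 2, the mapping from a byte offset to a slot in the keys array could look like the sketch below. It assumes only what the announcement states (one key per at most 2^20 sectors); the function names key_index() and key_count() are invented, and how GELI actually derives each key from the master key is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define KEY_SHIFT 20	/* 2^20 sectors per key, as stated in the announcement */

/* Which entry of the keys array covers the sector at this byte offset. */
uint64_t
key_index(uint64_t offset, uint32_t sectorsize)
{
	assert(sectorsize != 0);
	return ((offset / sectorsize) >> KEY_SHIFT);
}

/* How many keys a provider of the given size needs (rounded up). */
uint64_t
key_count(uint64_t mediasize, uint32_t sectorsize)
{
	uint64_t nsectors;

	assert(sectorsize != 0);
	nsectors = mediasize / sectorsize;
	return ((nsectors + (UINT64_C(1) << KEY_SHIFT) - 1) >> KEY_SHIFT);
}
```

Under these assumptions a provider with 512-byte sectors switches keys every 512 MiB, and a 1 TiB provider with 4096-byte sectors would need 256 keys.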
Re: gptboot rewrite, bootonce, etc.
On Mon, Sep 20, 2010 at 09:46:56AM +0100, krad wrote: does it work for zfs boot as that would be really nice if it did? No, it doesn't. ZFS works a bit differently. ZFS operates on pools, not really on partitions. One ZFS file system can span multiple disks/partitions. I'm not yet sure how to implement it so that it is intuitive, but I also haven't spent much time thinking about it. We needed UFS and that is what I implemented. It took me much more time than I expected anyway:) -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
Re: gptboot rewrite, bootonce, etc.
On Mon, Sep 20, 2010 at 01:17:38AM +0200, Oliver Pinter wrote: Hi PJD! Can you this patcheset release for 7-STABLE? I've no plans atm to port this work to 7-STABLE. I don't even have 7.x systems anymore. Not sure how boot code differs, maybe the patch will apply without modifications? No idea. I'd like to MFC this to 8-STABLE, though. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgp1EiZmlOSUJ.pgp Description: PGP signature
Re: gptboot rewrite, bootonce, etc.
On Sun, Sep 19, 2010 at 09:10:52PM +0400, Boris Samorodov wrote: Hi! On Sat, 18 Sep 2010 01:45:42 +0200 Pawel Jakub Dawidek wrote: My company was in need for functionality similar to nextboot(8), but on boot loader level, so we can have two partitions we boot from where one is known to be good and the other is used for upgrades. We upgrade by dd(1)ing entire partition image onto unused partition, we mark it as try-to-boot-from-it-but-only-once, reboot and if we fail to boot from the new partition, we fall back to the old, good partition. If we succeed on the other hand, we mark the new partition as our boot partition and mark the other one as unused. Well, how hard can it be? After around two weeks of work, I ended up rewriting gptboot in large parts, reorganizing a lot of code, improving and extending gpart a bit and implementing the desired functionality. Here is the patch for review and test: http://people.freebsd.org/~pjd/patches/gptboot.patch
Great! Since I need to have both i386 and amd64 at my box here are my test results:
-----
[~]b...@alya% uname -a
FreeBSD alya 9.0-CURRENT FreeBSD 9.0-CURRENT #1 r212758M: Sat Sep 18 16:13:38 MSD 2010 b...@alya:/space/FreeBSD/base/head/obj/space/FreeBSD/base/head/src/sys/ALYA amd64
[~]b...@alya% glabel status
                                      Name  Status  Components
gptid/c6053c9b-abcc-11df-b740-00251124aff4     N/A  ad4p1
                             label/9-amd64     N/A  ad4p2
                                label/swap     N/A  ad4p3
                               label/space     N/A  ad4p4
                              label/9-i386     N/A  ad4p5
[~]b...@alya% mount
/dev/label/9-amd64 on / (ufs, local)
devfs on /dev (devfs, local, multilabel)
/dev/label/space on /space (ufs, local)
/dev/md0 on /tmp (ufs, local, nosuid, soft-updates)
procfs on /proc (procfs, local)
linprocfs on /compat/linux/proc (linprocfs, local)
linsysfs on /compat/linux/sys (linsysfs, local)
fdescfs on /dev/fd (fdescfs)
[~]b...@alya% gpart show
=>        34  490234685  ad4  GPT  (234G)
          34        128    1  freebsd-boot  (64K)
         162   41943040    2  freebsd-ufs  (20G)
    41943202    8388608    3  freebsd-swap  (4.0G)
    50331810  209715200    4  freebsd-ufs  (100G)
   260047010   41943040    5  freebsd-ufs  (20G)
   301990050  188244669       - free -  (90G)
[~]b...@alya% gpart set -a bootme -i 2 ad4
bootme set on ad4p2
[~]b...@alya% gpart set -a bootonce -i 5 ad4
bootonce set on ad4p5
[~]b...@alya% gpart show
=>        34  490234685  ad4  GPT  (234G)
          34        128    1  freebsd-boot  (64K)
         162   41943040    2  freebsd-ufs  [bootme]  (20G)
    41943202    8388608    3  freebsd-swap  (4.0G)
    50331810  209715200    4  freebsd-ufs  (100G)
   260047010   41943040    5  freebsd-ufs  [bootonce,bootme]  (20G)
   301990050  188244669       - free -  (90G)
-----
Install i386 kernel/world to ad4p5, successful reboot, get i386 system. Next reboot (get amd64 system back):
-----
[~]b...@alya% gpart show
=>        34  490234685  ad4  GPT  (234G)
          34        128    1  freebsd-boot  (64K)
         162   41943040    2  freebsd-ufs  [bootme]  (20G)
    41943202    8388608    3  freebsd-swap  (4.0G)
    50331810  209715200    4  freebsd-ufs  (100G)
   260047010   41943040    5  freebsd-ufs  (20G)
   301990050  188244669       - free -  (90G)
-----
All seems to work fine.
Great, thanks for testing!
Any comments or suggestions? Only one for now. With current default syslog configuration logging to local0.warning and local0.info goes nowhere. It will be good if those messages have traces at the default system.
Good point. I changed those to local0.notice. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
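The boot-once behaviour described and tested in this message can be modelled with a few lines of C. This is a deliberately simplified model and not gptboot's code: the attribute constants, struct part_sketch and pick_boot_partition() are invented, and the real loader's bookkeeping for failed boots is left out. It only captures what the test shows: a partition marked bootonce is tried once, and the next boot falls back to the partition that still carries bootme.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; real GPT attribute bits are not modelled. */
#define ATTR_BOOTME	0x1u
#define ATTR_BOOTONCE	0x2u

struct part_sketch {
	unsigned int attrs;
};

/*
 * Return the index of the partition to boot from, or -1 if none qualifies.
 * A bootonce candidate is consumed (its flags are cleared) before it is
 * returned, so a failed boot cannot select it again.
 */
int
pick_boot_partition(struct part_sketch *parts, size_t nparts)
{
	size_t i;

	assert(parts != NULL);
	/* First pass: one-shot candidates take priority. */
	for (i = 0; i < nparts; i++) {
		if ((parts[i].attrs & ATTR_BOOTONCE) != 0 &&
		    (parts[i].attrs & ATTR_BOOTME) != 0) {
			parts[i].attrs &= ~(ATTR_BOOTONCE | ATTR_BOOTME);
			return ((int)i);
		}
	}
	/* Second pass: the known-good partition. */
	for (i = 0; i < nparts; i++) {
		if ((parts[i].attrs & ATTR_BOOTME) != 0)
			return ((int)i);
	}
	return (-1);
}
```

In this model, promoting the new partition after a successful boot would simply mean setting ATTR_BOOTME on it and clearing it on the old one.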
gptboot rewrite, bootonce, etc.
things will have to wait until I can sleep at nights again. Well, there is still dedup support that waits to be implemented in gptzfsboot... -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgpm1w4OWOKIR.pgp Description: PGP signature
Re: ZFS v28 is ready for wider testing.
On Fri, Sep 03, 2010 at 04:50:44PM +0100, Peter Molnar, BSD wrote: Hi, I would like to try ZFS + VirtualBox but I have got problems: 1) Linux 2.6.32-24-generic #42-Ubuntu SMP Fri Aug 20 14:21:58 UTC 2010 x86_64 GNU/Linux I tried import that file in my VirtualBox but I have got error: Failed to import appliance. /home/peter/FreeBSD/zfsv28.ovf Too many IDE controllers in OVF; import facility only supports one. Which VirtualBox version do you use? 3.2.8? Exporting appliances is a bit broken (if you have more than one disk, it will point all disks at the last one from configuration), so I had to edit .ovf file manually to fix this. Maybe I messed something up, but I was able to successfully import it before publishing it. PS. I waited for so long for decent virtualization software for FreeBSD, and I must say VirtualBox is really great, and free, and open-source Are you reading this, VMWare? -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgppp5WIVDzjJ.pgp Description: PGP signature
Re: ZFS v28 is ready for wider testing.
On Thu, Sep 02, 2010 at 01:55:51AM -0700, Rob Farmer wrote: On Tue, Aug 31, 2010 at 2:59 PM, Pawel Jakub Dawidek p...@freebsd.org wrote: Ok, now that I know you read everything carefully, here is the patch: http://people.freebsd.org/~pjd/patches/zfs_20100831.patch.bz2 buildworld on i386 (yes I know ZFS isn't ideal there): [...] Yes, I know about this problem, You can use attached patch or wait for full patch, which I'll be sending later today. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! --- sys/cddl/compat/opensolaris/sys/atomic.h +++ sys/cddl/compat/opensolaris/sys/atomic.h @@ -39,10 +39,9 @@ #ifndef __LP64__ extern void atomic_add_64(volatile uint64_t *target, int64_t delta); extern void atomic_dec_64(volatile uint64_t *target); -extern void *atomic_cas_ptr(volatile void *target, void *cmp, void *newval); #endif #ifndef __sparc64__ -extern uint64_t atomic_cas_32(volatile uint32_t *target, uint32_t cmp, +extern uint32_t atomic_cas_32(volatile uint32_t *target, uint32_t cmp, uint32_t newval); extern uint64_t atomic_cas_64(volatile uint64_t *target, uint64_t cmp, uint64_t newval); @@ -119,21 +118,19 @@ } #ifndef COMPAT_32BIT -#if defined(__LP64__) +#ifdef __LP64__ static __inline void * atomic_cas_ptr(volatile void *target, void *cmp, void *newval) { - return ((void *)atomic_cas_64((volatile uint64_t *)target, (uint64_t)cmp, - (uint64_t)newval)); + return ((void *)atomic_cas_64(target, (uint64_t)cmp, (uint64_t)newval)); } #else static __inline void * atomic_cas_ptr(volatile void *target, void *cmp, void *newval) { - return ((void *)atomic_cas_32((volatile uint64_t *)target, (uint64_t)cmp, - (uint64_t)newval)); + return ((void *)atomic_cas_32(target, (uint32_t)cmp, (uint32_t)newval)); } #endif -#endif +#endif /* !COMPAT_32BIT */ #endif /* !_OPENSOLARIS_SYS_ATOMIC_H_ */ pgppo82knRdQW.pgp Description: PGP signature
Re: ZFS v28 is ready for wider testing.
On Tue, Aug 31, 2010 at 11:59:15PM +0200, Pawel Jakub Dawidek wrote: [...] Ok, now that I know you read everything carefully, here is the patch: http://people.freebsd.org/~pjd/patches/zfs_20100831.patch.bz2
Now it is even easier to test new ZFS! :) Here you can find VirtualBox Appliance (113MB) with FreeBSD 9-CURRENT and ZFSv28: http://people.freebsd.org/~pjd/misc/FreeBSD9_ZFSv28_0.1.tgz Untar it, import it (zfsv28.ovf) to VirtualBox and have fun. You can log in as root with no password (via virtual console or via SSH). The system IP address is 192.168.56.66/24. There are 16 ada(4) disks to play with. For example:
	zfsv28:root:~# zpool create tank raidz3 ada{0,1,2,3,4,5,6,7} raidz3 ada{8,9,10,11,12,13,14,15}
	zfsv28:root:~# zpool status
	  pool: tank
	 state: ONLINE
	  scan: none requested
	config:

		NAME        STATE     READ WRITE CKSUM
		tank        ONLINE       0     0     0
		  raidz3-0  ONLINE       0     0     0
		    ada0    ONLINE       0     0     0
		    ada1    ONLINE       0     0     0
		    ada2    ONLINE       0     0     0
		    ada3    ONLINE       0     0     0
		    ada4    ONLINE       0     0     0
		    ada5    ONLINE       0     0     0
		    ada6    ONLINE       0     0     0
		    ada7    ONLINE       0     0     0
		  raidz3-1  ONLINE       0     0     0
		    ada8    ONLINE       0     0     0
		    ada9    ONLINE       0     0     0
		    ada10   ONLINE       0     0     0
		    ada11   ONLINE       0     0     0
		    ada12   ONLINE       0     0     0
		    ada13   ONLINE       0     0     0
		    ada14   ONLINE       0     0     0
		    ada15   ONLINE       0     0     0

	errors: No known data errors
-- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
Re: ZFS v28 is ready for wider testing.
On Tue, Aug 31, 2010 at 11:59:15PM +0200, Pawel Jakub Dawidek wrote: Ok, now that I know you read everything carefully, here is the patch: http://people.freebsd.org/~pjd/patches/zfs_20100831.patch.bz2 Important note. Please patch with the following command: # patch -E -p0 < zfs_20100831.patch If you don't use the -E option, patch(1) won't remove empty files and you won't be able to compile it. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
ZFS v28 is ready for wider testing.
Hello. I'd like to give you ZFS v28 for testing. If you are neither brave nor mad, you can stop here. The patchset is very experimental. It can eat your cookie and hurt your teddy bear, so be warned. Don't try it for anything except testing. This patchset is also a message we, as the FreeBSD project, would like to send to our users: Even though OpenSolaris is dead, the ZFS file system is going to stay in FreeBSD. At this point we have quite a few developers involved in ZFS on FreeBSD as well as several companies. We are also looking forward to working with IllumOS. So, what does this new ZFS bring?
- Data deduplication. Read more here: http://blogs.sun.com/bonwick/entry/zfs_dedup
- Triple parity RAIDZ (RAIDZ3). Read more here: http://dtrace.org/blogs/ahl/2009/07/21/triple-parity-raid-z/
- zfs diff. Read more here: http://arc.opensolaris.org/caselog/PSARC/2010/105/20100328_tim.haley
- zpool split. Read more here: http://arc.opensolaris.org/caselog/PSARC/2009/511/20090924_mark.musante
- Snapshot holds. Read more here: http://arc.opensolaris.org/caselog/PSARC/2009/297/20090511_chris.kirby
- zpool import -F. Allows rewinding a corrupted pool to an earlier transaction group.
- Possibility to import a pool in read-only mode.
And much, much more, including plenty of performance improvements and bug fixes. So test whatever you can and report back. Look for regressions, strange behaviour, missing features, deadlocks, livelocks, performance degradation, etc. The boot code is not updated at all, so booting off of ZFS doesn't currently work. The patch is against today's FreeBSD HEAD. The patch enables (in sys/modules/zfs/Makefile) ZFS internal debugging, please don't turn it off. Also, compile your kernel with the following options:
	options KDB
	options DDB
	options INVARIANTS
	options INVARIANT_SUPPORT
	options WITNESS
	options WITNESS_SKIPSPIN
	options DEBUG_LOCKS
	options DEBUG_VFS_LOCKS
Ignore all the LOR (Lock Order Reversal) reports from WITNESS.
There will be plenty of those, and you'll desperately want to report them, but please don't. The best way to report a problem is to answer to this e-mail with as short as possible procedure of how to reproduce it and debugging info. I'd prefer textdump if possible. Below you can find quick procedure how to setup textdumps: Choose spare/swap disk/partition in your system, let's say it is /dev/ad0s1b. Add the following line to /etc/fstab: /dev/ad0s1b noneswapsw 0 0 Add the following line to /etc/rc.conf: ddb_enable=YES Run the following commands: # /etc/rc.d/swap1 start # /etc/rc.d/dumpon start # /etc/rc.d/ddb start This will setup swap, mark it as dump device and setup some DDB scripts. Or you can just reboot. Now when your system panic or deadlock, enter DDB and call the following command: ddb run kdb.enter.panic It will execute all the commands I need, dump them in text format to your swap device and reboot machine. After the reboot, you should find textdump.tar.0 file in /var/crash/ directory. This is the debug info I need. End of textdumps procedure. Ok, now that I know you read everything carefully, here is the patch: http://people.freebsd.org/~pjd/patches/zfs_20100831.patch.bz2 Good luck! : -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! pgpGVyTUV4RIm.pgp Description: PGP signature
Re: [CFT] Improved ZFS metaslab code (faster write speed)
On Sat, Aug 28, 2010 at 05:03:42AM -0400, jhell wrote:
> On 08/28/2010 04:20, Andriy Gapon wrote:
> > on 28/08/2010 04:24 jhell said the following:
> > > The modified patch from avg@ (portion patch) is:
> > >
> > > #ifdef _KERNEL
> > > 	if (arc_reclaim_needed()) {
> > > 		needfree = 0;
> > > 		wakeup(&needfree);
> > > 	}
> > > #endif
> > >
> > > I still moved that down to below _KERNEL for the obvious reasons. But when I was using the original patch with if (needfree) I noticed a performance degradation after ~12 hours of use with and without UMA turned on. So far with ~48 hours of testing with the top half of that being with the above change, I have not seen more degradation of
> >
> > This is quite unexpected. needfree should be checked as the very first thing in arc_reclaim_needed() [unless you have patched it locally]. So if needfree is 1 then arc_reclaim_needed() should also return 1. But the converse is not true, arc_reclaim_needed() may return 1 even if needfree is zero. So if your testing results are conclusive then it must mean that some extra wakeups on needfree are needed. I.e. needfree is zero, so there shouldn't be anything waiting on it (see arc_lowmem) and no notification should be needed, but issuing it somehow does make a difference. Hmm...
>
> I will look further into this and see if I can throw a counter around it or some printf's so I can at least log what it's doing in both instances. I thought the very same thing you said above when I saw your patch for that and was astounded at the results that were returned from it. So in short testing I reverted it back quickly to see if that was the cause of the problem and sure enough everything resumed to the way it was before.
>
> Anyway thanks for the reply. I will get back to you if I see anything cool arise from this.

Could you include the following patch in your testing:

	http://people.freebsd.org/~pjd/patches/arc.c.9.patch

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
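The relationship Andriy describes in the message above (needfree set implies arc_reclaim_needed() is true, but not the other way round) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: other_pressure and the two wakeups_* helpers are hypothetical names invented for the illustration, and the real arc_reclaim_needed() checks several kernel memory conditions that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * User-space model, not kernel code. Only needfree and
 * arc_reclaim_needed() are names from the thread; everything else
 * is hypothetical.
 */
int needfree;		/* non-zero when a thread waits for memory */
bool other_pressure;	/* hypothetical: any other low-memory signal */

/* needfree is tested first, so needfree != 0 implies "reclaim needed". */
int
arc_reclaim_needed(void)
{
	if (needfree)
		return (1);
	return (other_pressure ? 1 : 0);
}

/* Original patch: wake up waiters only when needfree is set. */
int
wakeups_with_needfree_check(void)
{
	return (needfree ? 1 : 0);
}

/* Modified patch: wake up whenever arc_reclaim_needed() is true. */
int
wakeups_with_reclaim_check(void)
{
	return (arc_reclaim_needed() ? 1 : 0);
}
```

With needfree equal to zero and other_pressure set, only the second variant issues a wakeup, which is the "extra wakeup" case the thread is puzzling over.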
Re: Mounting cd9660 multiple times gives EBUSY [Was: unionfs a little improvement]
On Wed, Aug 18, 2010 at 12:48:53PM +0200, Ed Schouten wrote:
> Hi Daichi,
>
> I think Keith Packard of Xorg once wrote a commit message along the lines of "5000 lines of code removed, feature added". This seems to be similar, albeit on a smaller scale. ;-)
>
> Apart from this issue with unionfs, I am also experiencing another issue, where for some reason I cannot perform a second mount of the CD right after booting the system. Basically, my WIP FreeBSD boot CD does the following (but written in C):
>
>	mount -t cd9660 /dev/iso9660/freebsd /mnt
>	mount -t tmpfs none /tmp
>	mount -t unionfs /tmp /mnt
>	mount -t devfs none /mnt/dev
>	chroot /mnt /sbin/init
>
> The first step fails with EBUSY. I use the following hack to get it working, but I don't think it's the proper way to solve it:

What you are trying to do here is to mount /dev/iso9660/freebsd for the second time? This is not supported. The check is there to prevent doing this, as it will panic on you when you try to unmount the first mount (not really a problem in your case, as the first mount is /, so you probably don't want to unmount it, but it is a problem in general).

You should be able to reproduce the panic with your patch applied by doing the following:

	# mount -t cd9660 /dev/iso9660/freebsd /mnt0
	# mount -t cd9660 /dev/iso9660/freebsd /mnt1
	# umount /mnt0

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: glabel force sectorsize patch
On Sun, Aug 08, 2010 at 02:02:17PM +0200, Ivan Voras wrote:
> On 8.8.2010 12:30, Pawel Jakub Dawidek wrote:
> > So why do you want to obfuscate glabel with it? For people to start depending on it? Once we start supporting 4kB sectors what do we do with such a change? Remove it and decrease the version number? What will people do with providers already labeled this way?
> >
> > If it's temporary, just allow listing the providers whose sector size you want to increase in /boot/loader.conf. Once we start supporting it properly people might simply remove it from loader.conf and it should just work. Glabel is not for that and I don't agree to such obfuscation.
>
> Of course, there are good and bad sides to it. My take on it is that the only bad side is that it really isn't glabel's primary function to (optionally) fixup geometry, while the good sides are:

It isn't its secondary function either.

> * glabel is in GENERIC and judging by the mailing lists' traffic it is one of the better used parts of the system so people are familiar with it. It is also already used as a perfectly valid fixup for device renaming, making both UFS and ZFS more stable for usage.

That's an excellent argument. But you know what? The em(4) driver is also in GENERIC, why not add it in there?

> * You can't really make people depend on glabel both because it is in GENERIC and because of it storing metadata in the last sector, making the rest of the drive completely usable without it in the event native 4k sector support is grown.

I never said that. I do want people to depend on glabel, because it is free of such ugly hacks, so I know it won't bite them in the future. I don't want people to start depending on the fact that glabel supports changing sector sizes. Once we start supporting 4kB sectors properly, people's configurations will stop working, because glabel won't be able to read its metadata anymore. Your hack will break all configurations that started to depend on your hack.

In what I proposed, the GEOM provider will be presented to glabel (or any other GEOM class) as a 4kB provider and everything will just work, also after adding proper support for 4kB sectors.

> I'd like to hear comments from the wider audience. In respect with your comment, I will compromise: as 4k sector drives have become available over the counter more than 6 months ago and so far I think this is the first effort to give some support for them, I will commit this patch before 9.0 code freeze only if no other support gets developed.

I'll repeat. You won't commit this patch, because it is a totally wrong solution and can only do a lot of damage in the future. If you look forward, even temporary solutions can be done right.

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: glabel force sectorsize patch
On Sun, Aug 08, 2010 at 02:57:20PM +0200, Marius Nünnerich wrote:
> On Sun, Aug 8, 2010 at 14:02, Ivan Voras ivo...@freebsd.org wrote:
> > I'd like to hear comments from the wider audience. In respect with your comment, I will compromise: as 4k sector drives have become available over the counter more than 6 months ago and so far I think this is the first effort to give some support for them, I will commit this patch before 9.0 code freeze only if no other support gets developed.
>
> I do not like this at all. Even if it's just for the KISS and POLA principles. A geom should do one thing and do it right imo. Why not write a new geom class that does what you want?

A new GEOM class only for sectorsize conversion that can operate on metadata will be useful, not only to solve this particular problem. Although keep in mind that if at some point disks are detected and presented to GEOM as 4kB providers, this class won't be able to find its metadata anymore (as it was stored in the last 512 bytes, not in the last 4 kilobytes).

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
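The metadata problem mentioned in the reply above comes down to simple arithmetic: a class that keeps its metadata in the provider's last sector looks for it at mediasize minus sectorsize. The sketch below is a user-space illustration only; last_sector_offset() and metadata_found() are hypothetical helper names, not GEOM functions, and the model ignores everything except the offset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset of the last sector of a provider. Hypothetical helper;
 * real GEOM classes compute this from pp->mediasize and pp->sectorsize.
 */
int64_t
last_sector_offset(int64_t mediasize, int64_t sectorsize)
{
	assert(sectorsize > 0 && mediasize >= sectorsize);
	assert(mediasize % sectorsize == 0);
	return (mediasize - sectorsize);
}

/*
 * Returns 1 if metadata written when the provider had sector size
 * written_ss sits at the offset probed when the provider reports
 * sector size probed_ss, 0 otherwise.
 */
int
metadata_found(int64_t mediasize, int64_t written_ss, int64_t probed_ss)
{
	return (last_sector_offset(mediasize, written_ss) ==
	    last_sector_offset(mediasize, probed_ss));
}
```

For a 1 GiB provider, metadata written at 512-byte granularity lives at offset 1073741312, while a 4096-byte provider is probed at 1073737728, 3584 bytes earlier, so the class does not see its metadata where it expects it.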
Re: AESNI driver and fpu_kern KPI
On Sat, May 15, 2010 at 01:04:01PM +0300, Kostik Belousov wrote:
> Hello,
> please find at
> http://people.freebsd.org/~kib/misc/aesni.1.patch
> the combined patch, containing the fpu_kern KPI and Intel AESNI crypto(9) driver. I did development and some testing on the hardware generously provided by Sentex Communications to Netperf cluster.

Nice work. A few comments:

- Could you modify this chunk in padlock.c:

	+	td = curthread;
	+	error = fpu_kern_enter(td, &ses->ses_fpu_ctx);
	+	if (error != 0)
	+		goto out;
		error = padlock_hash_setup(ses, macini);
	+	fpu_kern_leave(td, &ses->ses_fpu_ctx);
	+ out:

  To something without goto, e.g.:

		td = curthread;
		error = fpu_kern_enter(td, &ses->ses_fpu_ctx);
		if (error == 0) {
			error = padlock_hash_setup(ses, macini);
			fpu_kern_leave(td, &ses->ses_fpu_ctx);
		}

- I see that in sys/dev/random/nehemiah.c you don't check the return value of fpu_kern_enter(). That's the only place where you ignore it. Is that intended?

- Unfortunately the driver in its current version can't be used with IPsec and with GELI where authentication is enabled. This is because the driver doesn't support sessions where both encryption and authentication are defined. Do you have plans to change it? I saw that you based the crypto(9) bits on padlock, which does support sessions with authentication by calculating hashes in software.

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: Switchover to CAM ATA?
On Mon, Apr 26, 2010 at 10:33:27AM -0600, M. Warner Losh wrote:
> I've read most of this thread. I think this is cool technology. However, before we move forward with this, we need to have a plan for the various issues that have come up. The plan needs to be specific, have owners for key items, warnings about ownerless == obsoleted, and target dates.
>
> I think this is one of the cases where we should record the plan of record on a wiki. It worked well for other times we've had big, disruptive changes.
>
> My opinion for the path forward:
>
> (1) Send a big heads up about the future of ataraid(5). It will be shot in the head soon, to be replaced by a bunch of geom classes for each different container format. At least that seems to be the rough consensus I've seen so far. We need worker bees to do many of these classes, although much can be mined from the ataraid code today.

This shouldn't be a bunch of GEOM classes. This should be one class which recognizes multiple formats, just like the LABEL class. I don't think it is feasible to reuse gmirror for that, it wasn't designed with something like this in mind.

> (2) Send another big heads up strongly recommending people go to glabel based fstabs. Maybe the right option here is to provide a simple script to walk people through the conversion. This will render the carnage of ad -> ada (or da) a mostly non-event, and also protect people from 'oops' of rebooting with that thumb drive in the system.
> (3) Create a wiki to record all the new geom classes needed. Find people to own each one, or note it is unowned, and support will be dropped if no owner can be found.
> (4) sysinstall should default to creating label systems, if it doesn't already.
> (5) Issues with glabel and ataraid(5) need an owner, and need to be resolved, since the device names here are likely to change.

What are the issues?

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: Switchover to CAM ATA?
On Mon, Apr 26, 2010 at 12:19:46PM -0600, M. Warner Losh wrote:
> In message: 20100426181209.gb3...@garage.freebsd.pl
>             Pawel Jakub Dawidek p...@freebsd.org writes:
> : On Mon, Apr 26, 2010 at 10:33:27AM -0600, M. Warner Losh wrote:
> : > I've read most of this thread. I think this is cool technology.
> : > However, before we move forward with this, we need to have a plan for
> : > the various issues that have come up. The plan needs to be specific,
> : > have owners for key items, warnings about ownerless == obsoleted, and
> : > target dates.
> : >
> : > I think this is one of the cases where we should record the plan of
> : > record on a wiki. It worked well for other times we've had big,
> : > disruptive changes.
> : >
> : > My opinion for the path forward:
> : > (1) Send a big heads up about the future of ataraid(5). It will be
> : >     shot in the head soon, to be replaced by a bunch of geom classes
> : >     for each different container format. At least that seems to be
> : >     the rough consensus I've seen so far. We need worker bees to do
> : >     many of these classes, although much can be mined from the ataraid
> : >     code today.
> :
> : This shouldn't be a bunch of GEOM classes. This should be one class which
> : recognizes multiple formats, just like the LABEL class.
> : I don't think it is feasible to reuse gmirror for that, it wasn't
> : designed with something like this in mind.
>
> OK. Maybe I got the consensus wrong... My key point is that we need a plan moving forward, we need to identify what's actively being worked on vs "somebody else[tm] should do this" and when it needs to be done or else.

You most likely got it right, I'm just saying that creating a separate GEOM class for each metadata format is the wrong direction. :)

> : > (5) Issues with glabel and ataraid(5) need an owner, and need to be
> : >     resolved, since the device names here are likely to change.
> :
> : What are the issues?
>
> ataraid doesn't remove the underlying ad* devices, so glabel often picks those up instead of the ataraid device, and you only get 1 disk's worth of raid device... So no mirroring or only 1/2 a striped volume.

It not only leaves the ad* devices, it doesn't even open them properly using GEOM. It's an internal ATA hack, which is a PITA.

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: ZFS behavior when device disappears
On Tue, Apr 13, 2010 at 05:39:30PM -0600, Jason J. W. Williams wrote:
> Hello,
>
> Currently, we're an OpenSolaris shop but with the way things are going over at Oracle/Sun we're starting to evaluate our options for keeping ZFS but moving off Solaris. One of my concerns is that FreeBSD is implementing ZFSv14 (ZFS itself is up to v23 I believe). For quite a long time, ZFS under Solaris had a real problem with the following scenario:
>
> * Hard drive starts to die
> * Controller and SCSI subsystem continue to retry an I/O rather than failing fast
> * Even if the I/O does fail fast ZFS doesn't really notice a spike in I/O failures and continues to use the drive.
> * Result: I/O on the zpool stalls completely while the I/Os continue to be tried against the drive.
>
> This got fixed in later revs of OpenSolaris by enhancements to ZFS and greater integration with the Fault Management Architecture (FMA) of Solaris...lots of I/Os failing on a drive get communicated to ZFS who then offlines the drive out of the pool.
>
> My question is, what is the situation in FreeBSD 8 with ZFS if that type of situation occurs?

I believe FreeBSD does whatever OpenSolaris did for this version of ZFS. There is ongoing work to bring v24 to FreeBSD. Basic functionality works already, but a lot of work is still needed. At some point I'll see what we can do about it, because we don't have FMA in FreeBSD and we would need to find another way to deal with it. I have limited time I can spend on ZFS right now, so I'm making small steps, but I'm making good progress too.

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: ZFS behavior when device disappears
On Tue, Apr 20, 2010 at 07:24:53AM -0600, Jason J. W. Williams wrote:
> Hi Pawel,
>
> Thank you very much for the response! Please forgive some of my questions, as I'm a bit unfamiliar with the FreeBSD port. What is the nature of the port? Is it something where each new version of ZFS is a from-scratch effort to some degree? Or is it a point where new ZFS versions are a matter of just making the newer features operational?

Definitely the latter, but there are some problems:

- Some changes in OpenSolaris ZFS are very hard to port in a short time, and when it takes a lot of time, new versions arrive and it is nice to get them too, etc., which makes the whole process take a long time. A good example here is moving some functionality to Python, where we have to decide what to do about that without importing Python into the base system.

- OpenSolaris ZFS is experimental and I don't think the Solaris version is published anywhere. This means it needs extensive testing on our side, which of course takes time.

- OpenSolaris changes are often not easy to understand. They have different commit rules than we have. Commit logs are not very helpful and multiple fixes are committed in one go, which makes it hard to separate individual changes if we just need a fix and not an intrusive change that came along.

I'm doing my best, but my time is limited. I see more and more people are interested in helping with ZFS, which is a very good sign I was waiting for for a long time:)

It is of course still wonderful that we can use ZFS. All my servers and my laptop are running exclusively on ZFS at this point:)

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: Increasing MAXPHYS
On Mon, Mar 22, 2010 at 08:23:43AM +, Poul-Henning Kamp wrote:
> In message 4ba633a0.2090...@icyb.net.ua, Andriy Gapon writes:
> on 21/03/2010 16:05 Alexander Motin said the following:
> Ivan Voras wrote:
> Hmm, it looks like it could be easy to spawn more g_* threads (and, barring specific class behaviour, it has a fair chance of working out of the box) but the incoming queue will need to also be broken up for greater effect.
>
> According to notes, looks there is a good chance to obtain races, as some places expect only one up and one down thread.
>
> I haven't given any deep thought to this issue, but I remember us discussing them over beer :-)
>
> The easiest way to obtain more parallelism, is to divide the mesh into multiple independent meshes.
>
> This will do you no good if you have five disks in a RAID-5 config, but if you have two disks each mounted on its own filesystem, you can run a g_up g_down for each of them.

A class is supposed to interact with other classes only via GEOM, so I think it should be safe to choose g_up/g_down threads for each class individually, for example:

	/dev/ad0s1a (DEV)
	       |
	g_up_0 + g_down_0
	       |
	  ad0s1a (BSD)
	       |
	g_up_1 + g_down_1
	       |
	  ad0s1 (MBR)
	       |
	g_up_2 + g_down_2
	       |
	  ad0 (DISK)

We could easily calculate the g_down thread based on bio_to->geom->class and the g_up thread based on bio_from->geom->class, so we know I/O requests for our class are always coming from the same threads.

If we could make the same assumption for geoms it would allow for even better distribution.

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
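The per-class thread selection proposed above could be sketched roughly as follows. This is a user-space illustration and not GEOM code: the structures are cut down to the fields the idea needs, the id field and the G_NTHREADS constant are assumptions made for the example, and the functions g_down_thread() and g_up_thread() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define G_NTHREADS 3	/* hypothetical number of g_up/g_down pairs */

/* Cut-down stand-ins for the GEOM structures. */
struct g_class    { const char *name; unsigned int id; /* id: assumed */ };
struct g_geom     { struct g_class *class; };
struct g_provider { struct g_geom *geom; };
struct g_consumer { struct g_geom *geom; };
struct bio {
	struct g_consumer *bio_from;	/* who issued the request */
	struct g_provider *bio_to;	/* where the request is going */
};

/* Requests going down are queued by the class of the target provider. */
unsigned int
g_down_thread(const struct bio *bp)
{
	assert(bp != NULL && bp->bio_to != NULL);
	return (bp->bio_to->geom->class->id % G_NTHREADS);
}

/* Completions going up are queued by the class of the issuing consumer. */
unsigned int
g_up_thread(const struct bio *bp)
{
	assert(bp != NULL && bp->bio_from != NULL);
	return (bp->bio_from->geom->class->id % G_NTHREADS);
}
```

Because the index depends only on the class, every request for a given class is handled by the same pair of threads, which is the property the message relies on to avoid races inside a class.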
Re: check for jailed environment for adjkerntz
On Mon, Mar 01, 2010 at 02:15:41AM +0300, Subbsd wrote:
> jail with complete type have standard crontab a file of tasks. However not all standard task are adapted for work in jail an environment. For example adjkerntz which generates
>
>	adjkerntz [46733]: sysctl (set: machdep.wall_cmos_clock): Operation not permitted
>
> I suggest to give adjkerntz concept about jail in which to it it is not necessary to work:
> [...]

I have always found that annoying too, but only your e-mail made me think about ways to fix it. Maybe a simple patch like the one below will do?

--- etc/crontab	(revision 204363)
+++ etc/crontab	(working copy)
@@ -22,4 +22,4 @@
 #
 # Adjust the time zone if the CMOS clock keeps local time, as opposed to
 # UTC time.  See adjkerntz(8) for details.
-1,31	0-5	*	*	*	root	adjkerntz -a
+1,31	0-5	*	*	*	root	[ `sysctl -n security.jail.jailed` -eq 0 ] && adjkerntz -a

--
Pawel Jakub Dawidek http://www.wheel.pl
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
HAST (Highly Available Storage) now in HEAD.
Hi.

Yesterday I committed HAST to the HEAD branch.

HAST makes it possible to transparently store data on two physically separated machines connected over a TCP/IP network. HAST works in a Primary-Secondary (Master-Backup, Master-Slave) configuration, which means that only one of the cluster nodes can be active at any given time. Only the Primary node is able to handle I/O requests to HAST-managed devices. Currently HAST is limited to two cluster nodes in total.

HAST operates on the block level: it provides disk-like devices in the /dev/hast/ directory for use by file systems and/or applications. Working on the block level makes it transparent for file systems and applications. There is no difference between using a HAST-provided device and a raw disk, partition, etc. All of them are just regular GEOM providers in FreeBSD.

For more information please consult the hastd(8), hastctl(8) and hast.conf(5) manual pages, as well as:

	http://wiki.FreeBSD.org/HAST

On the wiki page above you should find instructions on how to initialize HAST and integrate it with ucarp. Let me know (using the freebsd...@freebsd.org mailing list) if you have any questions or comments.

And last, but not least, I'd like to thank the sponsors who made this project possible:

	The FreeBSD Foundation, http://www.freebsdfoundation.org
	OMCnet Internet Service GmbH, http://www.omc.net
	TransIP BV, http://www.transip.nl

--
Pawel Jakub Dawidek http://www.wheel.pl
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: ZFS: statfs and recordsize problem
On Thu, Feb 18, 2010 at 03:39:28PM +0300, Alexander Zagrebin wrote:
> I have noticed, that statfs called for ZFS file systems, returns the value of FS's recordsize property in both f_bsize and f_iosize. It's a problem for some software. For example, squid uses block size of cache's file system to calculate the space occupied by file. So by default it considers that any small file uses 128KB of a cache (when default value of recordsize is used), though really this file may use 512B only. This miscalculation leads to unreasonable cleaning of a cache.
>
> IMHO the behavior of statfs have to be changed, as ZFS uses variable (up to recordsize) block sizes. It must return 512 as f_bsize and recordsize as f_iosize.
>
> One of possible solutions is the attached patch. Could somebody look it?

I committed (a slightly modified version of) your patch to HEAD. Thanks!

--
Pawel Jakub Dawidek http://www.wheel.pl
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
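The miscalculation described in the report above is ordinary round-up arithmetic. The sketch below illustrates it in user space; space_charged() is a hypothetical helper written for this example and is not taken from squid or from the committed patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Space an application charges for a file when it rounds the file size
 * up to the block size reported in f_bsize. Hypothetical helper.
 */
uint64_t
space_charged(uint64_t file_size, uint64_t f_bsize)
{
	assert(f_bsize > 0);
	return ((file_size + f_bsize - 1) / f_bsize * f_bsize);
}
```

With f_bsize equal to the default 128 KB recordsize, a 512-byte file is charged as 131072 bytes. With f_bsize reported as 512, the same file is charged as 512 bytes, which matches the behaviour the report asks for.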
Re: jail and emulators/linux_base
On Wed, Dec 03, 2003 at 10:22:16AM +0100, Niklas Saers Mailinglistaccount wrote:
+ I'm running CURRENT and set up a jail where I want to install SUN JDK
+ 1.4.2. In the process, linux emulation needs to be installed. While
+ installing emulators/linux_base, I get the following:
+
+ ===> Installing for linux_base-7.1_5
+ Un-mounting linprocfs...
+ umount: retrying using path instead of file system ID
+ ===> Generating temporary packing list
+ ===> Checking if emulators/linux_base already installed
+ mknod: /compat/linux/dev/null: Operation not permitted
+ *** Error code 1
+
+ While Linux-emulation is already up and running on the host-machine, it
+ seems the jail is not allowed to create what it needs to run it. I
+ understand allowing mknod(8) within a jail is dangerous in the case where
+ you allow untrusted users to be root. Is there some way to either say "I
+ don't let untrusted users be root" thus allowing this or to compile
+ emulators/linux_base more jail-friendly, possibly setting things up from
+ outside the jail?

Erm. You may install it using chroot(8) only and then run the jail with the same path. You may also use chroot(8) instead of jail if you're looking for full functionality.

--
Pawel Jakub Dawidek [EMAIL PROTECTED]
UNIX Systems Programmer/Administrator http://garage.freebsd.pl
Am I Evil? Yes, I Am! http://cerber.sourceforge.net
Panic: if_simloop: attempted use of a free mbuf!
Hello.

I'm reaching an assertion from /sys/net/if_loop.c:270. This is very easy to reproduce.

First you need to put loopback into promiscuous mode:

	# tcpdump -i lo0

Then try to connect to loopback, for example:

	# telnet 127.0.0.1 22

Enjoy!:)

--
Pawel Jakub Dawidek [EMAIL PROTECTED]
UNIX Systems Programmer/Administrator http://garage.freebsd.pl
Am I Evil? Yes, I Am! http://cerber.sourceforge.net
Re: panic: sleeping without a mutex (acd related)
On Tue, Nov 25, 2003 at 11:21:03AM +0100, Christian Laursen wrote:
+ I have been experiencing some random lockups after upgrading from
+ 5.1-RELEASE to 5.2-BETA.
+
+ I then went on and enabled all the debug options in my kernel config
+ hoping to be able to find the cause.
+
+ But now I cannot boot at all. In the end of the boot process when
+ detecting ATA drives, I get this:
+
+ ad0: 76319MB ST380011A [155061/16/63] at ata0-master UDMA100
+ acd0-5: CDROM with 6 CD changer CD-C68E at ata1-master PIO4
+ acd6: DVDROM CREATIVEDVD5240E-1 at ata1-slave PIO4
+ panic: sleeping without a mutex
+ Debugger("panic")
+ Stopped at Debugger+0x54: xchgl %ebx,in_Debugger.0
+ db>
+ db> trace
+ Debugger(c06e3744,c07549a0,c06e3ec9,d861ab60,100) at Debugger+0x54
+ panic(c06e3ec9,0,c06e3eb8,c06d6584,10) at panic+0xd5
+ msleep(c45173d8,0,4c,c06d6584,0) at msleep+0x505
+ acd_geom_access(c452de00,1,0,0,0) at acd_geom_access+0x115

Yeah. There are two calls to tsleep(9) without a timeout set (at lines 499 and 509), so this KASSERT is reached:

	KASSERT(timo != 0 || mtx_owned(&Giant) || mtx != NULL,
	    ("sleeping without a mutex"));

--
Pawel Jakub Dawidek [EMAIL PROTECTED]
UNIX Systems Programmer/Administrator http://garage.freebsd.pl
Am I Evil? Yes, I Am! http://cerber.sourceforge.net
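The condition in the KASSERT quoted in the message above can be restated as a plain predicate, which makes it easier to see why a tsleep(9) call with no timeout and without Giant held trips it. This is a user-space sketch; sleep_allowed() and its boolean parameters are hypothetical stand-ins for the kernel state tested by the assertion.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Mirrors the quoted KASSERT: sleeping is acceptable if a timeout is
 * set, or Giant is held, or the caller passed a mutex.
 */
bool
sleep_allowed(int timo, bool giant_owned, bool mtx_given)
{
	return (timo != 0 || giant_owned || mtx_given);
}
```

A tsleep(9) call passes no mutex, so with a zero timeout the only way to satisfy the condition is to hold Giant. In the reported panic none of the three conditions held.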
Panic after mount() fail.
Hello.

There is a problem with mount(2) failures. It can cause panics.

How-to-repeat.

	# dd if=/dev/random of=/test.img bs=1m count=8
	# mdconfig -a -t vnode -f /test.img -u 25
	# mkdir -p /mnt/test
	# mount /dev/md25 /mnt/test
	(fail)
	# mount /dev/md25 /mnt/test
	(panic "Memory modified after free ...")

This is because the mutex is not destroyed on failure.

Patch:

--- vfs_mount.c.orig	Sun Nov 16 15:46:56 2003
+++ vfs_mount.c	Sun Nov 16 15:21:48 2003
@@ -1061,6 +1061,7 @@ update:
 			vfs_unbusy(mp, td);
 		else {
 			mp->mnt_vfc->vfc_refcount--;
+			mtx_destroy(&mp->mnt_mtx);
 			vfs_unbusy(mp, td);
 #ifdef MAC
 			mac_destroy_mount(mp);
@@ -1142,6 +1143,7 @@ update:
 		vp->v_iflag &= ~VI_MOUNT;
 		VI_UNLOCK(vp);
 		mp->mnt_vfc->vfc_refcount--;
+		mtx_destroy(&mp->mnt_mtx);
 		vfs_unbusy(mp, td);
 #ifdef MAC
 		mac_destroy_mount(mp);

--
Pawel Jakub Dawidek [EMAIL PROTECTED]
UNIX Systems Programmer/Administrator http://garage.freebsd.pl
Am I Evil? Yes, I Am! http://cerber.sourceforge.net
LOR (ffs_snapshot.c:651 vm_map.c:2258).
Hello.

lock order reversal
 1st 0xc66a6db0 vnode interlock (vnode interlock) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:651
 2nd 0xc0c2f110 system map (system map) @ /usr/src/sys/vm/vm_map.c:2258
Stack backtrace:
backtrace(c05bbfcb,c0c2f110,c05c650b,c05c650b,c05c6581) at backtrace+0x17
witness_lock(c0c2f110,8,c05c6581,8d2,c0c2f0b0) at witness_lock+0x686
_mtx_lock_flags(c0c2f110,0,c05c6581,8d2,c6aee000) at _mtx_lock_flags+0xb5
_vm_map_lock(c0c2f0b0,c05c6581,8d2,c69e61b0,0) at _vm_map_lock+0x36
vm_map_remove(c0c2f0b0,c6aee000,c6af,e1b1a7f0,c0555f99) at vm_map_remove+0x30
kmem_free(c0c2f0b0,c6aee000,2000,e1b1a80c,c05579f9) at kmem_free+0x32
page_free(c6aee000,2000,22,c060c4b8,c05e9100) at page_free+0x3a
uma_large_free(c69e61b0,e1b1a83c,c0487f64,c66a6db0,2000) at uma_large_free+0xf9
free(c6aee000,c05e9100,c05c3358,28b,c25aff00) at free+0xe9
ffs_snapshot(c6522600,80c39a0,70,c04b5d36,c060d3e0) at ffs_snapshot+0x23f4
ffs_mount(c6522600,c69c4380,bfbffcc0,e1b1abf0,c6496720) at ffs_mount+0x617
vfs_mount(c6496720,c258ecd0,c69c4380,1211000,bfbffcc0) at vfs_mount+0x7d1
mount(c6496720,e1b1ad14,c05cd44e,3ee,4) at mount+0xba
syscall(2f,2f,2f,0,bfbffdc0) at syscall+0x28f
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (21), eip = 0x80557bb, esp = 0xbfbffb6c, ebp = 0xbfbffd48 ---

--
Pawel Jakub Dawidek [EMAIL PROTECTED]
UNIX Systems Programmer/Administrator http://garage.freebsd.pl
Am I Evil? Yes, I Am! http://cerber.sourceforge.net
Panic (in_pcb.c:866).
Hello.

I got this panic while doing 'killall -9 ppp' on FreeBSD 5-CURRENT (kernel from October 31st):

panic: mtx_lock() of spin mutex @ /usr/src/sys/netinet/in_pcb.c:866
[...]
db> trace
[...]
Debugger [...]
panic [...]
_mtx_lock_flags [...]
[...] in_losing+0x40
[...] tcp_timer_rexmt+0x23e
[...] softclock+0x1ad
[...] ithread_loop+0x177
[...] fork_exit+0xb5
[...] fork_trampoline+0x8
[...]

--
Pawel Jakub Dawidek [EMAIL PROTECTED]
UNIX Systems Programmer/Administrator http://garage.freebsd.pl
Am I Evil? Yes, I Am! http://cerber.sourceforge.net
Panic (route.c:99).
Hello.

Kernel from October 31st, while doing 'killall ssh'.

panic: mtx_lock() of spin mutex @ /usr/src/sys/net/route.c:99
db> trace
Debugger [...]
panic [...]
_mtx_lock_flags [...]
[...] rtalloc_ign+0x4b
[...] rtalloc+0x19
[...] tcp_rtlookup+0x39
[...] tcp_gettaocache+0x11
[...] tcp_output+0x161
[...] tcp_usr_shutdown+0xb2
[...] soshutdown+0x42
[...] shutdown+0x6c
[...] syscall+0x28f
[...] Xint0x80_syscall+0x1d

--
Pawel Jakub Dawidek [EMAIL PROTECTED]
UNIX Systems Programmer/Administrator http://garage.freebsd.pl
Am I Evil? Yes, I Am! http://cerber.sourceforge.net
LOR (rtsock.c:388 route.c:133).
Hello.

A similar one was reported, but not exactly this one:

lock order reversal
 1st 0xc4516e90 rtentry (rtentry) @ /usr/src/sys/net/rtsock.c:388
 2nd 0xc43e327c radix node head (radix node head) @ /usr/src/sys/net/route.c:133
Stack backtrace:
backtrace(c05c6672,c43e327c,c05cb651,c05cb651,c05cb6a7) at backtrace+0x17
witness_lock(c43e327c,8,c05cb6a7,85,c437c300) at witness_lock+0x686
_mtx_lock_flags(c43e327c,0,c05cb6a7,85,c4516c90) at _mtx_lock_flags+0xb4
rtalloc1(c4539a6c,1,1,435,0) at rtalloc1+0x74
rt_setgate(c4516e00,c437c300,c4539a6c,184,0) at rt_setgate+0x23c
route_output(c1926700,c44f8dd0,8c,c1926700,1f74) at route_output+0x674
raw_usend(c44f8dd0,0,c1926700,0,0) at raw_usend+0x76
rts_send(c44f8dd0,0,c1926700,0,0) at rts_send+0x35
sosend(c44f8dd0,0,e8916c80,c1926700,0) at sosend+0x429
soo_write(c4497374,e8916c80,c452d480,0,c446b4c0) at soo_write+0x92
dofilewrite(c446b4c0,c4497374,2,bfbfeab0,8c) at dofilewrite+0xe3
write(c446b4c0,e8916d14,c05d6e5b,3f0,3) at write+0x6f
syscall(2f,2f,2f,2,3) at syscall+0x28f
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (4), eip = 0x2826e173, esp = 0xbfbfe89c, ebp = 0xbfbfe8c8 ---

--
Pawel Jakub Dawidek [EMAIL PROTECTED]
UNIX Systems Programmer/Administrator http://garage.freebsd.pl
Am I Evil? Yes, I Am! http://cerber.sourceforge.net
Re: HEADSUP: MPSAFE network drivers
On Wed, Oct 29, 2003 at 11:52:48AM -0700, Sam Leffler wrote:
+ I'm committing changes to mark various network drivers' interrupt handlers
+ MPSAFE. To insure folks have a way to backout if they hit problems I've also
+ added a tunable that lets you disable this w/o rebuilding your kernel. By
+ default all network drivers that register an interrupt handler INTR_MPSAFE
+ are setup to run their ISR w/o Giant. If you want to defeat this w/o
+ changing the code you can set
+
+ debug.mpsafenet=0
+
+ from the loader when booting and the MPSAFE bit will automatically be removed.
+ I plan to use this to also control forthcoming changes for registering MPSAFE
+ netisrs.
+
+ The following drivers are marked MPSAFE:
+
+ ath, em, ep, fxp, sn, wi, sis
+
+ I've got changes coming for bge. Other drivers probably can be marked MPSAFE
+ but I'm only doing it for those drivers that I can test.

Because there are so many drivers, maybe you could prepare some
regression tests designed to exercise the things you changed. That would
let people test your changes. It is not very easy now, because we don't
know exactly what we're looking for, and on top of that those drivers
aren't marked MPSAFE by default.

--
Pawel Jakub Dawidek
LOR (swap_pager.c:1134 vm_kern.c:328).
Hello.

Has this been reported already?

 1st 0xc0c1ede0 vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1134
 2nd 0xc0c2f110 system map (system map) @ /usr/src/sys/vm/vm_kern.c:328
Stack backtrace:
backtrace(c05cce28,c0c2f110,c05d72b0,c05d72b0,c05d7147) at backtrace+0x17
witness_lock(c0c2f110,8,c05d7147,148,c0c2f0b0) at witness_lock+0x686
_mtx_lock_flags(c0c2f110,0,c05d7147,148,101) at _mtx_lock_flags+0xbb
_vm_map_lock(c0c2f0b0,c05d7147,148,c061a748,c061a770) at _vm_map_lock+0x36
kmem_malloc(c0c2f0b0,1000,101,c46b78bc,c056aead) at kmem_malloc+0x3a
page_alloc(c0c3a3c0,1000,c46b78af,101,0) at page_alloc+0x27
slab_zalloc(c0c3a3c0,1,c0c3a3d4,8,c05d8ac1) at slab_zalloc+0xb3
uma_zone_slab(c0c3a3c0,1,c05d8ac1,68c,0) at uma_zone_slab+0xda
uma_zalloc_internal(c0c3a3c0,0,1,0,c0c206b0) at uma_zalloc_internal+0x3e
bucket_alloc(80,1,c05d8ac1,70b,0) at bucket_alloc+0x5e
uma_zfree_arg(c0c20600,c472ebdc,0,7b6,8000) at uma_zfree_arg+0x299
swp_pager_meta_ctl(c0c1ede0,1f,0,2,c46b7a9c) at swp_pager_meta_ctl+0x10d
swap_pager_unswapped(c0cbfb28,1,c05c7357,bd,c46b7a14) at swap_pager_unswapped+0x2a
vm_fault(c0d415e8,bfbff000,2,8,c154d390) at vm_fault+0x1186
trap_pfault(c46b7b0c,0,bfbffcc8,c063cee0,bfbffcc8) at trap_pfault+0x119
trap(18,10,10,bfbffcc8,c46b7bac) at trap+0x2f7
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0xc059d82c, esp = 0xc46b7b4c, ebp = 0xc46b7cb8 ---
slow_copyout(c154d390,5,bfbffcc8,bfbffc48,0) at slow_copyout+0x4
select(c154d390,c46b7d14,c05dd181,3ed,5) at select+0x67
syscall(2f,2f,2f,bfbffcc8,1) at syscall+0x28f
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (93), eip = 0x280bbad3, esp = 0xbfbffbfc, ebp = 0xbfbffda0 ---

--
Pawel Jakub Dawidek
AES is broken.
Hello.

After the recent changes to AES, GBDE is broken.

How to repeat:

# mdconfig -a -t malloc -s 16M
# gbde init /dev/md0 -L /etc/md0.lock
# gbde attach md0 -l /etc/md0.lock
# newfs -O2 /dev/md0.bde || echo BROKEN

--
Pawel Jakub Dawidek
Re: GEOM_BDE
On Tue, Oct 14, 2003 at 09:49:03PM +0200, Jacek Serwatynski wrote:
+ I have problem with compiling my kernel. I wanted to play with gbde so i
+ added options GEOM_BDE.I have been doing cvsup at Tue Oct 14 20:43:17 2003 CEST
+ My config kernel:
[...]

You have to add 'device random' to your kernel config.

--
Pawel Jakub Dawidek
Re: GEOM_BDE
On Wed, Oct 15, 2003 at 09:56:57AM +0200, Poul-Henning Kamp wrote:
+ I have problem with compiling my kernel. I wanted to play with gbde so i
+ added options GEOM_BDE.I have been doing cvsup at Tue Oct 14 20:43:17 2003 CEST
+ My config kernel:
+
+ /usr/src/sys/geom/bde/g_bde.h:180: undefined reference to `rijndael_cipherInit'
+ /usr/src/sys/geom/bde/g_bde.h:207: undefined reference to `rijndael_blockDecrypt'
+
+ I had same problem until I added device random to kernel config file.
+
+ Yes, the recent commits to the rijndael code must have messed up something

No, this has always been a problem. There has never been a way to use
BDE when 'device random' isn't compiled into the kernel but is loaded as
a kernel module.

--
Pawel Jakub Dawidek
LOR (route.c:182 route.c:133).
Hello.

Already reported?

lock order reversal
 1st 0xc47b6490 rtentry (rtentry) @ /usr/src/sys/net/route.c:182
 2nd 0xc44be77c radix node head (radix node head) @ /usr/src/sys/net/route.c:133
Stack backtrace:
backtrace(c05b43a7,c44be77c,c05b934c,c05b934c,c05b93a2) at backtrace+0x17
witness_lock(c44be77c,8,c05b93a2,85,c4358540) at witness_lock+0x686
_mtx_lock_flags(c44be77c,0,c05b93a2,85,246) at _mtx_lock_flags+0xb4
rtalloc1(c05dcadc,1,1,3d7,d762bb44) at rtalloc1+0x74
rt_setgate(c47b6400,c4358540,c05dcadc,c0600868,c4425000) at rt_setgate+0x264
rtredirect(c05dcacc,c05dcadc,0,6,c05dcaec) at rtredirect+0x1ad
icmp_input(c192a000,14,c04b3a4a,c0600610,c0600868) at icmp_input+0x500
ip_input(c192a000,0,c05b91a8,89,0) at ip_input+0x922
netisr_processqueue(c0624a90,0,c05b91a8,e5,c190df00) at netisr_processqueue+0x8a
swi_net(0,0,c05af115,215,c1916974) at swi_net+0x90
ithread_loop(c1914d80,d762bd48,c05aef87,314,c1914d80) at ithread_loop+0x177
fork_exit(c047ce45,c1914d80,d762bd48) at fork_exit+0xc2
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xd762bd7c, ebp = 0 ---

--
Pawel Jakub Dawidek
LOR (tcp_input.c:654 tcp_usrreq.c:621).
Hello.

I'm not sure if this was reported already.

lock order reversal
 1st 0xc51046ec inp (inp) @ /usr/src/sys/netinet/tcp_input.c:654
 2nd 0xc0642cac tcp (tcp) @ /usr/src/sys/netinet/tcp_usrreq.c:621
Stack backtrace:
backtrace(c05d0e2c,c0642cac,c05d63bc,c05d63bc,c05d76ab) at backtrace+0x17
witness_lock(c0642cac,8,c05d76ab,26d,74) at witness_lock+0x671
_mtx_lock_flags(c0642cac,0,c05d76ab,26d,74) at _mtx_lock_flags+0xba
tcp_usr_rcvd(c5574400,80,c05d1437,db70ca84,3b9aca00) at tcp_usr_rcvd+0x30
soreceive(c5574400,db70cac0,db70cacc,db70cac4,0) at soreceive+0x7ff
nfsrv_rcv(c5574400,c7a79480,4,c5105de8,18) at nfsrv_rcv+0x87
sowakeup(c5574400,c557444c,c05d6dc0,446,108) at sowakeup+0x89
tcp_input(c1bfc800,14,c06428d4,c05f066c,db70cc48) at tcp_input+0xed1
ip_input(c1bfc800,0,c05d5bfc,89,0) at ip_input+0x81f
netisr_processqueue(c0641350,0,c05d5bfc,e5,c1bc8100) at netisr_processqueue+0x8e
swi_net(0,0,c05cb58d,215,c1bda974) at swi_net+0x8c
ithread_loop(c1bd8d80,db70cd48,c05cb3ff,314,c1bd8d80) at ithread_loop+0x172
fork_exit(c047d9e0,c1bd8d80,db70cd48) at fork_exit+0xc0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xdb70cd7c, ebp = 0 ---

--
Pawel Jakub Dawidek
GEOM Gate.
Hello hackers...

OK, GEOM Gate is ready for testing.

Those who don't know what it is can read the README:

	http://garage.freebsd.pl/geom_gate.README

and the presentation from the WIP/BSDCon03 session:

	http://garage.freebsd.pl/GEOM_Gate.pdf

After compilation (cd geom_gate; make; make install) you should run the
regression tests:

	# regression/runtests.sh

If everything goes OK, you can play with GEOM Gate and report any bugs.

I've spent some time making GEOM Gate force-remove-safe, so using the
'-f' option with ggc(8) should always be safe.

Ah! Four manual pages have been added, so feel free to read them first
(gg(4), geom_gate(4), ggc(8), ggd(8)).

	http://garage.freebsd.pl/geom_gate.tbz

Enjoy!

--
Pawel Jakub Dawidek
Re: need some debugging help
On Fri, Aug 29, 2003 at 10:03:57PM -0600, Kenneth D. Merry wrote:
+ I've been working on a set of patches to remove the sysctl variable creation
+ from interrupt context in the cd(4) and da(4) drivers.
+
+ To fix the problem, I've created a new taskqueue that runs in a thread
+ context, instead of inside a software interrupt like the current task
+ queues. (The eventual fix will involve moving the CAM probe inside a
+ thread; this will provide a more temporary solution that will hopefully
+ also work on -stable, until we can change the CAM probe code.)
+
+ I think I have everything setup correctly, but I keep getting panics inside
+ the GEOM code with these patches. (Memory modified after free.) I don't
+ know whether I've just exposed some race condition, or whether I've done
+ something wrong.
+
+ I've seen several different panics, all with the same root cause (memory
+ modified after free), and with two different previous memory pools -- geom
+ and devbuf.

I was getting the same panics while I was working on GEOM Gate.
After many hours of debugging I tracked it down: I had initialized
a mutex, but I hadn't destroyed it.

I suspect you're loading cd(4) as a kld module?

It seems that you're making exactly the same mistake:

	mtx_init(&kthread_mutex, "taskqueue kthread", NULL, MTX_DEF);

And where is the mtx_destroy()?

--
Pawel Jakub Dawidek
Re: need some debugging help
On Mon, Sep 01, 2003 at 02:13:45AM +0200, Pawel Jakub Dawidek wrote:
+ I was getting same panics while I was working on GEOM Gate.
+ After many hours of debugging I've tracked this down - I've initialized
+ a mutex, but I haven't destroy it.
+
+ As I susspect you're loading cd(4) as kld module?

No, you don't even need to load it as a kld module, because you
initialize this mutex on every function call (and the mutex is locally
allocated, too). So try putting mtx_destroy() at the end of this
function; that should help.
(I hope there is no problem with calling msleep(9) with a mutex that
lives on the stack.)

--
Pawel Jakub Dawidek
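The init/destroy pairing suggested above can be sketched in userspace.
This is only an analogy, assuming POSIX threads in place of the kernel
mtx_init(9)/mtx_destroy(9) calls; the function name
run_with_local_mutex() is made up for illustration and is not taken from
the patches under discussion.

```c
#include <pthread.h>

/*
 * Userspace analogy (pthreads, not the kernel mutex API): a mutex that
 * lives on the stack is initialized on every call, so it must also be
 * destroyed on every call, before its storage goes away.
 */
static int
run_with_local_mutex(int *counter)
{
	pthread_mutex_t mtx;
	int error;

	error = pthread_mutex_init(&mtx, NULL);
	if (error != 0)
		return (error);

	pthread_mutex_lock(&mtx);
	(*counter)++;			/* the protected work */
	pthread_mutex_unlock(&mtx);

	/* Pair every init with a destroy at the end of the function. */
	return (pthread_mutex_destroy(&mtx));
}
```

Calling run_with_local_mutex() repeatedly leaves no initialized mutex
behind, which is the property the missing mtx_destroy() call would
restore in the kernel case.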
Re: need some debugging help
On Mon, Sep 01, 2003 at 12:48:41AM -0600, Kenneth D. Merry wrote:
+ - I tried just holding a mutex all the time, but obviously you can't
+ malloc while holding a mutex (except Giant), and the sysctl code does a
+ number of mallocs. (The original cause of this problem -- M_WAITOK
+ mallocs.)

Some time ago I proposed changing M_WAITOK to M_NOWAIT. The
functions/macros responsible for sysctl creation can already fail for
other reasons, so I don't see any reason why they couldn't also fail
because of insufficient memory.
The caller is obliged to check the return value anyway...

--
Pawel Jakub Dawidek
Re: Lot's of SIGILL, SIGSEGV
On Sun, Aug 17, 2003 at 08:00:54PM -0700, David O'Brien wrote:
+ This is a FAQ. In the future, please search the archives before posting.
+
+ At this moment in time, 'p4' isn't a safe CPUTYPE (It produces broken
+ code). 'p3' or 'i686' are what's recommended for Pentium 4s.
+
+ Andre, I think you are out of date -- CPUTYPE=p4 is now safe with GCC
+ 3.3.1.

I think he is right. When you upgrade a host that has gcc 3.2 to the
current -CURRENT (with gcc 3.3), 'make world' builds make(1) first, and
it is built by gcc 3.2 with CPUTYPE=p4, so it will be broken.
So gcc has to be upgraded first (with CPUTYPE=p3).

--
Pawel Jakub Dawidek
Re: fuword(), suword(), etc.
On Wed, Jul 23, 2003 at 02:48:41PM -0700, Julian Elischer wrote:
+ I'd like to have a suptr and fuptr to be able to save and read
+ user pointers in a machine independent manner..
+ at the moment ia need to know the size of a pointer and select the
+ appropriate 32 or 64 version.. It would jus tbe another ENTRY files in
+ support.[sS] alongside teh appropriate sized entry
+ for each architecture so it wouldn't 'cost' anything..
+
+ for i386 it would be an alternate name for fuword32() and suword32()
+ I'm not sure what it would be on other architectures
+
+ comments?

Yes, good idea. For now I'm using something like this:

	static __inline void *
	fuptr(void *uaddr)
	{
		void *ptr;

		if (copyin(uaddr, &ptr, sizeof(void *)) != 0)
			return ((void *)-1);
		return (ptr);
	}

For numbers it is always better to use copyin(9)/copyout(9).
Functions like fubyte(9), etc. make no sense to me: -1 is returned both
on error and when the stored value really is -1, so the caller cannot
tell whether there was an error or not.

--
Pawel Jakub Dawidek
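The -1 ambiguity described above can be shown with a small userspace
sketch. Everything here is hypothetical: fake_copyin() merely stands in
for copyin(9) (it fails on a NULL source), and fetch_sentinel() and
fetch_checked() are illustrative names, not kernel interfaces.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for copyin(9): fails on a NULL source. */
static int
fake_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (uaddr == NULL)
		return (EFAULT);
	memcpy(kaddr, uaddr, len);
	return (0);
}

/* Sentinel style: -1 means either "error" or "the stored value was -1". */
static long
fetch_sentinel(const void *uaddr)
{
	long val;

	if (fake_copyin(uaddr, &val, sizeof(val)) != 0)
		return (-1);
	return (val);
}

/* copyin style: the status and the fetched value travel separately. */
static int
fetch_checked(const void *uaddr, long *valp)
{
	return (fake_copyin(uaddr, valp, sizeof(*valp)));
}
```

With a stored value of -1, fetch_sentinel() returns the same result for
a successful fetch and for a failed one, while fetch_checked() returns 0
in the first case and EFAULT in the second.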
File system deadlock. GBDE(4) and/or MD(4) related.
Hello.

I've found a deadlock in gbde(4) and/or md(4).
Here is the complete procedure to reproduce it:

# touch /mnt/test.file
# mdconfig -a -t vnode -f /mnt/test.file -s 512M -u 1
# mkdir /etc/gbde
# gbde init /dev/md1 -L /etc/gbde/md1
Enter new passphrase:
Reenter new passphrase:
Wrote key 0 at 25444352
# gbde attach md1 -l /etc/gbde/md1
Enter passphrase:
# newfs -U -O2 /dev/md1.bde
/dev/md1.bde: 496.5MB (1016768 sectors) block size 16384, fragment size 2048
	using 4 cylinder groups of 124.12MB, 7944 blks, 15936 inodes.
	with soft updates
super-block backups (for fsck -b #) at:
 160, 254368, 508576, 762784
# mkdir /mnt/test
# mount /dev/md1.bde /mnt/test
# cp -R /usr/src /mnt/test
[ wait about 10 seconds ]
# ls /mnt/te[TAB]
or
# sync;sync

--
Pawel Jakub Dawidek
Re: File system deadlock. GBDE(4) and/or MD(4) related.
On Thu, Jul 24, 2003 at 03:57:07PM +0200, Pawel Jakub Dawidek wrote:
+ I've found deadlock in gbde(4) and/or md(4).

Yes, it is gbde's fault:

db> show lockedvnods
[...]
0xc3332920: tag ufs, type VREG, usecount 1, writecount 0, refcount 21, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc2ca5000
	ino 8214, on dev md0.bde (4, 23)
0xc33325b4: tag ufs, type VREG, usecount 2, writecount 1, refcount 25, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc2ec3980
	ino 8217, on dev md0.bde (4, 23)
[...]

Look at the refcounts.

--
Pawel Jakub Dawidek
Re: File system deadlock. GBDE(4) and/or MD(4) related.
On Thu, Jul 24, 2003 at 05:03:23PM +0200, Poul-Henning Kamp wrote:
+ # touch /mnt/test.file
+
+ You are probably missing:
+
+ dd if=/dev/null of=/mnt/test.file bs=1m count=512

You mean /dev/zero? But this doesn't change anything.

+ # mdconfig -a -t vnode -f /mnt/test.file -s 512M -u 1
+
+ What you have found has nothing to do with GBDE, I think it is the
+ usual vnode backed md(4) deadlock.

Hmm? So you're saying that this is somehow normal behaviour?

--
Pawel Jakub Dawidek
Re: File system deadlock. GBDE(4) and/or MD(4) related.
On Thu, Jul 24, 2003 at 08:38:02PM +0200, Poul-Henning Kamp wrote:
+ + What you have found has nothing to do with GBDE, I think it is the
+ + usual vnode backed md(4) deadlock.
+
+ Hmm? So you're trying to tell that this is somehow normal behaviour?
+
+ We've had problems like this before with vnode backed MD(4) devices
+ (and vn(4) devices before that).
+
+ One way or another: It is _not_ a GBDE problem.

Hey, Poul!
I'm not trying to show that gbde(4) is buggy software, and I'm not
trying to destroy your work, your image or FreeBSD, really.
I believe that this isn't a bug in gbde(4); my fault, sorry.
But one thing I do know is that there is a bug somewhere, and I just
want to help track it down.

This information could be useful:
When I mounted the file system on /private (not on /mnt/private) the
problem went away. So maybe the deadlock is caused by some directory
locking or something similar? When the file system is mounted on
/mnt/private, the deadlock is 100% reproducible.

--
Pawel Jakub Dawidek
Re: possible unionfs bug
On Sun, Jul 20, 2003 at 03:02:21PM +0200, Divacky Roman wrote:
+ I might be wrong but this:
+
+ free(mp->mnt_data, M_UNIONFSMNT); /* XXX */
+ mp->mnt_data = 0;
+
+ seems to me wrong and might cause crashes etc.
+ am I correct or wrong?

Could you describe a scenario in which this could be dangerous? Or why
do you think it is?

This memory is allocated when the unionfs file system is mounted, so it
is quite natural to free it when the file system is unmounted.

--
Pawel Jakub Dawidek
Re: Screen saver: bsd_saver.
On Fri, Jun 27, 2003 at 09:20:03AM +, Bosko Milekic wrote:
+ Hello there.
+
+ I've wrote screen saver for FreeBSD 5.x with rotating bsd logo.
+
+ http://garage.freebsd.pl/bsd_saver.tbz
+
+ Any chance to add it to tree?
+
+ I don't know whether it works or not, but this contains
+ floating point instruction, which is hardly used and needs cafeful
+ treatment. (As far as I know, FP instruction is used only on
+ i586_bcopy) What do you think about it?
+
+ FWIW, I've tested this yesterday and wanted to commit it but
+ shamefully I must admit that I don't know how to properly prepare a
+ port. The screen saver works and is pretty neat although I had to
+ build in low video mode.

Andrew Kenneth Milton [EMAIL PROTECTED] suggested that I automatically
fall back to low video mode when the high mode cannot be turned on, and
automatically load vesa.ko if required.
I think those suggestions are good and I'll implement them.

--
Pawel Jakub Dawidek
Re: Screen saver: bsd_saver.
On Fri, Jun 27, 2003 at 03:35:19PM +0200, Pawel Jakub Dawidek wrote:
+ + FWIW, I've tested this yesterday and wanted to commit it but
+ + shamefully I must admit that I don't know how to properly prepare a
+ + port. The screen saver works and is pretty neat although I had to
+ + build in low video mode.
+
+ Andrew Kenneth Milton [EMAIL PROTECTED] suggest me to automatically
+ turn on low video mode if there is no chance to turn on high and to
+ automaticly load vesa.ko if required.
+ I think, that those suggestion are good and I'll implement them.

OK. Done.

	http://garage.freebsd.pl/bsd_saver.tbz

I'm not able to add a dependency on the vesa module without this patch:

	http://garage.freebsd.pl/vesa.patch

So for now it will try to run in a 1024x768 mode, then 800x600, and
finally 320x200. If vesa is loaded it should run at 1024x768, and if not,
at 320x200.

--
Pawel Jakub Dawidek
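The fallback order described above (1024x768, then 800x600, then
320x200) can be sketched as a simple first-match loop. This is a minimal
userspace illustration, not code from bsd_saver: the probe callbacks and
all names below are assumptions standing in for whatever the saver
actually does to switch video modes.

```c
#include <stddef.h>

struct video_mode {
	int width;
	int height;
};

/* Preferred modes, best first, in the order given in the message. */
static const struct video_mode candidates[] = {
	{ 1024, 768 },
	{ 800, 600 },
	{ 320, 200 },
};

/*
 * Return the first candidate accepted by the probe, or NULL if none is.
 * "probe" is a hypothetical callback standing in for a real mode switch.
 */
static const struct video_mode *
pick_mode(int (*probe)(const struct video_mode *))
{
	size_t i;

	for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
		if (probe(&candidates[i]))
			return (&candidates[i]);
	}
	return (NULL);
}

/* Hypothetical probe: every mode is available (e.g. vesa loaded). */
static int
probe_accept_all(const struct video_mode *m)
{
	(void)m;
	return (1);
}

/* Hypothetical probe: only the plain 320x200 mode is available. */
static int
probe_vga_only(const struct video_mode *m)
{
	return (m->width == 320 && m->height == 200);
}
```

With probe_accept_all() the loop stops at 1024x768; with
probe_vga_only() it falls through to 320x200, matching the behaviour the
message describes for the "vesa loaded" and "vesa not loaded" cases.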
Screen saver: bsd_saver.
Hello there.

I've written a screen saver for FreeBSD 5.x with a rotating BSD logo.

	http://garage.freebsd.pl/bsd_saver.tbz

Any chance of adding it to the tree?

Thanks.

--
Pawel Jakub Dawidek
3d screen saver.
Hello:)

I want to present a one-night hack: a 3D CERB logo screen saver.
It is written for FreeBSD 5.x and it is quite nice (IMHO).

You can download it from:

	http://garage.freebsd.pl/cerb_saver.tbz

Enjoy!

--
Pawel Jakub Dawidek
Re: 5.1-RELEASE panic, trace included
On Sat, Jun 14, 2003 at 02:28:33AM -0400, Robert Watson wrote:
+ If you have the kernel.debug for this kernel, could you send the gdb -k
+ output of:
+
+ l *in6_pcbbind+0x2a7

I've looked at 'objdump -d kernel', and it looks like this is somewhere
here:

214:	t = in_pcblookup_local(pcbinfo,
215:	    sin.sin_addr, lport,
216:	    INPLOOKUP_WILDCARD);
217:	if (t &&
218:	    (so->so_cred->cr_uid !=
219:	     t->inp_socket->so_cred->cr_uid) &&
220:	    (ntohl(t->inp_laddr.s_addr) !=
221:	     INADDR_ANY ||
222:	     INP_SOCKAF(so) ==
223:	     INP_SOCKAF(t->inp_socket)))
224:		return (EADDRINUSE);

We're talking about the marked line:

	test   %eax,%eax
	je     c03ac9c7 <in6_pcbbind+0x2e7>
	mov    0x64(%eax),%eax
	mov    %eax,0xffd0(%ebp)
=>	mov    0xc4(%eax),%edx
	mov    0xc4(%esi),%eax
	mov    0x4(%eax),%eax
	cmp    0x4(%edx),%eax
	je     c03ac9c7 <in6_pcbbind+0x2e7>

We're loading inp_socket->so_cred into %edx here.
So it looks like inp_socket is NULL. Hmm, is that possible?

--
Pawel Jakub Dawidek
pam_unix.c [PATCH].
Hello.

I think there is no need to open a PR for this.
The 'flags' argument is marked __unused, but it is in fact used in these
functions:

--- pam_unix.c.orig	Wed May 28 23:31:54 2003
+++ pam_unix.c	Wed May 28 23:32:40 2003
@@ -95,8 +95,7 @@
  * authentication management
  */
 PAM_EXTERN int
-pam_sm_authenticate(pam_handle_t *pamh, int flags __unused,
-    int argc, const char *argv[])
+pam_sm_authenticate(pam_handle_t *pamh, int flags, int argc, const char *argv[])
 {
 	login_cap_t *lc;
 	struct options options;
@@ -159,8 +158,7 @@
  * account management
  */
 PAM_EXTERN int
-pam_sm_acct_mgmt(pam_handle_t *pamh, int flags __unused,
-    int argc, const char *argv[])
+pam_sm_acct_mgmt(pam_handle_t *pamh, int flags, int argc, const char *argv[])
 {
 	struct addrinfo hints, *res;
 	struct options options;

--
Pawel Jakub Dawidek
Re: 5-STABLE Roadmap
On Thu, Feb 13, 2003 at 08:28:43PM -0800, Sam Leffler wrote:
+ This can quickly turn into a bikeshed, but suggest ones. We're looking for
+ good benchmarks. [...]

Look at:

	http://www.web-polygraph.org

It provides tests for www-cache/proxy software.
We can test many things with it:

	- how fast we can generate workload,
	- how heavy a load we can handle,
	- how fast squid runs on FreeBSD,
	- how fast squid runs when rewritten with libkse,
	- etc.

It is also a good stability test.
This is really good, free software; I use it on 4.x.

--
Pawel Jakub Dawidek
UNIX Systems Administrator
http://garage.freebsd.pl
Am I Evil? Yes, I Am.
Re: 5-STABLE Roadmap
On Sun, Feb 16, 2003 at 02:08:35PM -0700, Scott Long wrote:
+ Pawel Jakub Dawidek wrote:
+ + On Thu, Feb 13, 2003 at 08:28:43PM -0800, Sam Leffler wrote:
+ + This can quickly turn into a bikeshed, but suggest ones. We're
+ looking for
+ + good benchmarks. [...]
+ + Look at:
+ + http://www.web-polygraph.org
+ + It provides tests for www-cache/proxy stuff.
+ We can test many things with it:
+ + - how fast could we generate workload,
+ - how heavy load could we handle,
+ - how fast is squid running on FreeBSD,
+ - how fast is squid rewritten with libkse,
+ - etc.
+ + And this is good stablility test.
+ This is real good and free stuff, I use it on 4.x.
+
+ Thanks for the pointer, this looks very interesting. How hard
+ is it to set up? [...]

Setting it up is quite simple, but it doesn't compile with gcc 3.x...
The authors recommend using it with FreeBSD 4.x, so it is well tested on
our favorite system:)

+ [...] DO you have any test configuations and/or
+ scripts that we could adapt?

Yes. Kernel patches for tuning are available on the website, but for new
releases of 4.x they aren't necessary; everything can be configured with
kernel options and sysctls (for 4.8):

	options MAXFILES=16384
	options HZ=1000
	options NMBCLUSTERS=32678

	kern.ipc.somaxconn=1024
	net.inet.ip.portrange.last=4
	net.inet.tcp.delayed_ack=0
	net.inet.tcp.msl=3000

The rest is quite simple and well documented.
In theory the tests could be run on one machine, so...

And some nice looking results generated by web-polygraph:

Without any proxy:
	http://garage.freebsd.pl/pm3-15-11-2k2
With squid:
	http://garage.freebsd.pl/pm3-05-11-2k2
	http://garage.freebsd.pl/pm3-06-11-2k2
With external proxy:
	http://garage.freebsd.pl/pm3-29-01-2k3

PS. I'm CC-ing this thread to one of polygraph's authors; he could be
interested as well.

--
Pawel Jakub Dawidek
LOR: if_ether.c - route.c.
Hello.

We got a lock order reversal here:

 1st 0xc0384800 arp mutex (arp mutex) @ /usr/src/sys/netinet/if_ether.c:151
 2nd 0xc1886b7c radix node head (radix node head) @ /usr/src/sys/net/route.c:549

Simple backtrace:

	rtrequest1()	[route.c]
	rtrequest()	[route.c]
	arptfree()	[if_ether.c]
	arptimer()	[if_ether.c]

--
Pawel Jakub Dawidek
Re: LOR: if_ether.c - route.c.
On Mon, Feb 03, 2003 at 12:06:28PM +0100, Pawel Jakub Dawidek wrote:
+ We got lock order reversal here:
+
+ 1st 0xc0384800 arp mutex (arp mutex) @ /usr/src/sys/netinet/if_ether.c:151
+ 2nd 0xc1886b7c radix node head (radix node head) @ /usr/src/sys/net/route.c:549
+
+ Simple backtrace:
+ rtreqest1() [route.c]
+ rtreqest() [route.c]
+ arptfree() [if_ether.c]
+ arptimer() [if_ether.c]

I think that MTX_DUPOK is needed here, so:

--- radix.h.orig	Sun Feb  2 20:07:42 2003
+++ radix.h	Mon Feb  3 21:48:30 2003
@@ -159,7 +159,7 @@
 #define	RADIX_NODE_HEAD_LOCK_INIT(rnh)	\
-	mtx_init(&(rnh)->rnh_mtx, "radix node head", NULL, MTX_DEF | MTX_RECURSE)
+	mtx_init(&(rnh)->rnh_mtx, "radix node head", NULL, MTX_DEF | MTX_RECURSE |
+	    MTX_DUPOK)
 #define	RADIX_NODE_HEAD_LOCK(rnh)	mtx_lock(&(rnh)->rnh_mtx)
 #define	RADIX_NODE_HEAD_UNLOCK(rnh)	mtx_unlock(&(rnh)->rnh_mtx)
 #define	RADIX_NODE_HEAD_DESTROY(rnh)	mtx_destroy(&(rnh)->rnh_mtx)

Am I right?

The radix node head is locked the first time in arptimer() and the
second time in rtrequest1(). And (if I understand the code correctly)
the lock has to be taken in both functions, because rtrequest1() is
called not only through arptimer(), but also through other functions
that don't lock it earlier.

--
Pawel Jakub Dawidek
Lock order reversal in if_tl device.
Hello...

tl0: device timeout
lock order reversal
 1st 0xc0331020 ifnet (ifnet) @ /usr/src/sys/net/if.c:1181
 2nd 0xc2576ab8 tl0 (network driver) @ /usr/src/sys/pci/if_tl.c:2067

The driver is loaded as a kld module from /boot/loader.conf.
To get the interface up I need to unload the module and load it again.

Here are my dmesg.boot and kernel config:

	http://prioris.mini.pw.edu.pl/~nick/dmesg.boot
	http://prioris.mini.pw.edu.pl/~nick/SLAYER

--
Pawel Jakub Dawidek
Re: -CURRENT panic on SMP Athlon box.
On Sat, Dec 21, 2002 at 08:51:27AM +0100, Poul-Henning Kamp wrote:
+ My SMP Athlon box paniced again tonight, and this time my serial
+ console caught it in the act.
+
+ I have no idea what has caused this, and have no idea if it has any
+ significance for 5.0-R or not. I wonder if we have a memory leak ?

Maybe a good way to debug this would be to print memory statistics, just
like 'sysctl kern.malloc' does, before this panic (or any panic caused
by insufficient memory) is raised?

--
Pawel Jakub Dawidek
Panic in jail [patch].
Hello.

The mutex initialized for the prison isn't destroyed on error.
The kernel will panic on every error.

Here is a patch for this:

--- kern_jail.c.orig	Fri Dec 20 15:11:10 2002
+++ kern_jail.c	Fri Dec 20 15:14:03 2002
@@ -103,6 +103,7 @@
 	PROC_UNLOCK(p);
 	crfree(newcred);
 bail:
+	mtx_destroy(&pr->pr_mtx);
 	FREE(pr, M_PRISON);
 	return (error);
 }
---

BTW, maybe it is time to implement jail with more features?
Multiple IPs, protection of statfs-like calls, or even multi-level jails?
By a multi-level jail I mean a jail created inside a jail, and so on.

--
Pawel Jakub Dawidek
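The shape of that error path can be sketched in userspace. This is only
an analogy under stated assumptions: it uses POSIX threads and malloc(3)
in place of the kernel mutex and FREE() calls, and struct prison_like,
prison_like_create() and prison_like_destroy() are made-up names, not
code from kern_jail.c.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

struct prison_like {
	pthread_mutex_t	lock;
	int		id;
};

/*
 * Hypothetical analogue of the error path: once the mutex has been
 * initialized, every exit that frees the structure destroys it first.
 */
static int
prison_like_create(int id, struct prison_like **out)
{
	struct prison_like *pr;
	int error;

	pr = malloc(sizeof(*pr));
	if (pr == NULL)
		return (ENOMEM);
	error = pthread_mutex_init(&pr->lock, NULL);
	if (error != 0) {
		free(pr);	/* nothing to destroy yet */
		return (error);
	}
	if (id < 0) {		/* stand-in for a later setup failure */
		error = EINVAL;
		goto bail;
	}
	pr->id = id;
	*out = pr;
	return (0);
bail:
	pthread_mutex_destroy(&pr->lock);
	free(pr);
	return (error);
}

/* Normal teardown mirrors the error path: destroy, then free. */
static void
prison_like_destroy(struct prison_like *pr)
{
	pthread_mutex_destroy(&pr->lock);
	free(pr);
}
```

Keeping the destroy call under the shared bail label means any failure
added later, after the mutex is initialized, gets the cleanup without
further changes.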