On Fri, Oct 18, 2013 at 10:32 AM, Brian Behlendorf <[email protected]>wrote:
> On 10/14/13 13:22, Matthew Ahrens wrote: > >> I'm mostly going on the commit messages for the moment. Comments on >> each commit below: >> > > We should decide on some procedure for getting these changes merged back > to Illumos. Shall we start by opening an issue for each patch on the > Illumos tracker and link to the proposed patch for further discussion. > I would recommend that we first select the set of these that we agree are ready to integrate. By starting with the least contentious issues we will hopefully not get bogged down. Once that's done then we can go back for the 2nd round. I've marked the issues the I propose we do first below. For each of these commits, we should do the following: 1. Open an issue on the illumos tracker. 2. Apply the patch to illumos 3. Generate a webrev and post a code review request to [email protected] 4. Compile & test on illumos (this could be done on a combined wad with several fixes) I or someone at Delphix can help you with steps 3 and (especially) 4 if you'd like. You could start by posting the patches against illumos on github. If you'd like help with steps 1 and 2, you might post a summary of just the commits that we agree are ready, and solicit help on [email protected] [email protected]. --matt > > commit 86dd0fd9222b6103c6533036c47b90**8ece944460 >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Sun Aug 19 17:17:02 2012 -0700 >> >> Pre-allocate vdev I/O buffers >> >> >> How do you know how many buffers to pre-allocate? I see the code using >> zfs_vdev_max_pending, but that can be changed dynamically. There's also >> code to fall back on kmem_cache_alloc() if the list is empty, which >> would seem to negate any deadlock fix. >> > > This wasn't done to avoid a classical deadlock. It was done to mitigate > lock contention in the Linux VM code. By pre-allocating a small number of > buffers we can avoid the allocation here for the common case. But in the common case, the allocator doesn't need to get more pages either. Perhaps we should create a kmem cache specific to this use case, so that if other callers exhaust the generic zio_buf cache it doesn't affect this code path? Also, line vdev_queue.c line 319 looks like you are assuming size < pagesize, since vi is a pagesize buffer from zio_vdev_alloc(). Seems like it needs to be zfs_vdev_aggregation_limit (plus assertions to make sure that someone didn't change zfs_vdev_aggregation_limit at runtime). > In the unlikely event all pre-allocated buffers are in use it falls back > to allocating a new one. > > This issue was observed in ZoL when using a zvol for the systems swap > device. Under low memory conditions the system would attempt to swap to > the zvol and these allocations would cause additional memory pressure. The > system would then thrash making very little headway but never actually > deadlocking > > For Linux filesystems in general significant care is usually taken to > minimize the number and size of allocations in the write path. The number > of allocations ZFS performs in its write path has in general been a problem > for us under Linux. In fact, even doing vmalloc() in the write path may > deadlock. This is one of the major reasons we're so keen on moving to > scatter-gather lists. > > > Are you sure that the problem can not be fixed by using KM_PUSHPAGE for >> these allocations? >> > > This is also needed but not sufficient by itself. > > >> commit c93504f03a0881992689069a8f78e1**7933dcd5b3 >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Wed Jul 24 09:57:56 2013 -0700 >> >> Change l2arc_norw default to zero >> >> fine by me >> > Let's leave this out of the first set of changes. It's been discussed on the illumos mailing list (see subject "UPDATED: Review request #3995 and #3997") with some dissenters. Let them sort it out. >> commit cb543e6b5e98546a5caec29ca4b25a**bec98560a2 >> Author: Richard Yao <[email protected] <mailto:[email protected]>> >> >> Date: Tue Jul 9 10:45:30 2013 -0400 >> >> Remove b_thawed from arc_buf_hdr_t >> >> >> This variable is used on illumos to track when a buffer was thawed. >> When kmem_flags=0xf, the allocator records the stack trace when the >> allocation (or free) happened. This could be made more clear with the >> use of appropriately-named wrappers. >> > > So this is mdb specific debugging then. Adding wrappers with proper > comments for this would helpful for the readability of the code. It would > also allow us to easily compile it out under Linux. > Agreed. > > >> commit 3f4058cd15545a83b4e6e50cd7e29a**f45b54054a >> Author: Richard Yao <[email protected] <mailto:[email protected]>> >> >> Date: Sun Jul 7 13:39:59 2013 -0400 >> >> Remove arc_data_buf_alloc()/arc_data_**buf_free() >> >> looks good to me. >> > Include in first set of changes. > >> commit 92334b14ec378b1693573b52c09816**bbade9cf3e >> Author: Prakash Surya <[email protected] <mailto:[email protected]>> >> >> Date: Wed Jul 10 15:53:48 2013 -0700 >> >> Add new kstat for monitoring time in dmu_tx_assign >> >> Neat idea. I would suggest separating out the histogram-in-a-kstat code >> from this particular use of it. >> > > Good idea. We'll probably do this if/when there's a second consumer. > > >> commit 50210587563bb37c48d2624d11e158**ab3ad30715 >> Author: Tim Chase <[email protected] <mailto:[email protected]>> >> >> Date: Tue Jul 9 07:15:26 2013 -0500 >> >> zdb: enhancement - Display SA xattrs. >> >> SA xattrs are a linux-only on-disk format extension. How is the on-disk >> compatibility of these handled? What happens if I move a pool from >> linux to another OS that doesn't know about this representation? >> > > This is something I'd like to talk about at OpenZFS Developer Summit. This > new feature might be the first use case for feature flags at the dataset > level. Unfortunately, right now we only have pool level feature flags. We > should hash out a plan to handle this type of feature going forward. > Sounds great. > > If you use this feature on Linux you will still be able to import your > pool on other platforms and mount the filesystems. However, xattrs which > have been stored as SAs will not be visible. > > >> commit b4f7f105275d996fbcb6abd6576030**7d2153a89b >> Author: Ying Zhu <[email protected] <mailto:[email protected]* >> *>> >> >> Date: Sat Jun 29 15:03:49 2013 +0800 >> >> Improve code in arc_buf_remove_ref >> >> >> This code does not do what the commit comment says it does. HDR_LOCK() >> does not acquire a lock. >> > > Agreed, it's a bad comment. > > >> commit 64d7b6cf75e52a4698d9bdec617455**73c39d2e5a >> Author: Cyril Plisko <[email protected] <mailto: >> cyril.plisko@mountall.**com <[email protected]>>> >> >> Date: Mon Jun 24 09:45:20 2013 +0300 >> >> Override default SPA config location via environment >> >> looks good to me. >> > Include in first set of changes. > >> commit e2e229eb180fce8a67ba62c0e404d4**7e82c4c24d >> Author: Steven Burgess <[email protected] <mailto: >> sburgess@dattobackup.**com <[email protected]>>> >> >> Date: Thu Jun 27 20:35:11 2013 -0400 >> >> Formating changes for zpool manpage >> >> looks good to me >> > Include in first set of changes. > >> Fix incorrect assertions in ddt_phys_decref and ddt_sync_entry >> >> I didn't realize that hole blocks showed up in the dedup table. Might >> want to double check that. But the change sounds good anyway. >> >> commit 937210a54b9c2d3dddc7221e31d569**5e9720a055 >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Wed May 1 09:38:49 2013 -0700 >> >> Fix zinject list handlers >> >> unclear if this is applicable to illumos, according to commit message. >> > > This actually brings us back in sync with Illumos, so there's no need to > pull it in. > > >> commit 5dc6af0eec29b119b731c793037fd7**7214fc9438 >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Tue Mar 19 12:05:08 2013 -0700 >> >> Add zio_ddt_free()+ddt_phys_**decref() error handling >> >> >> I don't think we should drive on blindly when on-disk corruption is >> detected. Use zfs_panic_recover() for this. >> > > Good thought, minimally we should be logging the event. > > >> commit 2016ff96d1739b5ced1d37e7266720**e7531b8212 >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Fri Feb 22 11:28:28 2013 -0800 >> >> Fix zdb.8 macro warning >> >> >> looks good. >> > Include in first set of changes. > >> commit 0b4d1b5853791e1e447d74f0b22980**0d65b53071 >> Author: Eric Dillmann <[email protected] <mailto:[email protected]>> >> >> Date: Thu Feb 14 00:11:59 2013 +0100 >> >> Add snapdev=[hidden|visible] dataset property >> >> >> Idea looks good, but to preserve existing semantics, the default should >> be "visible" (may not be an issue on Linux if the default there was >> always "hidden"). >> > > The default on Linux needs to be 'hidden'. If it's not and the system > contains thousands of zvol snapshots. Then substanial IO will occur as the > kernel reads each parition table and udev automatically probes every > partition for a filesystem. When automatic snapshotting is enabled it's > easy to accidentally end up in this situation. > Include in first set of changes (with default of "visible" on illumos). > > >> commit b01615d5ac86913da1e092d0378bfb**8f0e72af30 >> Author: Richard Yao <[email protected] <mailto: >> [email protected]**>> >> >> Date: Thu Feb 14 23:37:43 2013 -0500 >> >> Constify structures containing function pointers >> >> >> seems fine. >> > Include in first set of changes. > >> commit dd26aa535b395735ca61ea2a3e618a**ded45eb05e >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Mon Feb 4 16:35:54 2013 -0800 >> >> Cast 'zfs bad bloc' to ULL for x86 >> >> good fix. >> > Include in first set of changes. > >> commit f52b31eaf0301feeb308fbc46f696e**b44d0ae523 >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Thu Jan 31 11:02:21 2013 -0800 >> >> Honor 80 character limit in 'zpool status' >> >> looks good. >> > Include in first set of changes. > >> commit 14ee71efbc28086406bb255f2292b9**535d845625 >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Mon Jan 28 09:53:51 2013 -0800 >> >> Use strerror() not strerror_r() >> >> so libzfs can't be used from a multi-threaded program? That doesn't >> seem like a great idea. >> > > I agree this isn't good. But strerror() is already used extensively in > libzfs and this is no worse than the other call sites. In this regard > libzfs has never been totally thread safe. If someone wants to portably > fix all the existing callers that would be great. Until then changing this > so at least everything is consistent doesn't seem so bad. > Include in first set of changes. > > >> commit 38145d612963d0a5b80017d5d1d49c**1d4f9637c2 >> Author: Darik Horn <[email protected] <mailto:[email protected]>> >> >> Date: Mon Jan 14 19:27:39 2013 -0600 >> >> Ensure that zfs diff prints unicode safely. >> >> ok. >> > Include in first set of changes. > >> commit 15313c5e1866e81e2f4a30d2c50b43**b5435e547a >> Author: Dominik Honnef <[email protected] <mailto: >> [email protected]**>> >> >> Date: Fri Jan 4 20:09:20 2013 +0100 >> >> Fix duplicate words in zpool.8 >> >> >> looks good. >> > Include in first set of changes. > >> commit c3275b56a1470ed255441df6ff105d**0c3c095d8b >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Fri Nov 30 11:23:38 2012 -0800 >> >> Add load_nvlist() error handling >> >> looks good >> > Include in first set of changes. > >> commit 30315d237bb23226476b348bc59158**9c80597351 >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Tue Nov 27 13:32:57 2012 -0800 >> >> Increase ZFS_OBJ_MTX_SZ to 256 >> >> ok. >> > Include in first set of changes. > >> >> commit 9dcb97198338ba2d8764dd5604b278**118612f74d >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Thu Oct 25 13:02:31 2012 -0700 >> >> Log I/Os longer than zio_delay_max (30s default) >> >> seems reasonable; I'd want to get input from others who are more >> familiar with FMA. >> > > At the time this was added simply for debugging. However, I would like to > have a broader talk about FMA and what the situation is on the various > platforms. > > >> commit e95853a331529a6cb96fdf10476c53**441e59f4e1 >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Tue Oct 23 13:48:22 2012 -0700 >> >> Add txgs-<pool> kstat file >> >> >> I'm not sure that kstats are the best way to export this info. Will >> need to investigate more. >> > > They're probably not. Although it's the only half way portable interface > we have for this sort of thing. There has been some recent churn is how we > use kstats on the Linux side to maximize portability. But until that gets > merged in to the Linux tree I wouldn't worry to much about this. > > Thanks, > Brian > > >> commit e8fd45a0f975c6b8ae8cd644714fc2**1f14fac2bf >> Author: Brian Behlendorf <[email protected] <mailto: >> [email protected]>> >> >> Date: Fri Oct 26 10:01:49 2012 -0700 >> >> Add ddt_object_count() error handling >> >> ok >> > Include in first set of changes. > >> >> --matt >> >> >> >> >> On Fri, Oct 11, 2013 at 9:48 PM, Richard Yao <[email protected] >> <mailto:[email protected]>> wrote: >> >> Dear Everyone, >> >> I did a cursory review of changes to the ZFSOnLinux repository made >> during the past 12 months and identified a number of commits that >> probably should be sent to Illumos. >> >> I have attached a preliminary list. I did a really quick review, so I >> expect that there are a few false positives included and probably a >> few >> false negatives excluded. >> >> I can do a more thorough review to produce a comprehensive list, but >> the >> important thing is to work out the process to get the improvements >> made >> in ZFSOnLinux either merged back into Illumos or rejected with some >> kind >> of explanation why so that appropriate changes can be made in >> ZFSOnLinux >> to reduce code differences. >> >> Does anyone have any thoughts? >> >> Yours truly, >> Richard Yao >> >> >> To unsubscribe from this group and stop receiving emails from it, send >> an email to >> zfs-devel+unsubscribe@**zfsonlinux.org<zfs-devel%[email protected]> >> . >> >
_______________________________________________ developer mailing list [email protected] http://lists.open-zfs.org/mailman/listinfo/developer
