Re: my build time impact of clang 5.0

2017-10-02 Thread Andy Farkas

On 03/10/2017 06:18, Dan Mack wrote:


> My scripts are pretty coarse-grained, so I only have timings at the macro
> build steps so far (buildworld, buildkernel, installkernel, and
> installworld). I'm going to update them so I can get a little more
> granularity; it should be easy to get timings wrapped around the big
> sections, for example:

>  >>> World build started on Mon Oct  2 07:49:56 CDT 2017
>  >>> Rebuilding the temporary build tree
>  >>> stage 1.1: legacy release compatibility shims
>  >>> stage 1.2: bootstrap tools
>  >>> stage 2.1: cleaning up the object tree
>  >>> stage 2.2: rebuilding the object tree
>  >>> stage 2.3: build tools
>  >>> stage 3: cross tools
>  >>> stage 3.1: recording compiler metadata
>  >>> stage 4.1: building includes
>  >>> stage 4.2: building libraries
>  >>> stage 4.3: building everything
>  >>> stage 5.1: building lib32 shim libraries
>  >>> World build completed on Mon Oct  2 12:30:02 CDT 2017
>
> Dan



Perhaps you could hack src/tools/tools/whereintheworld/whereintheworld.pl
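
Even without touching the script, a wrapper might do as a first pass.
An untested sh sketch - the ">>>" marker format is assumed from Dan's
log above, and the log path is made up:

  #!/bin/sh
  # Prefix every ">>>" stage marker with the current time, so per-stage
  # durations can be read off the log afterwards.
  cd /usr/src || exit 1
  make -j"$(sysctl -n hw.ncpu)" buildworld 2>&1 | while IFS= read -r line; do
          case "$line" in
          ">>>"*) printf '[%s] %s\n' "$(date '+%H:%M:%S')" "$line" ;;
          *)      printf '%s\n' "$line" ;;
          esac
  done | tee /var/tmp/buildworld.log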

-andyf



Re: my build time impact of clang 5.0

2017-10-02 Thread Dan Mack
Mike Tancsa writes:

> On 10/2/2017 2:34 PM, Dan Mack wrote:
>>
>> Another significant change in build times this week - not complaining,
>> just sharing my observations; same server doing buildworld during the
>> various phases of compiler changes over the last year or so, FWIW:
>
> Kernel seems to be about the same since 4.x. Perhaps the added
> buildworld time is due to a larger feature set in clang 5.x, and hence
> it takes longer to build itself? E.g. more platforms supported, etc.?

My scripts are pretty coarse-grained, so I only have timings at the macro
build steps so far (buildworld, buildkernel, installkernel, and
installworld). I'm going to update them so I can get a little more
granularity; it should be easy to get timings wrapped around the big
sections, for example:

 >>> World build started on Mon Oct  2 07:49:56 CDT 2017
 >>> Rebuilding the temporary build tree
 >>> stage 1.1: legacy release compatibility shims
 >>> stage 1.2: bootstrap tools
 >>> stage 2.1: cleaning up the object tree
 >>> stage 2.2: rebuilding the object tree
 >>> stage 2.3: build tools
 >>> stage 3: cross tools
 >>> stage 3.1: recording compiler metadata
 >>> stage 4.1: building includes
 >>> stage 4.2: building libraries
 >>> stage 4.3: building everything
 >>> stage 5.1: building lib32 shim libraries
 >>> World build completed on Mon Oct  2 12:30:02 CDT 2017

Dan

>> -STABLE amd64
>> |--------------+--------------+---------------+----------+-----------|
>> | Ver (svn-id) | World (mins) | Kernel (mins) | Relative | Comment   |
>> |--------------+--------------+---------------+----------+-----------|
>> |       292733 |           90 |            16 |      0.5 |           |
>> |       299948 |           89 |            16 |      0.5 |           |
>> |       322724 |          174 |            21 |      1.0 | clang 4.x |
>> |       323310 |          175 |            21 |      1.0 | clang 4.x |
>> |       323984 |          175 |            21 |      1.0 | clang 4.x |
>> |       324130 |          285 |            21 |      1.6 | clang 5.x |
>> |       324204 |          280 |            21 |      1.6 | clang 5.x |
>> |--------------+--------------+---------------+----------+-----------|


Re: my build time impact of clang 5.0

2017-10-02 Thread Mike Tancsa
On 10/2/2017 2:34 PM, Dan Mack wrote:
> 
> Another significant change in build times this week - not complaining,
> just sharing my observations; same server doing buildworld during the
> various phases of compiler changes over the last year or so, FWIW:

Kernel seems to be about the same since 4.x. Perhaps the added
buildworld time is due to a larger feature set in clang 5.x, and hence
it takes longer to build itself? E.g. more platforms supported, etc.?


> 
> |--------------+--------------+---------------+----------+-----------|
> | Ver (svn-id) | World (mins) | Kernel (mins) | Relative | Comment   |
> |--------------+--------------+---------------+----------+-----------|
> |       292733 |           90 |            16 |      0.5 |           |
> |       299948 |           89 |            16 |      0.5 |           |
> |       322724 |          174 |            21 |      1.0 | clang 4.x |
> |       323310 |          175 |            21 |      1.0 | clang 4.x |
> |       323984 |          175 |            21 |      1.0 | clang 4.x |
> |       324130 |          285 |            21 |      1.6 | clang 5.x |
> |       324204 |          280 |            21 |      1.6 | clang 5.x |
> |--------------+--------------+---------------+----------+-----------|
> 
> Dan


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


my build time impact of clang 5.0

2017-10-02 Thread Dan Mack

Another significant change in build times this week - not complaining,
just sharing my observations; same server doing buildworld during the
various phases of compiler changes over the last year or so, FWIW:

|--------------+--------------+---------------+----------+-----------|
| Ver (svn-id) | World (mins) | Kernel (mins) | Relative | Comment   |
|--------------+--------------+---------------+----------+-----------|
|       292733 |           90 |            16 |      0.5 |           |
|       299948 |           89 |            16 |      0.5 |           |
|       322724 |          174 |            21 |      1.0 | clang 4.x |
|       323310 |          175 |            21 |      1.0 | clang 4.x |
|       323984 |          175 |            21 |      1.0 | clang 4.x |
|       324130 |          285 |            21 |      1.6 | clang 5.x |
|       324204 |          280 |            21 |      1.6 | clang 5.x |
|--------------+--------------+---------------+----------+-----------|

Dan


Re: panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-10-02 Thread Harry Schmalzbauer
Regarding Andriy Gapon's message from 02.10.2017 13:49 (localtime):
> On 01/10/2017 00:38, Harry Schmalzbauer wrote:
>> Now my striped mirror has all 4 devices healthy available, but all
>> datasets seem to be lost.
>> No problem for 450G (99.9%), but there's an 80M dataset which I'm really
>> missing :-(
> 
> If it's not too late now, you may try to experiment with an "unwind" /
> "extreme unwind" import using -F -n / -X -n.  Or manually specifying a
> txg number for import (in read-only mode).

Thanks for your reply!

I had dumped one drive of each mirror, and attaching the dump as a
memory disk works as intended.
So "zpool import" offers me the corrupt backup (on the host with an
already recreated pool).
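
For the record, the attach step is just this (from memory, untested;
the image path is made up):

  # attach the dd image of one mirror member as a memory disk:
  mdconfig -a -t vnode -f /backup/mirror1-disk0.img
  # the pool on it then shows up as importable:
  zpool import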

Unfortunately my knowledge of ZFS internals (how transaction group
numbers relate to uberblocks) doesn't allow me to follow your hint.

How can I determine the last txg#, or the ones just before it?
I guess 'zpool import -t' is the tool/parameter to use.
ZFS has wonderful documentation, and although this would be a perfect
reason to start learning the details of my beloved ZFS, I don't have
the time to.

Is there a zdb(8) equivalent of 'zpool import -t', so I can run the
check in a way that crashes only zdb(8) rather than the kernel?

For regular 'zpool import', 'zdb -ce' seems to be such a synonym; at
least the crash report is identical, see my reply to Scott Bennett's post.
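
If I read zdb(8) correctly, its -t option is the matching knob ("highest
transaction to use when searching for uberblocks"), so perhaps something
like the following; untested, and the pool name and txg number below are
made up:

  # show the active uberblock, which includes the current txg:
  zdb -e -u zdata

  # re-run the offline check, searching only for uberblocks at or
  # below the given txg:
  zdb -e -t 1234567 -c zdata

Since -e examines the pool without importing it, a crash should take
down only zdb(8), not the kernel.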

Thanks,

-harry

Re: panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-10-02 Thread Harry Schmalzbauer
Regarding Scott Bennett's message from 01.10.2017 15:20 (localtime):
>  On Sat, 30 Sep 2017 23:38:45 +0200 Harry Schmalzbauer wrote:

…
>>
>> OpenIndiana also panics at regular import.
>> Unfortunately I don't know the equivalent of vfs.zfs.recover in OI.
>>
>> panic[cpu1]/thread=ff06dafe8be0: blkptr at ff06dbe63000 has
>> invalid CHECKSUM 1
>>
>> Warning - stack not written to the dump buffer
>> ff001f67f070 genunix:vcmn_err+42 ()
>> ff001f67f0e0 zfs:zfs_panic_recover+51 ()
>> ff001f67f140 zfs:zfs_blkptr_verify+8d ()
>> ff001f67f220 zfs:zio_read+55 ()
>> ff001f67f310 zfs:arc_read+662 ()
>> ff001f67f370 zfs:traverse_prefetch_metadata+b5 ()
>> ff001f67f450 zfs:traverse_visitbp+1c3 ()
>> ff001f67f4e0 zfs:traverse_dnode+af ()
>> ff001f67f5c0 zfs:traverse_visitbp+6dd ()
>> ff001f67f720 zfs:traverse_impl+1a6 ()
>> ff001f67f830 zfs:traverse_pool+9f ()
>> ff001f67f8a0 zfs:spa_load_verify+1e6 ()
>> ff001f67f990 zfs:spa_load_impl+e1c ()
>> ff001f67fa30 zfs:spa_load+14e ()
>> ff001f67fad0 zfs:spa_load_best+7a ()
>> ff001f67fb90 zfs:spa_import+1b0 ()
>> ff001f67fbe0 zfs:zfs_ioc_pool_import+10f ()
>> ff001f67fc80 zfs:zfsdev_ioctl+4b7 ()
>> ff001f67fcc0 genunix:cdev_ioctl+39 ()
>> ff001f67fd10 specfs:spec_ioctl+60 ()
>> ff001f67fda0 genunix:fop_ioctl+55 ()
>> ff001f67fec0 genunix:ioctl+9b ()
>> ff001f67ff10 unix:brand_sys_sysenter+1c9 ()
>>
>> This is an important lesson.
>> My impression was that it's not possible to corrupt a complete pool and
>> that there's always a way to recover healthy/redundant data.
>> Now my striped mirror has all 4 devices healthy available, but all
>> datasets seem to be lost.
>> No problem for 450G (99.9%), but there's an 80M dataset which I'm really
>> missing :-(
>>
>> Unfortunately I don't know the DVA and blkptr internals, so I won't
>> write a zfs_fsck(8) soon ;-)
>>
>> Does it make sense to dump the disks for further analysis?
>> I need to recreate the pool because I need the machine's resources... :-(
>> Any help highly appreciated!
>>
>  First, if it's not too late already, make a copy of the pool's cache 
> file,
> and save it somewhere in case you need it unchanged again.
>  Can zdb(8) see it without causing a panic, i.e., without importing the
> pool?  You might be able to track down more information if zdb can get you in.

Thank you very much for your help.

zdb(8) is able to get all config data, along with all dataset information.

For the record, I'll provide zdb(8) output below.

In the meantime I recreated the pool and the host is back to life.
Since other pools weren't affected and had plenty of space, I dumped two
of the 4 drives along with the zdb(8) -x dump. I don't know what -x
exactly dumps (all blocks accessed!?!); the result is a big sparse file,
but the time it took to write suggests it can hold nothing but
metadata, at best.

Attaching the two native dumps as memory disks works for "zpool import" :-)
To be continued as an answer to Andriy Gapon's reply from today...

>  Another thing you could try, with an admittedly very low probability of
> working, would be to try importing the pool with one drive of one mirror
> missing, then try it with a different drive of one mirror, and so on, on
> the minor chance that the critical error is limited to one drive.  If you
> find a case where that works, then you could try to rebuild the missing
> drive and then run a scrub.  Or vice versa.  This one is time-consuming, I
> would imagine, given

I did try, although I had no hope that this could change the picture,
since the cause of the inconsistency wasn't drive related.
And as expected, I had no luck.

Dataset mos [META], ID 0, cr_txg 4, 19.2M, 6503550977762669098 objects

Object  lvl   iblk   dblk  dsize  lsize   %full  type
     2    1   128K    512      0    512    0.00  DSL directory

Dataset mos [META], ID 0, cr_txg 4, 19.2M, 6503550977762669098 objects

Object  lvl   iblk   dblk  dsize  lsize   %full  type
     2    1   128K    512      0    512    0.00  DSL directory

loading space map for vdev 1 of 2, metaslab 108 of 109 ...
error: blkptr at 0x80d726040 has invalid CHECKSUM 1

Traversing all blocks to verify checksums and verify nothing leaked ...

Assertion failed: (!BP_IS_EMBEDDED(bp) || BPE_GET_ETYPE(bp) ==
BP_EMBEDDED_TYPE_DATA), file
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c,
line 5220.
loading space map for vdev 1 of 2, metaslab 108 of 109 ...
error: blkptr at 0x80b482e80 has invalid CHECKSUM 1

Traversing all blocks to verify checksums and verify nothing leaked ...

Assertion failed: (!BP_IS_EMBEDDED(bp) || BPE_GET_ETYPE(bp) ==
BP_EMBEDDED_TYPE_DATA), file
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c,
line 5220.
loading space map for vdev 1 of 2, metaslab 108 of 109 ...
WARNING: Assertion failed: 



Re: panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-10-02 Thread Andriy Gapon
On 01/10/2017 00:38, Harry Schmalzbauer wrote:
> Now my striped mirror has all 4 devices healthy available, but all
> datasets seem to be lost.
> No problem for 450G (99.9%), but there's an 80M dataset which I'm really
> missing :-(

If it's not too late now, you may try to experiment with an "unwind" / "extreme
unwind" import using -F -n / -X -n.  Or manually specifying a txg number for
import (in read-only mode).
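
Roughly like this; untested and from memory, so double-check against
zpool(8), and the pool name and txg number are placeholders:

  # dry-run a rewind ("unwind") import; -n only reports what would be done:
  zpool import -F -n tank

  # extreme rewind, still as a dry run:
  zpool import -F -X -n tank

  # read-only import at an explicit txg; -T is undocumented in some
  # versions, so verify that your zpool supports it:
  zpool import -o readonly=on -T 1234567 tank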

-- 
Andriy Gapon


Re: ctld: only 579 iSCSI targets can be created

2017-10-02 Thread Edward Napierala
Thanks for the packet trace.  What happens there is that the Windows
initiator logs in, requests Discovery ("SendTargets=All"), receives the list
of targets, as expected, and then... sends "SendTargets=All" again,
instead of logging off.  This results in ctld(8) dropping the session.
The initiator then starts the Discovery session again, but this time it only
logs in and then out, without actually requesting the target list.

Perhaps you could work around this by using "discovery-filter",
as documented in ctl.conf(5)?
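
Something along these lines in ctl.conf (untested; the portal-group name
is made up, the listen address is taken from your trace, and which
filter value fits best is worth experimenting with, per ctl.conf(5)):

  portal-group pg0 {
          discovery-auth-group no-authentication
          listen 10.0.2.4
          discovery-filter portal-name
  }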


2017-09-22 11:49 GMT+01:00 Eugene M. Zheganin:

> Hi,
>
> Edward Tomasz Napierała wrote 2017-09-22 12:15:
>
>>
>> There are two weird things here.  First is that the error is coming from
>> ctld(8) - the userspace daemon, not the kernel.  The second is that those
>> invalid opcodes are actually both valid - they are the Text Request,
>> and the Logout Request with Immediate flag set, exactly what you'd expect
>> for a discovery session.
>>
>> Do you have a way to do a packet dump?
>>
>
> Sure. Here it is:
>
> http://enaza.ru/stub-data/iscsi-protocol-error.pcap
>
> Target IP is 10.0.2.4, initiator IP is 10.0.3.127. During the session
> captured in this file I got the following in messages:
>
> Sep 22 15:38:11 san1 ctld[61373]: 10.0.3.127 
> (iqn.1991-05.com.microsoft:worker296):
> protocol error: received invalid opcode 0x4
> Sep 22 15:38:11 san1 ctld[61374]: 10.0.3.127 
> (iqn.1991-05.com.microsoft:worker296):
> protocol error: received invalid opcode 0x46
>
> This error happens when the initiator is trying to connect to the disk
> from a discovered target.
>
> Target is running FreeBSD 11.0-STABLE #1 r310734M, where M is for
> CTL_MAX_PORTS 1024 (old version, yup, but I have a suspicion, which I
> still failed to prove, that more recent versions have some iSCSI vs ZFS
> conflict; that's another story). Initiator is running Windows 7
> Professional x64, inside an ESX virtual machine. This happens only when
> some unclear threshold is crossed; the previous ~200 initiators run
> Windows 7 Professional too.
>
> If you need any additional data/diagnostics please let me know.
>
> Eugene.
>
>