Re: MCE: Does this look possibly like a slot issue?

2022-06-21 Thread Ultima
Completely agree with you, Rodney. The LGA pins on the motherboard
socket can be bent very easily when the CPU is moved, so I wanted to
recommend this last.

Larry, as Rodney mentioned, it's more or less your last option. This
is likely the CPU and not the module itself. There is still a small chance
that it is motherboard/slot related. One way to determine this is to swap
the CPUs between socket 0 and socket 1 and see if the error moves.
As I mentioned though, be very cautious. I don't want you to end up in a
worse-off state.

I would reseat the CPU in the problem socket before swapping the CPUs.

Best regards,
Richard Gallamore

On Tue, Jun 21, 2022 at 9:06 AM Rodney W. Grimes <
freebsd-...@gndrsh.dnsmgr.net> wrote:

> >
> >
> > Swapped 2 DIMMS, now we wait for the ZFS ARC to fill and start using all
> > the memory.
>
> Depending on the results of that, one thing that is often overlooked
> when trying to troubleshoot memory in modern Intel systems
> is the fact that the DIMM now talks directly to the CPU chip that
> has the memory controller built into it.  THUS these "slot" related
> ECC/Parity/blowup errors can actually be the CPU and/or the CPU
> socket and/or the seating of the CPU in the socket.
>
> So if the error sticks with the DIMM slot and not the DIMM
> module, the next thing I would try would be a CPU chip reseat,
> including a good inspection of the socket for a damaged
> pin.  Also look at the lands on the CPU chip itself, and you
> can even try swapping CPU chips to see if it follows the
> CPU or the socket, much as you do with a DIMM.
>
>
> >
> > On 06/20/2022 7:59 pm, Larry Rosenman wrote:
> >
> > > SuperMicro X8DTN+
> > >
> > > 2 Processors, 6-core/12-Thread. CPU: Intel(R) Xeon(R) CPU
> > > E5645  @ 2.40GHz (2400.20-MHz K8-class CPU)
> > >
> > > I'll bring it down and swap DIMMS around
> > >
> > > On 06/20/2022 7:57 pm, Ultima wrote:
> > >
> > > Hey Larry,
> > >
> > > One red flag I am seeing is that the error is being produced on
> > > the same CPU/bank in every error you have provided so far.
> > >
> > > Can you follow my original recommendation and swap the DIMM in
> > > the problem slot with one from another populated slot, then see
> > > if anything changes?
> > >
> > > Can you also provide the motherboard model? Also, do you
> > > have multiple CPUs installed in this system?
> > >
> > > Best regards,
> > > Richard Gallamore
> > >
> > > On Mon, Jun 20, 2022 at 5:41 PM Larry Rosenman  wrote:
> > >
> > > Yes and Yes.
> > >
> > > On 06/20/2022 7:37 pm, Ultima wrote:
> > >
> > > Are you sure that the module you replaced it with was good?
> > > Are you sure you replaced the correct module?
> > >
> > > Best regards,
> > > Richard Gallamore
> > >
> > > On Mon, Jun 20, 2022 at 5:23 PM Larry Rosenman  wrote:
> > >
> > > I'm seeing them constantly:
> > >
> > > root@freenas[~]# mcelog --dmi
> > > Hardware event. This is not a software error.
> > > MCE 0
> > > CPU 22 BANK 8 TSC 20aab486464a
> > > MISC ac29890200046444 ADDR ee2f6e800
> > > TIME 1655770989 Mon Jun 20 19:23:09 2022
> > > MCG status:
> > > Memory read ECC error
> > > Memory corrected error count (CORE_ERR_CNT): 1
> > > Memory transaction Tracker ID (RTId): 44
> > > Memory DIMM ID of error: 0
> > > Memory channel ID of error: 1
> > > Memory ECC syndrome: ac298902
> > > STATUS 8c41009f MCGSTATUS 0
> > > MCGCAP 1c09 APICID 34 SOCKETID 0
> > > CPUID Vendor Intel Family 6 Model 44 Step 2
> > > WARNING: SMBIOS data is often unreliable. Take with a grain of salt!
> > > DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> > > Device Locator: P2-DIMM2C
> > > Bank Locator: BANK14
> > > Manufacturer: Hyundai
> > > Serial Number: 40F3C20F
> > > Asset Tag:
> > > Part Number: HMT151R7BFR4C-H9
> > > Hardware event. This is not a software error.
> > > MCE 1
> > > CPU 22 BANK 8 TSC 296dfcc82582
> > > MISC ac29890200041381 ADDR ee2f6e800
> > > TIME 1655770989 Mon Jun 20 19:23:09 2022
> > > MCG status:
> > > Memory read ECC error
> > > Memory corrected error count (CORE_ERR_CNT): 1
> > > Memory transaction Tracker ID (RTId): 81
> > > Memory DIMM ID of error: 0
> > > Memory channel ID of error: 1
> > > Memory ECC syndrome: ac298902
> > > STATUS 8c41009f MCGSTATUS 0

Re: MCE: Does this look possibly like a slot issue?

2022-06-20 Thread Ultima
Hey Larry,

One red flag I am seeing is that the error is being produced on
the same CPU/bank in every error you have provided so far.
Can you follow my original recommendation and swap the DIMM in
the problem slot with one from another populated slot, then see
if anything changes?
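
A quick way to check whether every record really points at the same place is to tally the mcelog output by CPU, bank, channel, DIMM ID, and device locator. The sketch below is only an illustration: the names `tally_mce` and `SAMPLE` are made up for this example, the sample text is abbreviated from the records in this thread, and the parsing assumes the plain-text layout shown here rather than any stable mcelog interface.

```python
import re
from collections import Counter

# Banner line that starts every record in the output quoted in this thread.
BANNER = "Hardware event. This is not a software error."

CPU_BANK = re.compile(r"CPU (\d+) BANK (\d+)")
CHANNEL = re.compile(r"Memory channel ID of error: (\d+)")
DIMM = re.compile(r"Memory DIMM ID of error: (\d+)")
LOCATOR = re.compile(r"Device Locator: (\S+)")


def _first(pattern, text):
    """Return the first captured group, or None when the field is absent."""
    match = pattern.search(text)
    return match.group(1) if match else None


def tally_mce(text):
    """Count records per (cpu, bank, channel, dimm, locator) tuple."""
    counts = Counter()
    for record in text.split(BANNER)[1:]:
        head = CPU_BANK.search(record)
        if head is None:
            continue  # truncated or unrecognised record
        key = (
            head.group(1),
            head.group(2),
            _first(CHANNEL, record),
            _first(DIMM, record),
            _first(LOCATOR, record),
        )
        counts[key] += 1
    return counts


# Abbreviated from the thread; the third record has no SMBIOS locator lines.
SAMPLE = """\
Hardware event. This is not a software error.
MCE 0
CPU 22 BANK 8 TSC 20aab486464a
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Device Locator: P2-DIMM2C
Hardware event. This is not a software error.
MCE 1
CPU 22 BANK 8 TSC 296dfcc82582
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Device Locator: P2-DIMM2C
Hardware event. This is not a software error.
MCE 2
CPU 22 BANK 8 TSC 2a5604a6a070
Memory DIMM ID of error: 0
Memory channel ID of error: 1
"""

if __name__ == "__main__":
    for key, count in tally_mce(SAMPLE).most_common():
        print(count, key)
```

If every record lands in a single bucket before and after a module swap, the error is staying with the location rather than following the module.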

Can you also provide the motherboard model? Also, do you
have multiple CPUs installed in this system?

Best regards,
Richard Gallamore

On Mon, Jun 20, 2022 at 5:41 PM Larry Rosenman  wrote:

> Yes and Yes.
>
>
> On 06/20/2022 7:37 pm, Ultima wrote:
>
> Are you sure that the module you replaced it with was good?
> Are you sure you replaced the correct module?
>
> Best regards,
> Richard Gallamore
>
> On Mon, Jun 20, 2022 at 5:23 PM Larry Rosenman  wrote:
>
> I'm seeing them constantly:
>
> root@freenas[~]# mcelog --dmi
> Hardware event. This is not a software error.
> MCE 0
> CPU 22 BANK 8 TSC 20aab486464a
> MISC ac29890200046444 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 44
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c41009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> WARNING: SMBIOS data is often unreliable. Take with a grain of salt!
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 1
> CPU 22 BANK 8 TSC 296dfcc82582
> MISC ac29890200041381 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 81
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c41009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 2
> CPU 22 BANK 8 TSC 2a5604a6a070
> MISC ac29890200044281
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory ECC error occurred during scrub
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 81
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 884200cf MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> Hardware event. This is not a software error.
> MCE 3
> CPU 22 BANK 8 TSC 31e141418eb8
> MISC ac29890200046a4a ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 4a
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c41009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 4
> CPU 22 BANK 8 TSC 3a014afee106
> MISC ac29890200046646 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 46
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c41009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 5
> CPU 22 BANK 8 TSC 41d1dbef1a6a
> MISC ac29890200046141 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 41
> Memory DIMM ID of error: 0
> Memory ch

Re: MCE: Does this look possibly like a slot issue?

2022-06-20 Thread Ultima
:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 4a
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c41009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 7
> CPU 22 BANK 8 TSC 527bc27db776
> MISC ac29890200040386 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 86
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c41009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 8
> CPU 22 BANK 8 TSC 5aa4ecdd795a
> MISC ac29890200046646 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 46
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c41009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> root@freenas[~]#
>
>
> and I replaced the DIMM yesterday :(
>
>
>
> On 06/20/2022 7:19 pm, Ultima wrote:
>
> Hey Larry,
>
>  It is possible it's the motherboard itself, but it's rare. The way I
> would determine this is to swap the DIMM with one in another
> populated slot on the motherboard and see whether the error migrates
> to the new slot. Also, this error doesn't necessarily mean
> there is a problem that needs to be addressed. If you have been
> running the system for many months and you see ECC errors only a
> handful of times, they can probably be safely ignored.
>
> Best regards,
> Richard Gallamore
>
> On Mon, Jun 20, 2022 at 3:14 PM Larry Rosenman  wrote:
>
> I've gotten a BUNCH of these on my TrueNAS server.  I've replaced this
> DIMM a couple of times, and still the MCEs continue.
> Is it possible it's a motherboard slot issue?
>
> Hardware event. This is not a software error.
> MCE 8
> CPU 22 BANK 8 TSC 5aa4ecdd795a
> MISC ac29890200046646 ADDR ee2f6e800
> TIME 1655762472 Mon Jun 20 17:01:12 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 46
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c41009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
>
>
>
> --
> Larry Rosenman http://www.lerctr.org/~ler
> Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
> US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
>
>
> --
> Larry Rosenman http://www.lerctr.org/~ler
> Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
> US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
>


Re: MCE: Does this look possibly like a slot issue?

2022-06-20 Thread Ultima
Hey Larry,

 It is possible it's the motherboard itself, but it's rare. The way I
would determine this is to swap the DIMM with one in another
populated slot on the motherboard and see whether the error migrates
to the new slot. Also, this error doesn't necessarily mean
there is a problem that needs to be addressed. If you have been
running the system for many months and you see ECC errors only a
handful of times, they can probably be safely ignored.
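
The swap test described above boils down to a small decision rule, sketched here in Python. This is an illustration only: `interpret_dimm_swap` and its arguments are hypothetical names, and the slot label "P2-DIMM1A" in the usage example is invented (only "P2-DIMM2C" appears in the mcelog output). The "slot" verdict is deliberately broad, since on CPUs with an integrated memory controller the slot, the controller, and the socket seating are hard to tell apart from the error log alone.

```python
def interpret_dimm_swap(bad_slot, new_slot, slots_with_errors_after):
    """Interpret the result of moving the suspect DIMM from bad_slot to new_slot.

    bad_slot: slot that reported ECC errors before the swap.
    new_slot: slot the suspect module now sits in.
    slots_with_errors_after: slots reporting errors once memory is in use again.
    """
    after = set(slots_with_errors_after)
    if not after:
        return "no errors yet: keep the memory under load before concluding"
    if after == {new_slot}:
        return "followed the module: suspect the DIMM"
    if after == {bad_slot}:
        return "stayed with the slot: suspect slot, memory controller, or CPU seating"
    return "inconclusive: recheck seating and repeat the test"


if __name__ == "__main__":
    # "P2-DIMM1A" is a made-up label for the other populated slot.
    print(interpret_dimm_swap("P2-DIMM2C", "P2-DIMM1A", ["P2-DIMM2C"]))
```

An empty result matters as much as a positive one: corrected errors only show up once the affected memory is actually being read, so the verdict has to wait until the system has been under load for a while.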

Best regards,
Richard Gallamore

On Mon, Jun 20, 2022 at 3:14 PM Larry Rosenman  wrote:

> I've gotten a BUNCH of these on my TrueNAS server.  I've replaced this
> DIMM a couple of times, and still the MCEs continue.
> Is it possible it's a motherboard slot issue?
>
> Hardware event. This is not a software error.
> MCE 8
> CPU 22 BANK 8 TSC 5aa4ecdd795a
> MISC ac29890200046646 ADDR ee2f6e800
> TIME 1655762472 Mon Jun 20 17:01:12 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 46
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c41009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
>
>
>
> --
> Larry Rosenman http://www.lerctr.org/~ler
> Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
> US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
>
>


Re: ix SR-IOV working

2018-08-10 Thread Ultima
Hello,

This is probably a driver issue. The only way I could get SR-IOV
working with the ix driver was by compiling the driver provided
by Intel and loading it before boot. See [1] for more details and [2]
for the driver. I have not tested the latest version and only
tested this on CURRENT. Also, some options needed to be changed
before compiling to enable SR-IOV; it's pretty straightforward
though, if I recall correctly.


[1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211062
[2]
https://downloadcenter.intel.com/download/14688/Intel-Network-Adapters-Driver-for-PCIe-10-Gigabit-Network-Connections-Under-FreeBSD

Best regards,
Richard Gallamore

On Fri, Aug 10, 2018 at 9:19 AM, Pete Wright  wrote:

>
>
> On 8/10/18 8:30 AM, Ryan Stone wrote:
>
>> How many VFs are you trying to create?  Getting ENOSPC either
>> indicates that you tried to allocate more VFs than the hardware
>> supports, or the system could not allocate enough MMIO space for the
>> VFs.
>>
>
> Hi Ryan,
> I was attempting to create a single VF.  Here's my iovctl.conf:
>
> PF {
> num_vfs: 1;
> device : "ix0";
> }
>
> DEFAULT {
> passthrough : true;
> }
>
> my goal is to setup several bhyve instances on this server, and allocate
> one VF per instance.  for now i'm attempting to create a single VF for
> testing purposes.
>
> Cheers!
>
> -pete
>
> --
> Pete Wright
> p...@nomadlogic.org
> @nomadlogicLA
>
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: New NUMA support coming to CURRENT

2018-01-17 Thread Ultima
Hello Jeff,

A few days ago I upgraded my system firmware, upgraded
base to r327991, and switched snooping to
cluster-on-die. It's hard to say what is making
the server feel like a completely different system,
given all these changes (also running llvm 6.0),
but I am betting it is the NUMA optimizations.
It is so responsive! Thanks for the amazing
work!

Best regards,
Richard Gallamore

On Sat, Jan 13, 2018 at 7:39 PM, Jeff Roberson 
wrote:

> Hello,
>
> This work has been committed.  It is governed by a new 'NUMA' config
> option and 'DEVICE_NUMA' and 'VM_NUMA_ALLOC' have both been retired.  This
> option is fairly lightweight and I will likely enable it in GENERIC before
> 12.0 release.
>
> I have heard reports that switching from a default policy of first-touch
> to round-robin has caused some performance regression.  You can change the
> default policy at runtime by doing the following:
>
> cpuset -s 1 -n first-touch:all
>
> This is the default set that all others inherit from.  You can query the
> current default with:
> cpuset -g -s 1
>
> I will be investigating the regression and tweaking the default policy
> based on performance feedback from multiple workloads.  This may take some
> time.
>
> numactl is still functional but deprecated.  Man pages will be updated
> soonish.
>
> Thank you for your patience as I work on refining this somewhat involved
> feature.
>
> Thanks,
> Jeff
>
>
> On Tue, 9 Jan 2018, Jeff Roberson wrote:
>
> Hello folks,
>>
>> I am working on merging improved NUMA support with policy implemented by
>> cpuset(2) over the next week.  This work has been supported by Dell/EMC's
>> Isilon product division and Netflix.  You can see some discussion of these
>> changes here:
>>
>> https://reviews.freebsd.org/D13403
>> https://reviews.freebsd.org/D13289
>> https://reviews.freebsd.org/D13545
>>
>> The work has been done in user/jeff/numa if you want to look at svn
>> history or experiment with the branch.  It has been tested by Peter Holm on
>> i386 and amd64 and it has been verified to work on arm at various points.
>>
>> We are working towards compatibility with libnuma and linux mbind.  These
>> commits will bring in improved support for NUMA in the kernel.  There are
>> new domain specific allocation functions available to kernel for UMA,
>> malloc, kmem_, and vm_page*.  busdmamem consumers will automatically be
>> placed in the correct domain, bringing automatic improvements to some
>> device performance.
>>
>> cpuset will be able to constrain processes, groups of processes, jails,
>> etc. to subsets of the system memory domains, just as it can with sets of
>> cpus. It can set default policy for any of the above.  Threads can use
>> cpusets to set policy that specifies a subset of their visible domains.
>>
>> Available policies are first-touch (local in linux terms), round-robin
>> (similar to linux interleave), and preferred.  For now, the default is
>> round-robin.  You can achieve a fixed domain policy by using round-robin
>> with a bitmask of a single domain.  As the scheduler and VM become more
>> sophisticated we may switch the default to first-touch as linux does.
>>
>> Currently these features are enabled with VM_NUMA_ALLOC and MAXMEMDOM.
>> It will eventually be NUMA/MAXMEMDOM to match SMP/MAXCPU.  The current NUMA
>> syscalls and VM_NUMA_ALLOC code was 'experimental' and will be deprecated.
>> numactl will continue to be supported although cpuset should be preferred
>> going forward as it supports the full feature set of the new API.
>>
>> Thank you for your patience as I deal with the inevitable fallout of such
>> sweeping changes.  If you do have bugs, please file them in bugzilla, or
>> reach out to me directly.  I don't always have time to catch up on all of
>> my mailing list mail and regretfully things slip through the cracks when
>> they are not addressed directly to me.
>>
>> Thanks,
>> Jeff
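
To make the difference between the first-touch and round-robin policies described above concrete, here is a toy Python model of page placement. It is a sketch under stated assumptions and not the kernel's allocator: `place_pages` and its arguments are invented for the example, the fallback to the first allowed domain is an assumption made here, and the "preferred" policy is left out.

```python
from itertools import cycle


def place_pages(policy, first_touch_domains, allowed_domains):
    """Return the memory domain chosen for each page in a toy model.

    first_touch_domains: for each page, the domain of the CPU touching it first.
    allowed_domains: the domains the policy is allowed to allocate from.
    """
    if not allowed_domains:
        raise ValueError("at least one domain is required")
    if policy == "first-touch":
        # Assumption for this sketch: fall back to the first allowed domain
        # when the touching CPU's domain is not in the allowed set.
        fallback = allowed_domains[0]
        return [d if d in allowed_domains else fallback
                for d in first_touch_domains]
    if policy == "round-robin":
        rotation = cycle(allowed_domains)
        return [next(rotation) for _ in first_touch_domains]
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    touches = [0, 0, 1, 1, 0]
    print(place_pages("first-touch", touches, [0, 1]))
    print(place_pages("round-robin", touches, [0, 1]))
    # Round-robin over a single domain behaves like a fixed-domain policy.
    print(place_pages("round-robin", touches, [1]))
```

The last call mirrors the remark above that a fixed domain policy can be had by using round-robin with a mask containing a single domain.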


Re: CURRENT: can't buildworld; /usr/bin/ld: error: cannot open crt1.o:

2018-01-14 Thread Ultima
Try updating? I just upgraded to r327991 and it was smooth sailing.

On Sun, Jan 14, 2018 at 10:42 PM, O. Hartmann 
wrote:

> One of our CURRENT boxes is repeatedly refusing to build "buildworld"
> (make buildkernel seems to work, as I built several kernels just now).
>
> The host's world is as of Wednesday, 10th January; the kernel's revision is
>
> FreeBSD 12.0-CURRENT #0 r327871: Fri Jan 12 12:18:19 CET 2018 amd64.
>
> I did, as a test, Friday, 12th Jan, as you can see, the last kernel build.
>
> The host in question also carries a variety of release, package and jail
> builds in separate source trees (CURRENT in most cases, to keep them away
> from the host's source tree). Those separate source trees also refuse to
> build.
>
> After performing a "make cleanworld" to start over (even this morning, when
> I noticed LLVM/CLANG 6.0.0 had slipped in), I still face the same error:
>
> /usr/bin/ld: error: cannot open crt1.o: No such file or directory
>
> More details see below.
>
> The last installation of the system was performed with WITH_LLD_IS_LD and
> WITH_BOOTSTRAP_LLD set, if this is of importance. I still have
> WITH_LLD_IS_LD=YES set in /usr/src.conf.
>
> And we use WITH_META_MODE, just for the record.
>
> I have other machines which didn't get updated on Wednesday, 10th January,
> and they perform well and without problems (with the same settings in
> /etc/src.conf and also WITH_META_MODE).
>
> Can someone give me some hints? How to fix the problem?
>
> Thanks in advance,
>
> Oliver
>
> [...]
> --
> >>> stage 1.1: legacy release compatibility shims
> --
> cd /usr/src; INSTALL="sh /usr/src/tools/install.sh"
> TOOLS_PREFIX=/usr/obj/usr/src/amd64.amd64/tmp
> PATH=/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/sbin:/
> usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/bin:/usr/obj/
> usr/src/amd64.amd64/tmp/legacy/bin:/sbin:/bin:/usr/sbin:/usr/bin
> WORLDTMP=/usr/obj/usr/src/amd64.amd64/tmp
> MAKEFLAGS="-m /usr/src/tools/build/mk  -m /usr/src/share/mk" make  -f
> Makefile.inc1  DESTDIR=  OBJTOP='/usr/obj/usr/src/
> amd64.amd64/tmp/obj-tools'
> OBJROOT='${OBJTOP}/'  MAKEOBJDIRPREFIX=  BOOTSTRAPPING=1200055
> BWPHASE=legacy
> SSP_CFLAGS=  MK_HTML=no NO_LINT=yes MK_MAN=no  -DNO_PIC MK_PROFILE=no
> -DNO_SHARED  -DNO_CPU_CFLAGS MK_WARNS=no MK_CTF=no  MK_CLANG_EXTRAS=no
> MK_CLANG_FULL=no  MK_LLDB=no MK_TESTS=no  MK_LLD=yes  MK_INCLUDES=yes
> legacy
> ===> tools/build (obj,includes,all,install)
> Building /usr/obj/usr/src/amd64.amd64/tmp/obj-tools/tools/build/_
> libinstall
>
> --
> >>> stage 1.2: bootstrap tools
> --
> cd /usr/src; INSTALL="sh /usr/src/tools/install.sh"
> TOOLS_PREFIX=/usr/obj/usr/src/amd64.amd64/tmp
> PATH=/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/sbin:/
> usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/bin:/usr/obj/
> usr/src/amd64.amd64/tmp/legacy/bin:/sbin:/bin:/usr/sbin:/usr/bin
> WORLDTMP=/usr/obj/usr/src/amd64.amd64/tmp
> MAKEFLAGS="-m /usr/src/tools/build/mk  -m /usr/src/share/mk" make  -f
> Makefile.inc1  DESTDIR=  OBJTOP='/usr/obj/usr/src/
> amd64.amd64/tmp/obj-tools'
> OBJROOT='${OBJTOP}/'  MAKEOBJDIRPREFIX=  BOOTSTRAPPING=1200055
> BWPHASE=bootstrap-tools  SSP_CFLAGS=  MK_HTML=no NO_LINT=yes MK_MAN=no
> -DNO_PIC MK_PROFILE=no -DNO_SHARED  -DNO_CPU_CFLAGS MK_WARNS=no MK_CTF=no
> MK_CLANG_EXTRAS=no MK_CLANG_FULL=no  MK_LLDB=no MK_TESTS=no  MK_LLD=yes
> MK_INCLUDES=yes bootstrap-tools ===> lib/clang/libllvmminimal
> (obj,all,install)
> Building /usr/obj/usr/src/amd64.amd64/tmp/obj-tools/lib/clang/
> libllvmminimal/Support/ConvertUTFWrapper.o
> Building /usr/obj/usr/src/amd64.amd64/tmp/obj-tools/lib/clang/
> libllvmminimal/Support/Debug.o
>
> [...]
>
> Building /usr/obj/usr/src/amd64.amd64/tmp/obj-tools/usr.bin/clang/
> llvm-tblgen/X86RecognizableInstr.o
> Building /usr/obj/usr/src/amd64.amd64/tmp/obj-tools/usr.bin/clang/
> llvm-tblgen/llvm-tblgen
> /usr/bin/ld: error: cannot open crt1.o: No such file or directory
> c++: error: linker command failed with exit code 1 (use -v to see
> invocation)
> *** Error code 1
>
> Stop.
> make[3]: stopped in /usr/src/usr.bin/clang/llvm-tblgen
> .ERROR_TARGET='llvm-tblgen'
> .ERROR_META_FILE='/usr/obj/usr/src/amd64.amd64/tmp/obj-
> tools/usr.bin/clang/llvm-tblgen/llvm-tblgen.meta'
> .MAKE.LEVEL='3'
> MAKEFILE=''
> .MAKE.MODE='meta missing-filemon=yes missing-meta=yes silent=yes verbose'
> _ERROR_CMD='c++ -O2 -pipe -O3
> -I/usr/obj/usr/src/amd64.amd64/tmp/obj-tools/lib/clang/libllvm
> -I/usr/src/lib/clang/include -I/usr/src/contrib/llvm/include
> -DLLVM_BUILD_GLOBAL_ISEL -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS
> -DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd12.0\"
> -DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd12.0\"
> -DDEFAULT_SYSROOT=\"/usr/obj/usr/src/amd64.amd64/tmp\" 

Re: Stale file handle when mounting nfs

2017-10-24 Thread Ultima
Eh... disregard my last message; the mount works properly, but it
is stuck in a hard lock state.


Re: Stale file handle when mounting nfs

2017-10-24 Thread Ultima
Was a bit tired last night and just wanted to make sure this issue was
known. Upgraded to r324957 and everything works! yay!


Re: 12-current, since 323508 / Sept 13 - smartctl / smartmontools - unable to query drives

2017-10-04 Thread Ultima
I recently tested smartctl (a few days ago) on r324135 and
found that auto-detection seems to be broken, but manually
selecting the device type works. Are you positive you tested
every device type?

I was a bit surprised to find that atacam works for my onboard
controller, while the HBA only works with sat,auto and scsi.
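
Trying every device type by hand is tedious, so here is a small Python sketch that loops over the types mentioned in this thread and runs `smartctl -d TYPE -i DEVICE` for each. The helper names `build_commands` and `probe` are made up for this example, the list of types is only the ones named above (smartctl accepts others), and it assumes smartctl is installed and the script runs with enough privileges to open the device.

```python
import subprocess

# Device types named in this thread; smartctl supports more than these.
DEVICE_TYPES = ("atacam", "sat,auto", "scsi")


def build_commands(device, types=DEVICE_TYPES):
    """Build one `smartctl -d TYPE -i DEVICE` argument list per device type."""
    return [["smartctl", "-d", dev_type, "-i", device] for dev_type in types]


def probe(device, run=subprocess.run):
    """Run each command and return a {device type: exit status} mapping.

    `run` can be replaced in tests so that no real smartctl call is made.
    """
    results = {}
    for cmd in build_commands(device):
        proc = run(cmd, capture_output=True, text=True)
        results[cmd[2]] = proc.returncode
    return results


if __name__ == "__main__":
    for dev_type, status in probe("/dev/ada0").items():
        print(f"-d {dev_type}: exit status {status}")
```

An exit status of 0 for a given type suggests that type can talk to the drive; anything else is worth a closer look at the command's output.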

On Wed, Oct 4, 2017 at 11:14 AM, David P. Discher  wrote:

> Seeing an odd behavior - even with r324216 as of about Oct 2nd, smartctl
> can no longer query any disks
>
> Always getting:
>
> > sudo smartctl -x /dev/ada0
> Password:
> smartctl 6.5 2016-05-07 r4318 [FreeBSD 12.0-CURRENT amd64] (local
> build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke,
> www.smartmontools.org
>
> /dev/ada0: Unable to detect device type
> Please specify device type with the -d option.
>
> Use smartctl -h to get a usage summary
>
> Even with specifying the type, I can’t query the drives.  I’m trying to
> nail down the commit now, I think it's between r320087 and r323508.
>
> Anyone have any pointers ?  Smartmontools hasn’t changed since Feb, so by
> all indications, this is a regression in FreeBSD-current/-head.
>
> Thanks !
>
> --
> David P. Discher
> https://davidpdischer.com/
>
>
>

Re: input/output error @boot

2017-03-05 Thread Ultima
It's showing...

elf64_loadimage: read failed
can't load file /boot/kernel/kernel

I get the same when attempting to load kernel.old, so I had to revert to an
old snapshot from a USB rescue disk.
This is not a hardware problem.

On Sun, Mar 5, 2017 at 10:43 AM, Warner Losh  wrote:

> On Sun, Mar 5, 2017 at 8:16 AM, Guido Falsi  wrote:
> > On 03/05/17 16:04, Roberto Rodriguez Jr wrote:
> >> I installed the latest snapshot, and when I checked out the latest source
> >> yesterday evening I rebuilt world and kernel, installed the kernel,
> >> rebooted, and installed world, following every step in the handbook. I
> >> continue to have an input/output error at boot time. I can't attach a
> >> picture at the moment, or any text about the error, because I cannot
> >> proceed past the boot screen under UEFI. I get a ? prompt.
> >
> > I am unable to help you about this, I'm not an expert about UEFI, but
> > anyone able to will need some more information.
> >
> > You should at least state make and model of your Mother board and CPU,
> > kind of disks attached, type make and model of controller. If it is a
> > branded box, make and model of the box could suffice.
>
> If it is the UEFI Shell, which is possible, try typing 'FS0:' and see
> if the prompt changes to FS0:\> or similar. Then cd boot and see if
> there's an EFI subdir. cd to EFI subdir and see if there's any .efi
> programs. There should at least be bootx64.efi if there's a boot
> loader. If there is, you can type bootx64.efi. If not, then I'm not
> sure what the ? is.
>
> Warner


Re: zfs snapshot_limit is not respected

2017-02-03 Thread Ultima
Yes, I just tested this and that is how it works.

Thanks for the explanation.

Ultima


Re: zfs snapshot_limit is not respected

2017-02-03 Thread Ultima
Hey Gary,

You are probably right. Do you know how to "lock" this property by chance?
I've read this exact line several times trying to understand the exact
meaning. The "user is allowed to change the limit" I *think* is referring
to the zfs allow command. The problem is that I checked the dataset and it
is showing no permissions granted to a user. So I guess user in this case
is also including the root user, but how does one lock the property from
root? I keep going through the manpage looking for something I may have
missed but keep coming up empty.
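
For what it's worth, the manpage sentence Gary pointed to can be written down as a tiny Python model, which at least matches what I am seeing. This is a toy reading of that one sentence and not ZFS code: `snapshot_allowed` and its parameters are made-up names, and treating root as a user who "is allowed to change the limit" is my guess from the behaviour, not something the manpage spells out.

```python
def snapshot_allowed(snapshot_count, snapshot_limit, user_may_change_limit):
    """Toy model: may this user create one more snapshot?

    snapshot_limit: an integer, or None to stand for the property value 'none'.
    user_may_change_limit: True if the user could simply raise the limit.
    """
    if user_may_change_limit:
        return True  # per the manpage sentence, the limit is not enforced
    if snapshot_limit is None:
        return True  # no limit configured
    return snapshot_count < snapshot_limit


if __name__ == "__main__":
    # Root with snapshot_limit=1 and three snapshots already present.
    print(snapshot_allowed(3, 1, user_may_change_limit=True))
    # A user who cannot change the limit, same dataset state.
    print(snapshot_allowed(3, 1, user_may_change_limit=False))
```

Under this reading, the snapshot_count of 3 with snapshot_limit=1 in my test is expected when the snapshots are taken as root.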

Thanks for replying,
Ultima

On Fri, Feb 3, 2017 at 7:42 AM, Gary Palmer <gpal...@freebsd.org> wrote:

> On Thu, Feb 02, 2017 at 09:31:58PM -0500, Ultima wrote:
> > I recently moved some data onto a box with limited space. I decided I
> > should limit the snapshots so that space would not become an issue. I
> > checked back a week later to find the box hitting the borderline. Doing a
> > quick check, I realized that the snapshot_limit is not being respected.
> >
> > # uname -a
> > FreeBSD R1 11.0-STABLE FreeBSD 11.0-STABLE #17 r312232: Sun Jan 15 10:59:10 EST 2017 root@S1:/usr/src/11-STABLE/obj/usr/src/11-STABLE/src/sys/MYKERNEL  amd64
> >
> > # zfs create zroot/bhyve/test
> > # zfs set snapshot_limit=0 zroot/bhyve/test
> > # zfs snapshot zroot/bhyve/test@1
> >
> >
> > # zfs snapshot zroot/bhyve/test@2
> > # zfs snapshot zroot/bhyve/test@3
> > # zfs list -t snapshot | grep zroot/bhyve/test
> > zroot/bhyve/test@1   0  -  88K  -
> > zroot/bhyve/test@2   0  -  88K  -
> > zroot/bhyve/test@3   0  -  88K  -
> > # zfs get all zroot/bhyve/test | grep snapshot
> > zroot/bhyve/test  usedbysnapshots   0  -
> > zroot/bhyve/test  snapshot_limit    0  local
> > zroot/bhyve/test  snapshot_count    3  local
> >
> > Also wanted to verify 0 was not being mistaken for none.
> >
> > # for snapshot in `zfs list -t snapshot | grep zroot/bhyve/test | awk '{print $1}'`; do zfs destroy $snapshot ; done
> >
> > # zfs get all zroot/bhyve/test | grep snapshot
> > zroot/bhyve/test  usedbysnapshots   0  -
> > zroot/bhyve/test  snapshot_limit    0  local
> > zroot/bhyve/test  snapshot_count    0  local
> >
> > # zfs set snapshot_limit=1 zroot/bhyve/test
> > # zfs snapshot zroot/bhyve/test@1
> > # zfs snapshot zroot/bhyve/test@2
> > # zfs snapshot zroot/bhyve/test@3
> > # zfs get all zroot/bhyve/test | grep snapshot
> > zroot/bhyve/test  usedbysnapshots  0  -
> > zroot/bhyve/test  snapshot_limit   1  local
> > zroot/bhyve/test  snapshot_count   3  local
> >
> >
> > Also tested on head
> > FreeBSD S1 12.0-CURRENT FreeBSD 12.0-CURRENT #26 r312388: Wed Jan 18
> > 12:38:52 EST 2017
> > root@S1:/usr/src/head/obj/usr/src/head/src/sys/MYKERNEL-NODEBUG
> >  amd64
>
> Hi,
>
> I suspect this line from the manpage is key:
>
> The limit is not enforced if the user is allowed to change the limit
>
> Regards,
>
> Gary
>
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


zfs snapshot_limit is not respected

2017-02-02 Thread Ultima
I recently moved some data onto a box with limited space. I decided I should
limit the snapshots so that space would not become an issue. I checked
back a week later to find the box running close to full. After a
quick check I realized that the snapshot_limit is not being respected.

# uname -a
FreeBSD R1 11.0-STABLE FreeBSD 11.0-STABLE #17 r312232: Sun Jan 15 10:59:10
EST 2017 root@S1:/usr/src/11-STABLE/obj/usr/src/11-STABLE/src/sys/MYKERNEL
 amd64

# zfs create zroot/bhyve/test
# zfs set snapshot_limit=0 zroot/bhyve/test
# zfs snapshot zroot/bhyve/test@1


# zfs snapshot zroot/bhyve/test@2
# zfs snapshot zroot/bhyve/test@3
# zfs list -t snapshot | grep zroot/bhyve/test
zroot/bhyve/test@1      0      -    88K  -
zroot/bhyve/test@2      0      -    88K  -
zroot/bhyve/test@3      0      -    88K  -
# zfs get all zroot/bhyve/test | grep snapshot
zroot/bhyve/test  usedbysnapshots  0  -
zroot/bhyve/test  snapshot_limit   0  local
zroot/bhyve/test  snapshot_count   3  local

Also wanted to verify 0 was not being mistaken for none.

# for snapshot in `zfs list -t snapshot | grep zroot/bhyve/test | awk '{print $1}'`; do zfs destroy $snapshot ; done

# zfs get all zroot/bhyve/test | grep snapshot
zroot/bhyve/test  usedbysnapshots  0  -
zroot/bhyve/test  snapshot_limit   0  local
zroot/bhyve/test  snapshot_count   0  local

# zfs set snapshot_limit=1 zroot/bhyve/test
# zfs snapshot zroot/bhyve/test@1
# zfs snapshot zroot/bhyve/test@2
# zfs snapshot zroot/bhyve/test@3
# zfs get all zroot/bhyve/test | grep snapshot
zroot/bhyve/test  usedbysnapshots  0  -
zroot/bhyve/test  snapshot_limit   1  local
zroot/bhyve/test  snapshot_count   3  local


Also tested on head
FreeBSD S1 12.0-CURRENT FreeBSD 12.0-CURRENT #26 r312388: Wed Jan 18
12:38:52 EST 2017
root@S1:/usr/src/head/obj/usr/src/head/src/sys/MYKERNEL-NODEBUG
 amd64


Re: ISO image: where is the CLANG compiler?

2017-01-19 Thread Ultima
If you plan on running a desktop environment like this, make sure it's a
SuperSpeed USB 3.0 stick or the system will be considerably slow. This is a
perfect backup plan and I always have one at the ready. This is especially
important when on current. I don't think you have to worry so much about
having a smaller usb; it can always be partitioned to fit your needs.

On Thu, Jan 19, 2017 at 5:12 AM, Matthias Apitz  wrote:

> El día Thursday, January 19, 2017 a las 10:16:46AM +0100, O. Hartmann
> escribió:
>
> > I created images on CURRENT of my own - they all lack in the ability of
> having
> > the necessary tools aboard. So I consider every image useless for rescue
> > operations except, maybe, the DVD image - but this one is not provided
> anymore.
> > For what reason? Time? Accepted. Space/disk usage? Well, welcome back in
> the
> > stoneage of computer technology ...
>
> No. The process I'm using to create an image for an USB stick leads to a
> complete system from which you can even, after booting it, 'make
> install...'
> to another system mounted on /mnt to the booted USB stick. You can even
> enrich the USB stick with 'pkg install ...' up to a complete running KDE
> desktop system, all running from the USB stick, to test, for example, a
> new hardware if it fits your needs. The stick must be of some 16
> marketing-GB, or bigger.
>
> This has nothing todo with stoneage, but is just a matter of preparing
> something for your needs. Again, let me know if you need this guide.
>
> matthias
>
>
>
> --
> Matthias Apitz, ✉ g...@unixarea.de, ⌂ http://www.unixarea.de/  ☎
> +49-176-38902045
> "Wo ist der antiimperialistische Schutzwall, wenn man ihn braucht?
> US-Panzertransport durch ex-DDR"
> "Where is the anti-imperialistic  wall, if it's needed? Transport of
> US-tanks through the ex-GDR"
> https://deutsch.rt.com/kurzclips/45282-us-panzertransporte-durch-ex-ddr/
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>

Re: r311568 makes freerdp very slow

2017-01-18 Thread Ultima
I have been affected by this issue as well. I just updated to r312388 and it
is fixed. Thanks!

On Wed, Jan 18, 2017 at 1:29 PM, John Baldwin  wrote:

> On Tuesday, January 17, 2017 01:46:31 PM Jakob Alvermark wrote:
> > On Fri, January 13, 2017 22:46, Jakob Alvermark wrote:
> > > On Fri, January 13, 2017 19:44, John Baldwin wrote:
> > >
> > >> On Friday, January 13, 2017 09:58:01 AM Jakob Alvermark wrote:
> > >>
> > >>
> > >>> On Thu, January 12, 2017 19:26, John Baldwin wrote:
> > >>>
> > >>>
> >  On Thursday, January 12, 2017 12:42:11 PM Shawn Webb wrote:
> > 
> > 
> > 
> > > On Thu, Jan 12, 2017 at 06:05:08PM +0100, Jakob Alvermark wrote:
> > >
> > >
> > >
> > >> Hi,
> > >>
> > >>
> > >>
> > >>
> > >> r311568 Set MORETOCOME for AIO write requests on a socket.
> > >>
> > >> After this commit freerdp is very slow.
> > >>
> > >>
> > >>
> > >>
> > >> Before the password prompt would appear immediately when
> > >> connecting to a server. Now it takes 5-10 seconds. After
> > >> entering the password, another 5-10 seconds until I am
> > >> connected. Once connected, there is a considerable lag.
> > >>
> > >>
> > >> What could be the problem?
> > >>
> > >>
> > >>
> > >
> > > I don't know what the problem is, but I am seeing the same
> > > symptom.
> > >
> > >
> > 
> >  Can you get a ktrace of the freerdp process during this?  The
> >  commit should only be setting MORETOCOME if multiple aio_write
> >  requests are queued to the same socket (so that TCP can batch them
> >  into a single packet). However, it should not affect an application
> >  just calling aio_write() on a socket once.
> > 
> >  --
> >  John Baldwin
> > 
> > 
> > >>>
> > >>> Hi John,
> > >>>
> > >>>
> > >>>
> > >>> I got the ktrace, what do I do with it?
> > >>>
> > >>>
> > >>
> > >> kdump will generate a text representation, perhaps using 'kdump -s' to
> > >> not include dumps of raw I/O data.  If you can put the output of kdump
> > >> at a URL I can fetch from then I can look at it.
> > >>
> > >
> > > OK, here it is: http://filebin.ca/38mkuLau9Yqu/ktrace.out.xfreerdp.txt
> > >
> > >
> > > Thanks,
> > >
> > >
> > > Jakob
> >
> > Hi,
> >
> > Did you get any chance to look at this?
>
> I have not yet, but can you please try the fix in r312387?
>
> --
> John Baldwin
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>


Re: zfs zvol's inaccessible after reboot

2017-01-16 Thread Ultima
The volmode was set to default, which is geom. Setting it to 2 (dev) fixes the
issue! Thanks, Allan, for the speedy solution!
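
For anyone with several zvols backing bhyve guests, the same change can be
scripted. This is only a sketch: volmode_fixups is a made-up helper that reads
the tab-separated output of zfs list and prints the zfs set commands it would
suggest, without changing anything itself. The dataset names in the usage line
are examples.

```shell
# Reads "name<TAB>volmode" lines and prints a zfs command for every zvol
# that is not already in dev mode. Nothing is changed by this function.
volmode_fixups() {
    while IFS="$(printf '\t')" read -r name mode; do
        [ "$mode" = dev ] || echo "zfs set volmode=dev $name"
    done
}

# Typical use on the host (review the printed commands before running them):
#   zfs list -H -t volume -o name,volmode | volmode_fixups
```

As in Allan's suggestion, a reboot is still needed afterwards for the new mode
to take effect.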

On Mon, Jan 16, 2017 at 1:01 PM, Allan Jude <allanj...@freebsd.org> wrote:

> On 2017-01-16 12:59, Ultima wrote:
> > Currently there is a bug with zvols. I have a few Bhyve containers that
> > startup at boot. I'v noticed in middle December of last year that after a
> > restart the zvols become inaccessible to the container. Nothing can be
> done
> > to the zvol, other than rename. It cannot even be destroyed in this
> state.
> > The only way to make it accessible again is to renaming the zvol, after
> > this occurs, functionality is restored.
> >
> > The bug is still present in head r312232.
> > ___
> > freebsd-current@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-current
> > To unsubscribe, send any mail to "freebsd-current-unsubscribe@
> freebsd.org"
> >
>
> Is this because they are being used by GEOM?
>
> Try: zfs set volmode=2 
>
> Reboot, and see if that solves it
>
> --
> Allan Jude
>
>


zfs zvol's inaccessible after reboot

2017-01-16 Thread Ultima
Currently there is a bug with zvols. I have a few bhyve containers that
start up at boot. I noticed in mid-December of last year that after a
restart the zvols become inaccessible to the container. Nothing can be done
to the zvol other than a rename. It cannot even be destroyed in this state.
The only way to make it accessible again is to rename the zvol; after
that, functionality is restored.

The bug is still present in head r312232.


Re: Sluggish performance on head r311648

2017-01-11 Thread Ultima
> One thing I keep a look out for, is in gstat, if one drive is busier
> than the others, it's a clear sign of a drive dying. Offlining and
> replacing the drive usually makes a huge difference.

> Also, in all of these cases, SMART data shows no sign of problem and no
> errors in 'zpool status'. Just watch and see if any of the drives is
> working harder than the others has been the surest way to troubleshoot
> performance issues, in my experience.

After staring at gstat for about 5 minutes, I'm not really sure if any one
stands out. They all seem to vary in activity. Something that does stand out is
that on occasion a few will spike into the red, sometimes 1 or 2, other times
8ish (24 drives in the pool total). I also noticed once that, instead of all
drives working at the same time, the activity seemed to move like a wave. Red
activity hit at the top of gstat and worked its way down. Not all drives hit
red during this wave, around 16 in those 5ish seconds. Not sure if this is out
of the ordinary though.
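
Rather than eyeballing the display, the "one drive busier than the others"
check could be roughed out in a filter. This is a sketch under assumptions: it
expects lines whose last two fields are %busy and the device name, which is
how I believe gstat's batch output is laid out, and busy_outliers and the
30-point margin are arbitrary choices of mine.

```shell
# Reads lines ending in "<%busy> <name>" and prints devices whose %busy is
# more than $1 percentage points (default 30) above the mean of all devices.
busy_outliers() {
    awk -v margin="${1:-30}" '
        NF >= 2 && $(NF-1) ~ /^[0-9.]+$/ {
            v = $(NF-1) + 0; busy[$NF] = v; sum += v; n++
        }
        END {
            if (n == 0) exit
            mean = sum / n
            for (d in busy)
                if (busy[d] > mean + margin)
                    printf "%s %.1f (mean %.1f)\n", d, busy[d], mean
        }'
}

# Possible use, sampling physical providers over 5 seconds:
#   gstat -b -p -I 5s | busy_outliers
```

A single sample says little given the wave-like bursts described above, so it
would need to be run repeatedly before drawing any conclusion.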

I did look at SMART before posting. One thing I thought about is the
corrected error count for each drive. When I get some time I'll create a
graph and try to determine the possible bad drive(s, hopefully without the
s) based on this information.




> Just to eliminate the simple - is the zpool capacity high? When a pool
> gets into the 80-90% capacity, performance drops.
The pool is at 28% capacity atm according to zpool list.
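
For completeness, the capacity check is easy to script. A minimal sketch,
where pool_near_full is a made-up helper and the 80% default comes from
Shane's rule of thumb above:

```shell
# $1: capacity as printed by zpool list (e.g. "28%"), $2: threshold in percent.
# Succeeds when the pool is at or above the threshold.
pool_near_full() {
    cap=${1%\%}
    [ "$cap" -ge "${2:-80}" ]
}

# Possible use:
#   zpool list -H -o name,capacity | while read -r name cap; do
#       pool_near_full "$cap" && echo "$name is at $cap"
#   done
```

At 28% this pool is nowhere near that range, so capacity can be ruled out.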

On Wed, Jan 11, 2017 at 9:03 PM, Shane Ambler <free...@shaneware.biz> wrote:

> On 11/01/2017 15:32, Ultima wrote:
>
>> I'v been noticing lately sluggish performance, maybe zfs? First noticed
>> this a few days ago right after upgrading on Jan 7th to r311648 and the
>> last upgrade before that was around dec 30-jan 1 (not sure of rev).
>> Decided
>> to upgrade again today. I usually build and install head every week or
>> two,
>> but I have been extremely busy the past couple months.
>>
>> FreeBSD U1 12.0-CURRENT FreeBSD 12.0-CURRENT #16 r311903: Tue Jan 10
>> 17:20:11 EST 2017 amd64
>>
>> Normally when one of my services scans a few directories it takes about 15
>> seconds tops, it has been taking several minutes. I want to note that this
>>
>
> Just to eliminate the simple - is the zpool capacity high? When a pool
> gets into the 80-90% capacity, performance drops.
>
>
>
> --
> FreeBSD - the place to B...Storing Data
>
> Shane Ambler
>
>


Sluggish performance on head r311648

2017-01-10 Thread Ultima
I've been noticing sluggish performance lately, maybe zfs? I first noticed
this a few days ago right after upgrading on Jan 7th to r311648, and the
last upgrade before that was around Dec 30-Jan 1 (not sure of rev). I decided
to upgrade again today. I usually build and install head every week or two,
but I have been extremely busy the past couple of months.

FreeBSD U1 12.0-CURRENT FreeBSD 12.0-CURRENT #16 r311903: Tue Jan 10
17:20:11 EST 2017 amd64

Normally when one of my services scans a few directories it takes about 15
seconds tops; it has been taking several minutes. I want to note that this
service is running inside a jail with vnet enabled. Also, the directory it
scans is a nullfs mount of a dataset. This is just one of the many side effects
I've been noticing; another is extremely slow reads with bhyve+zvol.
I'm starting to wonder if it may be hardware related.

I decided to check the commit log before posting, and not much has happened to
zfs recently other than "Add missed vfs.zfs.zfetch.max_idistance" in
r309833, and r309714, which doesn't really look related (could be wrong). But
these were committed about a month ago, so they are not likely the cause.


Has anyone else experienced similar results recently?


Sorry for the noise if this is hardware related.


Re: zpool (online|replace|labelclear) issues, -f option also failing

2016-09-29 Thread Ultima
This patch looks good; it's too bad core hasn't looked at it yet. I'll do
some testing and report any issues found. Some points you bring up in the
bug are quite valid. Thanks for working on this! =]

Ultima

On Thu, Sep 29, 2016 at 8:25 AM, Ganael LAPLANCHE <marty...@freebsd.org>
wrote:

> On Wed, 28 Sep 2016 19:00:42 -0400, Ultima wrote
>
> Hi Ultima,
>
> > I really think the force (-f) flag needs a bump in power (for both
> > replace and labelclear) [...].
>
> In case you are interested, I have posted a patch to improve the 'zpool
> labelclear' command here :
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204622
>
> - force ('-f') will now allow to erase labels, even if they are broken
> (restoring the behaviour of our previous -FreeBSD- version of labelclear)
> - You can select which label you want to erase (beginning, end or
> specific index)
> - You can use a 'minimal' mode which will invalidate a label by changing
> only a single byte. This is useful to minimize the chances to
> overwrite/destroy a FS that would have been created over the label, but
> still leaving the label visible.
>
> Regards,
>
> --
> Ganael LAPLANCHE <ganael.laplan...@martymac.org>
> http://www.martymac.org | http://contribs.martymac.org
> FreeBSD: martymac <marty...@freebsd.org>, http://www.FreeBSD.org
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>


Re: zpool (online|replace|labelclear) issues, -f option also failing

2016-09-28 Thread Ultima
> Hi,

> As a start you can use these in /boot/loader.conf to prevent the
> confusion about gptid or disk_ident. I disabled gptid at my computer. But
> if I understand you would like to disable disk_ident. For ZFS it should
> not matter what you use.

> $ sysctl kern.geom.label
> kern.geom.label.disk_ident.enable: 1
> kern.geom.label.gptid.enable: 0
> kern.geom.label.gpt.enable: 1
> kern.geom.label.ufs.enable: 1
> kern.geom.label.ufsid.enable: 1
> kern.geom.label.reiserfs.enable: 1
> kern.geom.label.ntfs.enable: 1
> kern.geom.label.msdosfs.enable: 1
> kern.geom.label.iso9660.enable: 1
> kern.geom.label.ext2fs.enable: 1
> kern.geom.label.debug: 0

Thanks for that, this would probably work, but I don't understand why it
would change in the first place. I know that when it occurred the drive was
offline, and I think it came back online when the system was rebooted. I'm
not positive though. My guess is the scan found it by diskid before gptid, but
then why is gptid first for the others? I'm just going to replace the drive
with itself by gptid because I've already wiped some data with dd (even
though a scrub would probably be good enough).

> Further. Does ZFS see 14989197580381994958 and
> gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6 as the same disk? Zpool replace
> also has an option to replace the disk 'with itself'. Just provide it one
> parameter like this:
> # zpool replace tank 14989197580381994958
> or
> # zpool replace tank gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6
> Does that help?

I actually didn't realize this. However the same error persists.

# zpool replace tank gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6
invalid vdev specification
the following errors must be manually repaired:
/dev/gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6 is part of active pool 'tank'

# zpool replace -f tank /dev/gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6
invalid vdev specification
the following errors must be manually repaired:
/dev/gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6 is part of active pool 'tank'

> Oh, while I read your mail again. You have 2 GB swap configured on the
> disk so wiping 2MB at the start of the disk does not wipe the freebsd-zfs
> metadata of the da14p2 partition. Try wiping 3GB from the start and end of
> the disk and repartition it.


Thanks for pointing this out! It would probably help if the correct area on
the disk is wiped. Although it still seems that labelclear isn't up to the
task. I really think the force (-f) flag needs a bump in power (for both
replace and labelclear). Am I misunderstanding the use of the labelclear
command? Isn't it meant to clear the label that zdb shows, in circumstances
like the ones I'm encountering?

# zpool labelclear -f gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6
/dev/gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6 is a member (ACTIVE) of pool "tank"

Apologies, I failed to mention labelclear in my original post. It gives
output similar to the replace command.

Since the device is offline from the pool, is it correct for it to be shown
as an (ACTIVE) member of the pool? After wiping the correct area on
the disk via dd, the replace successfully added the drive back to the pool!
Thanks for pointing out my error.
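
For the archives, the "correct area" is relative to the freebsd-zfs partition
rather than the whole disk: ZFS keeps two copies of its label in the first
512 KiB of the vdev and two more near its end. The sketch below only prints dd
commands covering the first and last 1 MiB of a provider (the extra margin is
there because the trailing labels are aligned down, so they do not sit exactly
at the last byte). label_wipe_cmds is a made-up helper, 512-byte sectors are
assumed, and the size comes from the gpart output in my first mail.

```shell
# Prints (does not run) dd commands that zero the first and last 1 MiB of a
# provider. $1: provider name, $2: provider size in 512-byte sectors.
label_wipe_cmds() {
    dev=$1
    sectors=$2
    mib=2048    # 1 MiB expressed in 512-byte sectors
    echo "dd if=/dev/zero of=/dev/$dev bs=512 count=$mib"
    echo "dd if=/dev/zero of=/dev/$dev bs=512 count=$mib seek=$((sectors - mib))"
}

# Example with the partition from this thread:
#   label_wipe_cmds da14p2 7809842784
```

These commands are destructive, so they only make sense on a device that is
already offline and about to be resilvered from scratch.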

Thanks for taking a look at this Ronald and Allan!

Ultima


zpool (online|replace|labelclear) issues, -f option also failing

2016-09-27 Thread Ultima
Hello,

 I am currently trying to replace a disk that was offlined and getting the
following error:

# zpool replace tank 14989197580381994958 gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6
invalid vdev specification
use '-f' to override the following errors:
/dev/gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6 is part of active pool 'tank'

# zpool replace -f tank 14989197580381994958 gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6
invalid vdev specification
the following errors must be manually repaired:
/dev/gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6 is part of active pool 'tank'

# zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 1.10T in 9h4m with 0 errors on Tue Sep 20 00:33:32 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            DEGRADED     0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/8bdbd180-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8c4df91d-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8ccf21a3-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8d5521cb-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8de13b47-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8e842f92-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
          raidz2-1                                      DEGRADED     0     0     0
            gptid/8bba4a82-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8c26d491-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8ca3fea6-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            14989197580381994958                        OFFLINE      0     0     0  was /dev/diskid/DISK-p2
            gptid/8db26351-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8e4bfa70-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
          raidz2-2                                      ONLINE       0     0     0
            gptid/8b957b47-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8c0340da-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8c77ddcb-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8cf6b7f1-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8d84b31e-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8e146dad-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
          raidz2-3                                      ONLINE       0     0     0
            gptid/8ebb39df-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8ef49770-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/2f94035d-7e9f-11e6-abe9-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8f69cf08-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8fa7c0a6-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
            gptid/8fe7816d-f52a-11e5-90c5-fcaa14edc6a6  ONLINE       0     0     0
        logs
          gptid/683dc146-f531-11e5-90c5-fcaa14edc6a6    ONLINE       0     0     0

errors: No known data errors

# glabel status | grep da14
gptid/24a57a9b-84f0-11e6-bbbc-fcaa14edc6a6 N/A  da14p1
gptid/31be0527-84f0-11e6-bbbc-fcaa14edc6a6 N/A  da14p2
  diskid/DISK- N/A  da14

# gpart show da13 da14
=>        40  7814037088  da13  GPT  (3.6T)
          40     4194304     1  freebsd-swap  (2.0G)
     4194344  7809842784     2  freebsd-zfs  (3.6T)

=>        40  7814037088  da14  GPT  (3.6T)
          40     4194304     1  freebsd-swap  (2.0G)
     4194344  7809842784     2  freebsd-zfs  (3.6T)

# uname -a
FreeBSD S1 12.0-CURRENT FreeBSD 12.0-CURRENT #4 r306300: Sat Sep 24
14:24:23 EDT 2016
root@S1:/usr/src/head/obj/usr/src/head/src/sys/MYKERNEL-NODEBUG
 amd64

I recently offlined the device and after onlining it the label changed to
geom. After a few reboots the pool started importing by diskid. Attempts
to offline/online it by gptid continued to fail with an
error. I decided to try replacing it, and that is also failing with the error
above. I also wiped the first & last 2MB of the disk without success. Is
there a known issue or perhaps I'm missing something obvious? zpool
labelclear is also giving a similar error. The -f options are not
helping.


Any ideas what my issue may be? The error suggests it is currently active in
the pool; however, the offline should have changed that status, correct?

Ultima


Need help debugging rc script

2016-09-01 Thread Ultima
 Hello, I'm dealing with an odd issue with an rc script that worked pre-11. The
change that is causing the prog to fail seems to be the new limits feature,
${name}_login_class. After commenting out the limits command in rc.subr, the
script starts as expected. Adding limits directly in the rc script with the
daemon class will also start the script as expected (with limits commented
out in rc.subr). For some reason that I cannot fathom it will just not
start correctly with limits in rc.subr.

 So as the topic suggests, is there a better way to debug this? I'll also
provide the scripts used in case someone may see something I've missed. The
jail is running on current, however this exists in 11 as well.
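
One low-effort thing to try is turning on rc debugging and overriding the
login class for this one service from rc.conf. This is a diagnostic sketch, not
a fix: rc_debug and the per-service login class variable are standard rc
knobs, while picking "default" as the alternative class is just my guess at
something worth comparing against "daemon".

```shell
# /etc/rc.conf additions for debugging (a sketch, not a fix)
rc_debug="YES"                     # make rc scripts print debug messages
teamspeak_login_class="default"    # assumption: compare against "daemon"
```

With that in place, running the script as sh -x /usr/local/etc/rc.d/teamspeak
start shows the exact command line rc.subr builds, and limits -C daemon prints
the limits that the daemon class would apply.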

12.0-CURRENT FreeBSD 12.0-CURRENT #21 r304105
audio/teamspeak3-server

# cat /usr/local/etc/rc.d/teamspeak
#!/bin/sh

# $FreeBSD$
#
# PROVIDE: teamspeak
# REQUIRE: LOGIN
# KEYWORD: shutdown
#
# Add the following lines to /etc/rc.conf.local or /etc/rc.conf
# to enable this service:
#
# teamspeak_enable (bool):   Set to NO by default.
#   Set it to YES to enable teamspeak server.
#

. /etc/rc.subr

name="teamspeak"
rcvar=teamspeak_enable

db_dir=/var/db/teamspeak
log_dir=/var/log/teamspeak

pidfile=/var/db/teamspeak/teamspeak_server.pid
procname=/usr/local/libexec/ts3server
command=/usr/sbin/daemon
command_args="-fp $pidfile -u teamspeak /usr/local/libexec/ts3server \
dbsqlpath=/usr/local/share/teamspeak/server/sql/ \
inifile=/usr/local/etc/teamspeak/ts3server.ini \
licensepath=/usr/local/etc/teamspeak/ logpath=$log_dir"
teamspeak_chdir=$db_dir
required_dirs="$db_dir $log_dir"

load_rc_config $name

: ${teamspeak_enable="NO"}

LD_LIBRARY_PATH=/usr/local/lib/teamspeak/server:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH

run_rc_command "$1"



This script causes an error:
2016-09-01 16:53:09.510292|ERROR   |DatabaseQuery |   |db_connect() failed unable to open database file
2016-09-01 16:53:09.510352|CRITICAL|ServerLibPriv |   |Server() DatabaseError out of memory


With line 1075 in /etc/rc.subr commented out, the program starts successfully.

Keeping line 1075 commented out, I changed the rc script to this:

# cat /usr/local/etc/rc.d/teamspeak
#!/bin/sh

# $FreeBSD$
#
# PROVIDE: teamspeak
# REQUIRE: LOGIN
# KEYWORD: shutdown
#
# Add the following lines to /etc/rc.conf.local or /etc/rc.conf
# to enable this service:
#
# teamspeak_enable (bool):   Set to NO by default.
#   Set it to YES to enable teamspeak server.
#

. /etc/rc.subr

name="teamspeak"
rcvar=teamspeak_enable

db_dir=/var/db/teamspeak
log_dir=/var/log/teamspeak

pidfile=/var/db/teamspeak/teamspeak_server.pid
procname=/usr/local/libexec/ts3server
command=/usr/bin/limits
command_args="-C daemon /usr/sbin/daemon -fp $pidfile -u teamspeak \
/usr/local/libexec/ts3server \
dbsqlpath=/usr/local/share/teamspeak/server/sql/ \
inifile=/usr/local/etc/teamspeak/ts3server.ini \
licensepath=/usr/local/etc/teamspeak/ logpath=$log_dir"
teamspeak_chdir=$db_dir
required_dirs="$db_dir $log_dir"

load_rc_config $name

: ${teamspeak_enable="NO"}

LD_LIBRARY_PATH=/usr/local/lib/teamspeak/server:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH

run_rc_command "$1"


This also starts successfully.


Re: Possible zpool online, resilvering issue

2016-08-10 Thread Ultima
95 Hardware_ECC_Recovered   0x001a   053   011   000    Old_age   Always       -       20189561
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     17423         -
# 2  Short offline       Completed without error       00%     17412         -
# 3  Short offline       Completed without error       00%     17340         -
# 4  Short offline       Completed without error       00%     17293         -
# 5  Extended offline    Completed without error       00%     17261         -
# 6  Short offline       Completed without error       00%     17245         -
# 7  Short offline       Completed without error       00%     17173         -
# 8  Short offline       Completed without error       00%     17125         -
# 9  Extended offline    Completed without error       00%     17101         -
#10  Short offline       Completed without error       00%     17084         -
#11  Short offline       Completed without error       00%     17012         -
#12  Short offline       Completed without error       00%     16964         -
#13  Extended offline    Completed without error       00%     16927         -
#14  Short offline       Completed without error       00%     16916         -
#15  Short offline       Completed without error       00%     16916         -
#16  Short offline       Completed without error       00%     16844         -
#17  Short offline       Completed without error       00%     16805         -
#18  Extended offline    Completed without error       00%     16775         -
#19  Short offline       Completed without error       00%     16757         -
#20  Short offline       Completed without error       00%     16685         -
#21  Short offline       Completed without error       00%     16637         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

On Wed, Aug 10, 2016 at 2:56 PM, olli hauer <oha...@gmx.de> wrote:

> On 2016-08-04 07:22, Ultima wrote:
> > Hello,
> >
> > I recently had some issue with a PSU and ran several scrubs on a pool
> with
> > around 35T. Random drives would drop and require a zpool online, this
> found
> > checksum errors. (as expected) However, after all the scrubs I ran, I
> think
> > I may have found a bug with zpool online resilvering process.
> >
> > 24 disks total, 4 vdevs raidz2 (6 drives each).
> >
> > Before this next part... I had a backup PSU, however it was also going
> bad
> > and waiting for RMA. The current one seemed to be dieing but ran fine
> with
> > less drives. So I decided I would run the server short 4 drives.
> >
> > Started by offline(or already removed from psu) 4 drives from different
> > vdevs, then ran a scrub to verify everything. Many sum errors were
> present
> > on some of the drives, but this was expected due to faulty psu. Then
> > offlined 4 different drives and onlined the other 4 and scrubbed once
> > again. After resilver, again, many sum errors on these drives as
> expected.
> >
> > After the scrub completed, I decided to offline 4 different drives, then
> > online the ones that were out of pool for awhile. During the resilver,
> > checksum errors were once again found. I was surprised due to the recent
> > scrub, So I decided to run another scrub, and it found even more checksum
> > errors on these recently onlined drives. I didn't think much about it,
> > however after the replacement PSU arrived, I onlined all the drives out
> of
> > pool and again, resilver had checksum errors as well as another scrub
> with
> > more sum errors.
> >
> > Is this issue known? Is it common for a scrub to be required after
> onlining
> > a disk that was out of pool for some time?
> >
> > The drives are ST4000NM0033, and until recent have never had a single
> > checksum error in they're lifetime.(at least with zfs)
> > FreeBSD S1 12.0-CURRENT FreeBSD 12.0-CURRENT #19 r303224: Sat Jul 23
> > 10:41:12 EDT 2016
> > root@S1:/usr/src/head/obj/usr/src/head/src/sys/MYKERNEL-NODEBUG
> >  amd64
> >
> >
> > Sorry for the wall of text, but I hop

Re: Possible zpool online, resilvering issue

2016-08-10 Thread Ultima
Hello,

> I didn't see any reply on the list, so I thought I might let you know

Sorry, never received this reply (till now) xD

>what I assume is happening:

> ZFS never updates data in place, which affects inode updates, e.g. if
> a file has been read and access times must be updated. (For that reason,
> many ZFS file systems are configured to ignore access time updates).

> Even if there were only R/O accesses to files in the pool, there will
> have been updates to the inodes, which were missed by the offlined
> drives (unless you ignore atime updates).

> But even if there are no access time updates, ZFS might have written
> new uberblocks and other meta information. Check the POOL history and
> see if there were any TXGs created during the scrub.

> If you scrub the pooll while it is off-line, it should stay stable
> (but if any information about the scrub, the offlining of drives etc.
> is recorded in the pool's history log, differences are to be expected).

> Just my $.02 ...

> Regards, STefan

Thanks for the reply. I'm not completely sure what would be considered a
TXG. I maintained normal operations during most of this noise, and this pool
has quite a bit of activity during normal operations. The last scrub shows
that it repaired 9.48G. Was that all for these access time updates? I guess
that would be a little less than 2.5G per disk.

The zpool history looks like it goes on forever (733373 lines). Much of this
pool's activity comes from poudriere. All the entries I see are clone,
destroy, rollback and snapshot operations. I can't really say how many, but
there are at least 500 (probably many more) entries between the last two
scrubs. Atime is off on all datasets.

 So to be clear, this is expected behavior with atime=off + TXGs during
offline time? I had thought that the resilver after onlining the disk would
bring that disk up-to-date with the pool. I guess my understanding was a
bit off.
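
As a quick sanity check on the per-disk estimate above, the repaired amount can be divided across the re-onlined drives. This is only a sketch: it assumes the 9.48G was spread evenly over the four drives that had been offline, which the scrub summary does not confirm.

```python
# Rough per-disk share of the data repaired by the last scrub.
# Assumption: the repair work was spread evenly over the 4 re-onlined drives.
repaired_g = 9.48        # "repaired 9.48G" from the scrub summary
onlined_drives = 4       # drives that had been out of the pool

per_disk_g = repaired_g / onlined_drives
print(f"{per_disk_g:.2f}G per disk")
```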

Ultima
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


rc scripts new login_class, default can break old rc scripts

2016-08-06 Thread Ultima
 I recently upgraded one of my boxes to FreeBSD 11 r303750 (beta-3). After
the upgrade I noticed that one of the services would no longer start...

 After digging into it, I found that the new ${name}_login_class variable
defaults to the daemon login class, and by default the daemon class
resource limit on memory is set to 128M. This may be an issue for old rc
scripts.

So my question is this: should old rc scripts adapt to this new default, or
should the default be changed to avoid issues like the one I just found? The
port where this issue was found is audio/teamspeak3-server. If installed on
FreeBSD 11+ it will fail to start with:

2016-08-06 17:07:27.946432|CRITICAL|ServerLibPriv |   |Server() DatabaseError out of memory

Ultima


Possible zpool online, resilvering issue

2016-08-03 Thread Ultima
Hello,

I recently had some issues with a PSU and ran several scrubs on a pool with
around 35T. Random drives would drop and require a zpool online, which found
checksum errors (as expected). However, after all the scrubs I ran, I think
I may have found a bug in the zpool online resilvering process.

24 disks total, 4 vdevs raidz2 (6 drives each).

Before this next part... I had a backup PSU, however it was also going bad
and was waiting for RMA. The current one seemed to be dying but ran fine with
fewer drives. So I decided I would run the server short 4 drives.

I started by offlining 4 drives (or they had already been removed from the
PSU) from different vdevs, then ran a scrub to verify everything. Many
checksum errors were present on some of the drives, but this was expected due
to the faulty PSU. Then I offlined 4 different drives, onlined the other 4 and
scrubbed once
again. After resilver, again, many sum errors on these drives as expected.
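
To make the risk of running short 4 drives concrete, here is a minimal sketch of the redundancy left in each vdev. It assumes the 4 offlined drives were exactly one per vdev ("from different vdevs"); raidz2 tolerating two missing drives per vdev is standard ZFS behaviour.

```python
# Remaining fault tolerance per vdev for the pool layout described above.
# raidz2 keeps two parity blocks per stripe, so a vdev survives two missing drives.
RAIDZ2_PARITY = 2

vdevs = 4
drives_per_vdev = 6
offline_per_vdev = 1   # assumption: the 4 offlined drives were one per vdev

total_drives = vdevs * drives_per_vdev
remaining_tolerance = RAIDZ2_PARITY - offline_per_vdev

print(f"{total_drives} drives, {vdevs * offline_per_vdev} offline")
print(f"each vdev can still lose {remaining_tolerance} more drive(s)")
```

So the pool stays importable in this state, but a single further drive dropping in any vdev (likely with a failing PSU) would leave that vdev with no redundancy at all.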

After the scrub completed, I decided to offline 4 different drives, then
online the ones that had been out of the pool for a while. During the
resilver, checksum errors were once again found. I was surprised given the
recent scrub, so I decided to run another scrub, and it found even more
checksum errors on these recently onlined drives. I didn't think much about
it. However, after the replacement PSU arrived, I onlined all the drives that
were out of the pool and, again, the resilver had checksum errors, and
another scrub found more checksum errors.

Is this issue known? Is it common for a scrub to be required after onlining
a disk that was out of pool for some time?

The drives are ST4000NM0033, and until recently they have never had a single
checksum error in their lifetime (at least with ZFS).

FreeBSD S1 12.0-CURRENT FreeBSD 12.0-CURRENT #19 r303224: Sat Jul 23 10:41:12 EDT 2016 root@S1:/usr/src/head/obj/usr/src/head/src/sys/MYKERNEL-NODEBUG amd64


Sorry for the wall of text, but I hope this helps in tracking down this
possible bug.

Ultima


Re: mfi driver performance too bad on LSI MegaRAID SAS 9260-8i

2016-08-01 Thread Ultima
If anyone is interested: as Michelle Sullivan just mentioned, one problem I
found when looking for an HBA is that they are not so easy to find. While
scouring the internet for a backup HBA, I came across these:
http://www.avagotech.com/products/server-storage/host-bus-adapters/#tab-12Gb1

I can only speak for the SAS 9305-24i. All 24 bays are occupied and I am quite
pleased with the performance compared to its predecessor. It was originally
going to be a backup unit, however that changed after running a scrub and
seeing the hours to complete cut in half (from around 30 to 15 for 35T). And
of course, the reason for this post: it replaced a RAID card in passthrough
mode.
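
To put the scrub times in perspective, they can be turned into an average throughput figure. This is a rough sketch only: it assumes "35T" means 35 TiB actually scanned and takes the 30 and 15 hour figures at face value.

```python
# Approximate average scrub throughput before and after the HBA swap.
# Assumptions: "35T" means 35 TiB scanned, and the scrub times were
# roughly 30 and 15 hours as estimated in the message.
TIB_IN_MIB = 1024 * 1024

def scrub_rate_mib_s(tib_scanned: float, hours: float) -> float:
    """Average throughput in MiB/s for a scrub of the given size and duration."""
    return tib_scanned * TIB_IN_MIB / (hours * 3600)

old_rate = scrub_rate_mib_s(35, 30)   # RAID card in passthrough mode
new_rate = scrub_rate_mib_s(35, 15)   # SAS 9305-24i HBA

print(f"before: {old_rate:.0f} MiB/s, after: {new_rate:.0f} MiB/s")
```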

Another note: because it is an HBA, flashing firmware is once again
possible! (yay!)

+1 to HBAs + ZFS. If possible, replace the RAID card with an HBA.

On Mon, Aug 1, 2016 at 1:30 PM, Michelle Sullivan 
wrote:

> Borja Marcos wrote:
>
>> On 01 Aug 2016, at 15:12, O. Hartmann 
>>> wrote:
>>>
>>> First, thanks for responding so quickly.
>>>
>>> - The third option is to make the driver expose the SAS devices like a
>>> HBA would do, so that they are visible to the CAM layer, and disks are
>>> handled by the stock “da” driver, which is the ideal solution.

>>> I didn't find any switch which offers me the opportunity to put the PRAID
>>> CP400i into a simple HBA mode.
>>>
>> The switch is in the FreeBSD mfi driver, the loader tunable I mentioned,
>> regardless of what the card
>> firmware does or pretends to do.
>>
>> It’s not visible doing a "sysctl -a”, but it exists and it’s unique even.
>> It’s defined here:
>>
>>
>> https://svnweb.freebsd.org/base/stable/10/sys/dev/mfi/mfi_cam.c?revision=267084=markup
>> (line 93)
>>
>> In order to do it you need a couple of things. You need to set the
>> variable hw.mfi.allow_cam_disk_passthrough=1 and to load the mfip.ko
>> module.
>>
>> When booting installation media, enter command mode and use these
>> commands:
>>
>> -----
>> set hw.mfi.allow_cam_disk_passthrough=1
>> load mfip
>> boot
>> -----

>>> Well, I'm truly aware of this problemacy and solution (now), but I run
>>> into a
>>> henn-egg-problem, literally. As long as I can boot off of the
>>> installation
>>> medium, I have a kernel which deals with the setting. But the boot
>>> medium is
>>> supposed to be a SSD sitting with the PRAID CP400i controller itself!
>>> So, I
>>> never be able to boot off the system without crippling the ability to
>>> have a
>>> fullspeed ZFS configuration which I suppose to have with HBA mode, but
>>> not
>>> with any of the forced RAID modes offered by the controller.
>>>
>> Been there plenty of times, even argued quite strongly about the
>> advantages of ZFS against hardware based RAID
>> 5 cards. :) I remember when the Dell salesmen couldn’t possibly
>> understand why I wanted a “software based RAID rather than a
>> robust, hardware based solution” :D
>>
>
> There are reasons for using either...
>
> Nowadays its seems the conversations have degenerated into those like
> Windows vs Linux vs Mac where everyone thinks their answer is the right one
> (just as you suggested you (Borja Marcos) did with the Dell salesman),
> where in reality each has its own advantages and disadvantages.  Eg: I'm
> running 2 zfs servers on 'LSI 9260-16i's... big mistake! (the ZFS, not
> LSI's)... one is a 'movie server' the other a 'postgresql database'
> server...  The latter most would agree is a bad use of zfs, the die-hards
> won't but then they don't understand database servers and how they work on
> disk.  The former has mixed views, some argue that zfs is the only way to
> ensure the movies will always work, personally I think of all the years
> before zfs when my data on disk worked without failure until the disks
> themselves failed... and RAID stopped that happening...  what suddenly
> changed, are disks and ram suddenly not reliable at transferring data? ..
> anyhow back to the issue there is another part with this particular
> hardware that people just throw away...
>
> The LSI 9260-* controllers have been designed to provide on hardware
> RAID.  The caching whether using the Cachecade SSD or just oneboard ECC
> memory is *ONLY* used when running some sort of RAID set and LVs... this is
> why LSI recommend 'MegaCli -CfgEachDskRaid0' because it does enable
> caching..  A good read on how to setup something similar is here:
> https://calomel.org/megacli_lsi_commands.html (disclaimer, I haven't
> parsed it all so the author could be clueless, but it seems to give
> generally good advice.)  Going the way of 'JBOD' is a bad thing to do, just
> don't, performance sucks. As for the recommended command above, can't
> comment because currently I don't use it nor will I need to in the near
> future... but...
>
> If you (O Hartmann) want to use or need to use ZFS with any OS including
> FreeBSD don't go with the LSI 92xx series controllers, its just the wrong
> 

Re: Boot environments and zfs canmount=noauto

2016-07-28 Thread Ultima
Is this actually required? Instead of the extra steps, how about just
moving/creating each dataset into each BE (or at least the ones desired)
and setting them to mountpoint=inherit. With zfs_enable="YES" this should
mount the dataset appropriately. Only the root BE should need
canmount=noauto because if the parent dataset is not mounted the child will
inherently not mount.

Is this not sufficient?


Re: sr-iov's virtual function driver shipped broken?

2016-07-12 Thread Ultima
Ah, well, I posted on the mailing list during Q1 about the issue. I also
asked whether it was planned to be fixed for release during Q2, and the
response was that there are issues with the current implementation and it is
being worked on, so I never created a bug for it.

https://lists.freebsd.org/pipermail/freebsd-current/2016-March/060350.html

Just opened PR211062 for it.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211062

On Tue, Jul 12, 2016 at 2:24 PM, Ngie Cooper <yaneurab...@gmail.com> wrote:

>
> > On Jul 12, 2016, at 11:21, Ultima <ultima1...@gmail.com> wrote:
> >
> > I'v mentioned this in the past, but I just want to verify. Will 11 be
> > released with the virtual function driver unusable? Currently iovctl will
> > only work in pass-through mode.
>
> Hi,
> Is there a bug open for this issue with a repro/more details?
> Thanks,
> -Ngie
>


sr-iov's virtual function driver shipped broken?

2016-07-12 Thread Ultima
I've mentioned this in the past, but I just want to verify. Will 11 be
released with the virtual function driver unusable? Currently iovctl will
only work in pass-through mode.


Re: NO_INSTALLEXTRAKERNELS and PkgBase

2016-05-09 Thread Ultima
If multiple kernels are being installed like this, eg KERNCONF="FOO BAR",
which of the two would be default during boot? FOO because it came first?

On Mon, May 9, 2016 at 2:05 PM, John Baldwin  wrote:

> On Saturday, May 07, 2016 06:50:05 AM David Wolfskill wrote:
> > [Recipient list trimmed a bit -- dhw]
> >
> > I'm speaking up here because IIRC, I whined to Gleb at what I perceived
> > to be a POLA violation a while back
> >
> > On Sat, May 07, 2016 at 09:59:06AM +0200, Ben Woods wrote:
> > > On 7 May 2016 at 09:48, Ngie Cooper (yaneurabeya) <
> yaneurab...@gmail.com>
> > > wrote:
> > >
> > > > glebius changed the defaults to fix POLA, but the naming per the
> behavior
> > > > is confusing. Right now the behavior between ^/head and ^/stable/10
> > > > before/now match -- I just had to wrap my mind around the default
> being the
> > > > affirmative of a negative (i.e. only install one kernel, as opposed
> to
> > > > install all extra kernels by default).
> > > > -Ngie
> > >
> > >
> > > Indeed, I am not sure I understand the POLA violation entirely
> (ignoring
> > > the fact that this variable requires affirmation of a negative).
> > >
> > > If you list 2 kernels in the KERNCONF variable, why is it astonishing
> that
> > > 2 kernels get installed? Even if the old behaviour was to only install
> 1
> > > kernel, if you are listing 2 kernels in KERNCONF presumably that is
> because
> > > you want to install 2 kernels?
> >
> > Errr... no: I don't.  At least, not on the machine where I built them.
>
> Then don't pass them to 'installkernel'?  That is, I think this makes sense
> if you want to build N kernels but only install 1:
>
> make buildkernel KERNCONF="FOO BAR BAZ"
>
> # only install the FOO kernel
>
> make installkernel KERNCONF="FOO"
>
> And then if you want to install multiple:
>
> # install both FOO and BAR kernels
>
> make installkernel KERNCONF="FOO BAR"
>
> The runaround seems to be whether this last case now should require
> multiple
> explicit installkernel invocations which I find inconsistent since the
> build
> stage doesn't.  I would fully expect 'installkernel' to install all of the
> kernels listed in KERNCONF and would assume that it is up to the invoker to
> choose KERNCONF appropriately.
>
> --
> John Baldwin
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>


Re: ixv device_attach: ixv0 attach returned 5 (was sr-iov issues, reset_hw() failed with error -100)

2016-04-28 Thread Ultima
Jeb,

Okay, just wanted to make sure it was known. =]

Ultima

On Thu, Apr 28, 2016 at 4:36 PM, cramerj <cram...@intel.com> wrote:

> Yeah, we're running into some issues with our implementation.  I'm working
> on the VF driver now and hope to get some of these bugs fixed within the
> next couple weeks.
>
> Thanks,
> -Jeb
>
> -Original Message-
> From: owner-freebsd-curr...@freebsd.org [mailto:
> owner-freebsd-curr...@freebsd.org] On Behalf Of Ultima
> Sent: Sunday, April 24, 2016 8:32 PM
> Cc: freebsd-current@freebsd.org; freebsd-virtualizat...@freebsd.org;
> freebsd-hardw...@freebsd.org
> Subject: ixv device_attach: ixv0 attach returned 5 (was sr-iov issues,
> reset_hw() failed with error -100)
>
>  The sr-iov vf driver is failing to attach.
>
>
> # pciconf -lv: (filtered to only relevant output)
> ix0@pci0:129:0:0: class=0x02 card=0x1458 chip=0x15288086 rev=0x01
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Ethernet Controller 10-Gigabit X540-AT2'
> class  = network
> subclass   = ethernet
> ix1@pci0:129:0:1: class=0x02 card=0x1458 chip=0x15288086 rev=0x01
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Ethernet Controller 10-Gigabit X540-AT2'
> class  = network
> subclass   = ethernet
> none155@pci0:129:0:129: class=0x02 card=0x1458 chip=0x15158086
> rev=0x01 hdr=0x00
> vendor = 'Intel Corporation'
> device = 'X540 Ethernet Controller Virtual Function'
> class  = network
> subclass   = ethernet
> none156@pci0:129:0:131: class=0x02 card=0x1458 chip=0x15158086
> rev=0x01 hdr=0x00
> vendor = 'Intel Corporation'
> device = 'X540 Ethernet Controller Virtual Function'
> class  = network
> subclass   = ethernet
>
> # devctl attach pci0:129:0:129
> devctl: Failed to attach pci0:129:0:129: Input/output error
>
> # dmesg
> ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver, Version -
> 1.4.6-k> at device 0.129 numa-domain 1 on pci12
> ixv0: Using MSIX interrupts with 2 vectors
> ixv0: ixgbe_reset_hw() failed with error -100
> device_attach: ixv0 attach returned 5
>
> # cat /etc/iovctl.conf
> PF {
> device : ix1;
> num_vfs : 31;
> }
>
> DEFAULT {
> passthrough : true;
> }
> VF-0 {
> passthrough : false;
> }
> VF-1 {
> passthrough : false;
> }
>
>
> Any ideas?
>
> Ultima
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>


ixv device_attach: ixv0 attach returned 5 (was sr-iov issues, reset_hw() failed with error -100)

2016-04-24 Thread Ultima
 The sr-iov vf driver is failing to attach.


# pciconf -lv: (filtered to only relevant output)
ix0@pci0:129:0:0: class=0x02 card=0x1458 chip=0x15288086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller 10-Gigabit X540-AT2'
    class      = network
    subclass   = ethernet
ix1@pci0:129:0:1: class=0x02 card=0x1458 chip=0x15288086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller 10-Gigabit X540-AT2'
    class      = network
    subclass   = ethernet
none155@pci0:129:0:129: class=0x02 card=0x1458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet
none156@pci0:129:0:131: class=0x02 card=0x1458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet

# devctl attach pci0:129:0:129
devctl: Failed to attach pci0:129:0:129: Input/output error

# dmesg
ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver, Version - 1.4.6-k> at device 0.129 numa-domain 1 on pci12
ixv0: Using MSIX interrupts with 2 vectors
ixv0: ixgbe_reset_hw() failed with error -100
device_attach: ixv0 attach returned 5

# cat /etc/iovctl.conf
PF {
device : ix1;
num_vfs : 31;
}

DEFAULT {
passthrough : true;
}
VF-0 {
passthrough : false;
}
VF-1 {
passthrough : false;
}
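
For anyone wondering why the two non-passthrough VFs show up as functions 129 and 131, the sketch below applies the SR-IOV rule that a VF's routing ID is the PF's routing ID plus a first-VF offset plus a multiple of the VF stride. The offset (128) and stride (2) are assumptions inferred from the pciconf selectors in this message, not values read from the card, and `vf_selector` is a hypothetical helper written for illustration.

```python
# Where the VF selectors in the pciconf output come from.
# Per the SR-IOV capability, VF n lives at:
#   PF routing ID + First VF Offset + n * VF Stride   (n counted from 0 here)
# Assumption: offset 128 and stride 2 are inferred from the two selectors
# shown in this message (functions 129 and 131), not read from the device.
FIRST_VF_OFFSET = 128
VF_STRIDE = 2

def vf_selector(domain: int, bus: int, pf_function: int, vf_index: int) -> str:
    """Return a pciconf-style selector for VF number vf_index of the given PF."""
    # With ARI the low 8 bits of the routing ID are the function number
    # (the slot is 0), and the bits above them are the bus number.
    rid = (bus << 8) | pf_function
    rid += FIRST_VF_OFFSET + vf_index * VF_STRIDE
    return f"pci{domain}:{rid >> 8}:0:{rid & 0xFF}"

# ix1 is pci0:129:0:1; VF-0 and VF-1 are the two non-passthrough VFs.
for n in (0, 1):
    print(f"VF-{n}: {vf_selector(0, 129, 1, n)}")
```

Under those assumptions VF-0 maps to the selector passed to devctl attach above.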


Any ideas?

Ultima


Re: sr-iov issues, reset_hw() failed with error -100

2016-03-28 Thread Ultima
Recently upgraded to r297351. After running iovctl -C -f /etc/iovctl.conf,
everything appears to work as it should. The main network interface remains
connected and pinging works fine.

iovctl.conf
PF {
device : ix1;
num_vfs : 31;
}

DEFAULT {
passthrough : true;
}
VF-0 {
passthrough : false;
}
VF-1 {
passthrough : false;
}

Once the VFs have been initialized, I have found that the non-passthrough
VFs are unusable.

# devctl attach pci0:129:0:129
devctl: Failed to attach pci0:129:0:129: Input/output error

The dmesg spits out the same error as before, so it appears that the errors
I mentioned before are actually the non-passthrough VFs attempting to
attach and failing.

/var/log/messages:
ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver, Version -
1.4.6-k> at device 0.131 on pci12
ixv0: Using MSIX interrupts with 2 vectors
ixv0: ixgbe_reset_hw() failed with error -100
device_attach: ixv0 attach returned 5
ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver, Version -
1.4.6-k> at device 0.129 on pci12
ixv0: Using MSIX interrupts with 2 vectors
ixv0: ixgbe_reset_hw() failed with error -100
device_attach: ixv0 attach returned 5
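
As a side note, the "5" in "attach returned 5" is an errno value, and it lines up with the "Input/output error" that devctl prints. A minimal way to check the mapping, assuming the script runs on a system where errno 5 is EIO (true on FreeBSD and Linux):

```python
# Map the "attach returned 5" value from the log to its errno name and message.
import errno
import os

code = 5
print(errno.errorcode[code])   # symbolic name of errno 5
print(os.strerror(code))       # human-readable message for errno 5
```

So the two messages are the same failure reported at two layers; the more specific clue is the ixgbe_reset_hw() error -100 logged just before it.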

On Wed, Mar 9, 2016 at 5:36 PM, Ultima <ultima1...@gmail.com> wrote:

> I have been interested in this when I first read about it in 2012. =]
>
> Can this be done in loader.conf? I have vmm_load="YES"
>
> I'm not sure if the vf's are usable, I have not actually tested the vf's.
> The parent ix1 still shows no response.
>
> kldunload vmm
>
> kenv hw.vmm.force_iommu=1
>
> kldload vmm
>
> iovctl -Cf /etc/iovctl.conf
>
> The same error messages appear, I currently on hmm i'm not sure, I
> upgraded with git and it doesn't show rev(last time I use git for
> source?) 8b372d1(master)
>
> On Wed, Mar 9, 2016 at 5:00 PM, Eric Joyner <ricer...@gmail.com> wrote:
>
>> I don't know if you're still interested in this, but did you do "kenv
>> hw.vmm.force_iommu=1" before loading the vmm module? I think that might be
>> necessary.
>>
>> On Wed, Feb 24, 2016 at 5:12 PM Ultima <ultima1...@gmail.com> wrote:
>>
>>>  Yeah, still getting the -100 error, I do have sendmail disabled. I just
>>> tested with sendmail up and running then add the VF's and it still shows
>>> the error message.
>>>
>>> On Wed, Feb 24, 2016 at 8:04 PM, Eric Joyner <ricer...@gmail.com> wrote:
>>>
>>>> Are you still getting the -100 errors when trying to load the VF driver?
>>>>
>>>> I've tried SR-IOV on a system here, and I can confirm that traffic
>>>> stops passing on the PF interface when you create a VF interface. That
>>>> didn't used to happen, so I'm investigating why that is right now.
>>>>
>>>> On Wed, Feb 24, 2016 at 8:09 AM Ultima <ultima1...@gmail.com> wrote:
>>>>
>>>>>  Decided to do some more tests, I actually have a second board with
>>>>> sr-iov
>>>>> capabilities that I used for awhile with vmware esxi. I decided to test
>>>>> this out and unfortunately it won't activate, it is giving the no space
>>>>> left on device error message. I double checked bios and all VT-d
>>>>> related
>>>>> options are enabled and have hw.ix.num_queues="4" in
>>>>> /boot/loader.conf. Is
>>>>> there anything else that may need to be set? .(It did work on vmware)
>>>>>
>>>>>  For my second test, I moved the X540-AT1 to the board with the
>>>>> X540-AT2.
>>>>> It functioned with the same issues as the AT2 tho.
>>>>>
>>>>>
>>>>> I don't think I listed the motherboards in question yet so ill list
>>>>> them
>>>>> now.
>>>>>
>>>>> S1200BTLRM -
>>>>>
>>>>> http://ark.intel.com/products/69633/Intel-Server-Board-S1200BTLRM?q=S1200BTLRM
>>>>> MD80-TM0
>>>>> <http://ark.intel.com/products/69633/Intel-Server-Board-S1200BTLRM?q=S1200BTLRMMD80-TM0>
>>>>> - http://b2b.gigabyte.com/products/product-page.aspx?pid=5146#ov
>>>>>
>>>>> I'm not sure if it will be of any help tho.
>>>>>
>>>>> Ultima
>>>>> ___
>>>>> freebsd-current@freebsd.org mailing list
>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>>>>> To unsubscribe, send any mail to "
>>>>> freebsd-current-unsubscr...@freebsd.org"
>>>>>
>>>>
>>>
>


Re: CURRENT slow and shaky network stability

2016-03-26 Thread Ultima
 I am having this long timeout issue occur many times during the day. Normally
it is not this bad; I am currently on r297060 amd64. A few hours ago I had a
system hang that lasted for about 1-2 hours (not sure how long exactly, I gave
up waiting). It occurred after a zfs destroy operation and affected sshd (I
could no longer log in via ssh). The system was in a seemingly unusable state
during this time.

 A large zfs send was underway during the outage. This operation was not in
the same state, as the receiving system was still growing.

Not sure if it's related, but it sounds like it is. Is there any more
information that may be helpful?

On Sat, Mar 26, 2016 at 5:26 PM, Don Lewis  wrote:

> On 26 Mar, Michael Butler wrote:
> > -current is not great for interactive use at all. The strategy of
> > pre-emptively dropping idle processes to swap is hurting .. big time.
> >
> > Compare inactive memory to swap in this example ..
> >
> > 110 processes: 1 running, 108 sleeping, 1 zombie
> > CPU:  1.2% user,  0.0% nice,  4.3% system,  0.0% interrupt, 94.5% idle
> > Mem: 474M Active, 1609M Inact, 764M Wired, 281M Buf, 119M Free
> > Swap: 4096M Total, 917M Used, 3178M Free, 22% Inuse
> >
> >   PID USERNAME   THR PRI NICE   SIZERES STATE   C   TIMEWCPU
> > COMMAND
> >  1819 imb  1  280   213M 11284K select  1 147:44   5.97%
> > gkrellm
> > 59238 imb 43  200   980M   424M select  0  10:07   1.92%
> > firefox
> >
> >  .. it shouldn't start randomly swapping out processes because they're
> > used infrequently when there's more than enough RAM to spare ..
>
> I don't know what changed, and probably something can use some tweaking,
> but paging out idle processes isn't always the wrong thing to do.  For
> instance if I'm using poudriere to build a bunch of packages and its
> heavy use of tmpfs is pushing the machine into many GB of swap usage, I
> don't want interactive use like:
> vi foo.c
> cc foo.c
> vi foo.c
> to suffer because vi and cc have to be read in from a busy hard drive
> each time while unused console getty and idle sshd processes in a bunch
> of jails are still hanging on to memory even though they haven't
> executed any instructions since shortly after the machine was booted
> weeks ago.
>
> > It also shows up when trying to reboot .. on all of my gear, 90 seconds
> > of "fail-safe" time-out is no longer enough when a good proportion of
> > daemons have been dropped onto swap and must be brought back in to flush
> > their data segments :-(
>
> That's a different and known problem.  See:
> <
> https://svnweb.freebsd.org/base/releng/10.3/bin/csh/config_p.h?revision=297204=markup
> >
>
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>


Re: Current-11 slow

2016-03-19 Thread Ultima
Upgrade your source and reinstall, this issue was fixed sometime around
r296377.

On Sat, Mar 19, 2016 at 10:44 AM, Baho Utot 
wrote:

> Current-11 is slower than FreeBSD-10.2 on my dell lapfart
>
> I fetched and complied current-11 r296326 following the handbook.
> I used a custom kernel config, make.conf and src.conf trying to create a
> "released" install.
>
> Is there any issues with my configuration or do I need to add anything.
>
>
> FreeBSD dell.example.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r296326:
> Sat Mar  5 20:48:00 EST 2016 
> r...@dell.bildanet.com:/usr/obj/usr/src/sys/OCTOTHORPE
> amd64
>
> octothorpe.conf
> include GENERIC
> nocpu   i486_CPU
> ident   OCTOTHORPE
>
> nooptions   INET6
> nooptions   WITNESS
> nooptions   INVARIANTS
>
> nodevicefdc
>
> /etc/src.conf
> MALLOC_PRODUCTION="ON"
> WITHOUT_ASSERT_DEBUG="ON"
> MK_PROFILE=no
>
> /etc/make.conf
> MALLOC_PRODUCTION=TRUE
> KERNCONF=OCTOTHORPE
> MK_PROFILE=no
>
> Thanks
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>


Re: Significant performance hit on recent upgrade

2016-03-11 Thread Ultima
Just upgraded again to r296709. Running for about an hour, performance
appears to be restored. Will report back if this changes.

On Sat, Mar 12, 2016 at 1:11 AM, Ultima <ultima1...@gmail.com> wrote:

> Hmm, the total cost would be hard to calculate, not all of it was
> purchased at the same time.
>
> CPU - ( E5-2670v3 x 2 ) engineering equivalent, 12 cores 24 virtual -
> price was around 500$ each
>
> Ram - ( 32G x 8 ) kingsten KVR21R15D4K4 256G ram, not cheap around 2200$
>
> Storage - ( (4Tx6 raidz2) x 3 ) 18 drives total (12xst4000nm0033,
> 6xst4000nm0023) with 1 cache device DC S3700 (200GB), I bought all these
> over the course of a few years, so price is not accurate on this. They are
> enterprise drives, so it is prob around 3000$ total. And surprisingly I'v
> had about 5 failed drives over that time. =/
>
> The os drives are 2x128G ssd's mirrored samsung 840's or 850's(can't
> remember)
>
> Motherboard  - Gigabyte MD80-TM0, I am not happy with unfortunately, It
> has many nice features, but I feel like the firmware is a lacking. 500ish?
>
>
>  Overall it is a great box. I am definitely happy with it. I was having a
> little bit of issues with heat at first, I really had to clean up the wires
> for good air flow. Don't think I should have bought so much ram, zfs loves
> it, but just I don't really needed it. However, when everything is in arc,
> it is so fast. -j24 buildworld && buildkernel takes around 20 minutes for
> 10-STABLE and about 30 for head. This system should last me for a few
> years. =]
>
> On Fri, Mar 11, 2016 at 11:33 PM, Kurt Jaeger <p...@opsec.eu> wrote:
>
>> Hi!
>>
>> > last pid: 96327;  load averages: 41.78, 50.32, 50.07up
>> 1+05:19:23
>> >  18:52:47
>> > 169 processes: 11 running, 158 sleeping
>> > CPU: 24.4% user,  0.2% nice, 15.1% system,  0.1% interrupt, 60.3% idle
>> > Mem: 12G Active, 43G Inact, 186G Wired, 8867M Free
>> > ARC: 166G Total, 122G MFU, 38G MRU, 46M Anon, 753M Header, 5314M Other
>> > Swap:
>>
>> That's a nice box, impressive!
>>
>> How much RAM does the box have ? What CPU does it have ? How
>> much did it cost ?
>>
>> --
>> p...@opsec.eu+49 171 3101372 4 years
>> to go !
>>
>
>


Re: Significant performance hit on recent upgrade

2016-03-11 Thread Ultima
Should I restart and provide this info after 20min into build? Or is this
good info?

On Fri, Mar 11, 2016 at 6:56 PM, Ultima <ultima1...@gmail.com> wrote:

> last pid: 96327;  load averages: 41.78, 50.32, 50.07up 1+05:19:23
>  18:52:47
> 169 processes: 11 running, 158 sleeping
> CPU: 24.4% user,  0.2% nice, 15.1% system,  0.1% interrupt, 60.3% idle
> Mem: 12G Active, 43G Inact, 186G Wired, 8867M Free
> ARC: 166G Total, 122G MFU, 38G MRU, 46M Anon, 753M Header, 5314M Other
> Swap:
>
>
> This is currently with poudriere que of 4000ish packages, right now at
> only 116 pkg per hour with 12 jails. Has been going for around 11 hours
> now. Usually it would be done by now with around 350ish pkg per hour with
> other jail versions running, only running the one right now.
>
> On Fri, Mar 11, 2016 at 6:51 PM, Allan Jude <allanj...@freebsd.org> wrote:
>
>> On 03/11/2016 18:49, Ultima wrote:
>> > Hello,
>> >
>> >  Recently I upgraded to r296377, and notice a big hit in performance.
>> This
>> > system in question is a poudriere building system that usually has many
>> > jails running building around 200-400 pkg per hour for each jail. After
>> the
>> > upgrade, it is rare to see over 100 pkg per hour built. Possible
>> regression
>> > on recent commit?
>> >
>> > Previously was running (sorry had git src at the time) 8b372d1(master)
>> > built around a week ago. Any ideas what maybe causing this? I can
>> provide
>> > more details if needed, someone in irc also mentioned noticing a hit in
>> > performance.
>> >
>> >
>> > Ultima
>> >
>>
>> Are you using ZFS? What does your memory utilization look like?
>> Can you provide the 'Mem' and 'ARC' lines from top(1) while building
>> packages (ideally, a good 20+ minutes into such a build)?
>>
>> --
>> Allan Jude
>>
>
>


Re: Significant performance hit on recent upgrade

2016-03-11 Thread Ultima
last pid: 96327;  load averages: 41.78, 50.32, 50.07  up 1+05:19:23  18:52:47
169 processes: 11 running, 158 sleeping
CPU: 24.4% user,  0.2% nice, 15.1% system,  0.1% interrupt, 60.3% idle
Mem: 12G Active, 43G Inact, 186G Wired, 8867M Free
ARC: 166G Total, 122G MFU, 38G MRU, 46M Anon, 753M Header, 5314M Other
Swap:


This is with a poudriere queue of around 4000 packages, currently at only
116 pkg per hour with 12 jails. It has been going for around 11 hours now.
Usually it would be done by now, at around 350 pkg per hour, even with
other jail versions running; only the one is running right now.
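
For a rough sense of scale, the figures above can be turned into an
estimated total build time. This is only a back-of-the-envelope sketch: it
assumes a constant build rate, and eta_hours is a hypothetical helper name,
not part of poudriere.

```sh
#!/bin/sh
# Hypothetical helper: estimate the hours needed to build N packages at a
# constant rate of R packages per hour. Real builds do not run at a
# constant rate, so treat the result as a rough figure only.
eta_hours() {
    awk -v n="$1" -v r="$2" 'BEGIN { printf "%.1f\n", n / r }'
}

eta_hours 4000 116   # rate seen after the upgrade
eta_hours 4000 350   # typical rate before the upgrade
```

At the usual rate the queue takes a bit over 11 hours, which is why it
"would be done by now"; at 116 pkg per hour it is closer to 34 hours.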

On Fri, Mar 11, 2016 at 6:51 PM, Allan Jude <allanj...@freebsd.org> wrote:

> On 03/11/2016 18:49, Ultima wrote:
> > Hello,
> >
> >  Recently I upgraded to r296377, and notice a big hit in performance.
> This
> > system in question is a poudriere building system that usually has many
> > jails running building around 200-400 pkg per hour for each jail. After
> the
> > upgrade, it is rare to see over 100 pkg per hour built. Possible
> regression
> > on recent commit?
> >
> > Previously was running (sorry had git src at the time) 8b372d1(master)
> > built around a week ago. Any ideas what maybe causing this? I can provide
> > more details if needed, someone in irc also mentioned noticing a hit in
> > performance.
> >
> >
> > Ultima
> >
>
> Are you using ZFS? What does your memory utilization look like?
> Can you provide the 'Mem' and 'ARC' lines from top(1) while building
> packages (ideally, a good 20+ minutes into such a build)?
>
> --
> Allan Jude
>


Significant performance hit on recent upgrade

2016-03-11 Thread Ultima
Hello,

 Recently I upgraded to r296377 and noticed a big hit in performance. The
system in question is a poudriere build system that usually has many jails
running, building around 200-400 pkg per hour for each jail. After the
upgrade, it is rare to see over 100 pkg per hour built. Possible regression
in a recent commit?

Previously I was running 8b372d1 (master), built around a week ago (sorry, I
had a git src tree at the time). Any ideas what may be causing this? I can
provide more details if needed. Someone on IRC also mentioned noticing a hit
in performance.


Ultima


Re: zfsboot patch for /usr

2016-03-10 Thread Ultima
 There is no need to ignore the output of zfs list. Here is the explanation
of its columns from the zfs man page. I think you're confusing used with
refer.

used
 The amount of space consumed by this dataset and all its descendents.
 This is the value that is checked against this dataset's quota and
 reservation. The space used does not include this dataset's
 reservation, but does take into account the reservations of any
 descendent datasets. The amount of space that a dataset consumes from
 its parent, as well as the amount of space that are freed if this
 dataset is recursively destroyed, is the greater of its space used and
 its reservation.

 When snapshots (see the "Snapshots" section) are created, their space
 is initially shared between the snapshot and the file system, and
 possibly with previous snapshots. As the file system changes, space
 that was previously shared becomes unique to the snapshot, and
 counted in the snapshot's space used. Additionally, deleting
 snapshots can increase the amount of space unique to (and used by)
 other snapshots.

 The amount of space used, available, or referenced does not take into
 account pending changes. Pending changes are generally accounted for
 within a few seconds. Committing a change to a disk using fsync(2) or
 O_SYNC does not necessarily guarantee that the space usage
 information is updated immediately.

available
 The amount of space available to the dataset and all its children,
 assuming that there is no other activity in the pool. Because space
 is shared within a pool, availability can be limited by any number of
 factors, including physical pool size, quotas, reservations, or other
 datasets within the pool.

 This property can also be referred to by its shortened column name,
 avail.

referenced
 The amount of data that is accessible by this dataset, which may or
 may not be shared with other datasets in the pool. When a snapshot or
 clone is created, it initially references the same amount of space as
 the file system or snapshot it was created from, since its contents
 are identical.

 This property can also be referred to by its shortened column name,
 refer.



On Thu, Mar 10, 2016 at 7:06 PM, Roger Marquis  wrote:

> > You don't mkdir it, you create it as a ZFS dataset, and mark it with the
> > 'canmount=no' property, so it only exists to be a parent, not as an
> > actual dataset. This is the default in zfboot currently.
>
> Thanks to everyone for pointing this out.  I'll forget about mkdir then,
> ignore the output of 'zfs list' and get comfortable doing things the zfs
> way.
>
> Still have to tweak scripts/zfsboot to create a /var/spool subvol, a /home
> subvol in place of /usr/home and specify atime=none in the default dataset.
> At least the latter works as expected.
>
> Roger
>
>
>

Re: zfsboot patch for /usr

2016-03-09 Thread Ultima
 ZFS automatically creates the directories it needs for mount points, so
making directories for it by hand is wasted effort. I don't really
understand your issue with an unused /usr dataset, but you can just modify
the ZFSBOOT_DATASETS area of the script to accomplish what you're trying to
do.
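
As a sketch of what that layout amounts to, the commands below create /usr
as a non-mounting parent and /usr/local as a real child dataset. The pool
name zroot, the run wrapper, and the DRY_RUN switch are assumptions for
illustration; as written the script only prints the commands.

```sh
#!/bin/sh
# Sketch only: "zroot" is an assumed pool name; adjust for your system.
# With DRY_RUN=1 (the default here) the commands are printed, not run.
POOL="${POOL:-zroot}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

create_usr_local() {
    # The parent exists only to carry properties and a mountpoint for its
    # children to inherit; canmount=off keeps it from being mounted itself.
    run zfs create -o canmount=off -o mountpoint=/usr "$POOL/usr"
    # The child inherits its mountpoint (/usr/local) from the parent.
    run zfs create "$POOL/usr/local"
}

create_usr_local
```

No mkdir is involved: the mount point directory is created when the child
dataset is mounted.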

On Wed, Mar 9, 2016 at 8:02 PM, Freddie Cash  wrote:

> On Mar 9, 2016 4:04 PM, "Miroslav Lachman" <000.f...@quip.cz> wrote:
> >
> > Roger Marquis wrote on 03/10/2016 00:36:
> >>
> >> Wondering if anyone has example patches for zfsboot (from
> >> usr.sbin/bsdinstall/scripts)?
> >>
> >> We're looking to change some of the default zfs subvolumes, removing
> /usr in
> >> favor of /usr/local in particular, and have run into a "parent does not
> exist"
> >> issue.  It's not clear where in the script the /usr parent dir should be
> >> mkdir'd.
> >
> >
> > I know nothing about this script, but if you want /usr/local as a ZFS
> > filesystem, then you need to create the parent (/usr in this case), and
> > you can use property canmount=off plus a different 'mountpoint' (for
> > example /mnt/usr) to not mount /usr over the existing directory on the
> > root filesystem.
>
> Set mountpoint=none if you just want to create the parent dataset without
> actually using it for storage. Then you can set properties on it, and child
> datasets such as pool/usr/local will inherit them.
>
> You'd still need to "mkdir /usr" in the script, but that's separate.
>


Re: sr-iov issues, reset_hw() failed with error -100

2016-03-09 Thread Ultima
I have been interested in this since I first read about it in 2012. =]

Can this be done in loader.conf? I have vmm_load="YES" there.

I'm not sure if the VFs are usable; I have not actually tested them. The
parent ix1 still shows no response. This is what I ran:

kldunload vmm

kenv hw.vmm.force_iommu=1

kldload vmm

iovctl -Cf /etc/iovctl.conf

The same error messages appear. As for the revision I'm currently on, hmm,
I'm not sure: I upgraded with git and it doesn't show the rev (last time I
use git for source?). The hash is 8b372d1 (master).
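
On the loader.conf question, a minimal sketch of what the boot-time
equivalent of the kenv/kldload sequence might look like. It assumes
hw.vmm.force_iommu is read from the kernel environment when vmm.ko loads,
which is what the kenv step above relies on. The loader_conf_lines helper
is hypothetical, and I have not verified this on r295920.

```sh
#!/bin/sh
# Sketch: print the loader.conf(5) lines that would put the tunable into
# the kernel environment at boot, before vmm.ko is loaded.
# Assumption: hw.vmm.force_iommu is picked up as a loader tunable.
loader_conf_lines() {
    cat <<'EOF'
hw.vmm.force_iommu="1"
vmm_load="YES"
EOF
}

loader_conf_lines
```

If the assumption holds, adding these two lines to /boot/loader.conf and
rebooting would replace the kldunload/kenv/kldload steps; iovctl -Cf would
still be run afterwards.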

On Wed, Mar 9, 2016 at 5:00 PM, Eric Joyner <ricer...@gmail.com> wrote:

> I don't know if you're still interested in this, but did you do "kenv
> hw.vmm.force_iommu=1" before loading the vmm module? I think that might be
> necessary.
>
> On Wed, Feb 24, 2016 at 5:12 PM Ultima <ultima1...@gmail.com> wrote:
>
>>  Yes, I'm still getting the -100 error. I do have sendmail disabled; I
>> just tested with sendmail up and running, then added the VFs, and it
>> still shows the error message.
>>
>> On Wed, Feb 24, 2016 at 8:04 PM, Eric Joyner <ricer...@gmail.com> wrote:
>>
>>> Are you still getting the -100 errors when trying to load the VF driver?
>>>
>>> I've tried SR-IOV on a system here, and I can confirm that traffic stops
>>> passing on the PF interface when you create a VF interface. That didn't
>>> used to happen, so I'm investigating why that is right now.
>>>
>>> On Wed, Feb 24, 2016 at 8:09 AM Ultima <ultima1...@gmail.com> wrote:
>>>
>>>>  I decided to do some more tests. I actually have a second board with
>>>> SR-IOV capabilities that I used for a while with VMware ESXi. I tested
>>>> it and unfortunately SR-IOV won't activate; it gives the "no space left
>>>> on device" error message. I double-checked the BIOS: all VT-d related
>>>> options are enabled, and I have hw.ix.num_queues="4" in
>>>> /boot/loader.conf. Is there anything else that may need to be set?
>>>> (It did work on VMware.)
>>>>
>>>>  For my second test, I moved the X540-AT1 to the board with the
>>>> X540-AT2. It functioned with the same issues as the AT2, though.
>>>>
>>>>
>>>> I don't think I have listed the motherboards in question yet, so I'll
>>>> list them now.
>>>>
>>>> S1200BTLRM -
>>>> http://ark.intel.com/products/69633/Intel-Server-Board-S1200BTLRM?q=S1200BTLRM
>>>> MD80-TM0 - http://b2b.gigabyte.com/products/product-page.aspx?pid=5146#ov
>>>>
>>>> I'm not sure if it will be of any help tho.
>>>>
>>>> Ultima
>>>>
>>>
>>


Re: error: unknown type name 'd_thread_t'

2016-03-07 Thread Ultima
Hello Chris,

d_thread_t was a compatibility shim to support FreeBSD 4. It was removed from
-CURRENT some time ago; changing it to struct thread should fix this error.
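
A minimal sketch of that change as a text filter, assuming a plain
substitution is enough (that is, d_thread_t never appears as part of a
longer identifier in the driver sources). fix_d_thread_t is a hypothetical
helper name.

```sh
#!/bin/sh
# Sketch: replace the removed d_thread_t typedef with "struct thread" in
# C source read from stdin. Assumes d_thread_t is never part of a longer
# identifier, so a plain global substitution is safe.
fix_d_thread_t() {
    sed 's/d_thread_t/struct thread/g'
}

printf '%s\n' 'int nvidia_open_ctl(struct cdev *, d_thread_t *);' | fix_d_thread_t
```

Running the affected files (nv-freebsd.h in the errors quoted below)
through a filter like this and keeping the result as a local patch in the
port is one way to carry the fix.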

Ultima

On Mon, Mar 7, 2016 at 8:05 PM, Chris H <bsd-li...@bsdforge.com> wrote:

> Greetings, all.
> Apologies in advance, if this is better suited for
> freebsd-hackers@. But given this is only relevant to CURRENT,
> I hoped it would be OK.
>
> OK. I'm attempting to build an i386 development box on -CURRENT.
> I'm stuck using a legacy nvidia card (NV-34). Yea, I know. But
> that's what I have. Anyway, that necessitates my maintaining a
> local copy of the now defunct x11/nvidia-driver-173 port.
> I've cobbled/refined all the necessary patches; save one.
> Which is what brings me here. It appears that the d_thread_t
> compatibility shim provided in 5.0 was dumped in r277897.
> Sadly, as a result I receive the following, when attempting
> to build the port (in spite of having COMPAT_FREEBSD5 built
> in to my custom kernel):
>
>
> /usr/ports/x11/nvidia-driver-173/work/NVIDIA-FreeBSD-x86-173.14.39/src/nv-freebsd.h:459:68: error: unknown type name 'd_thread_t'
> int    nvidia_handle_ioctl   (struct cdev *, u_long, caddr_t, int, d_thread_t *);
>                                                                    ^
>
> /usr/ports/x11/nvidia-driver-173/work/NVIDIA-FreeBSD-x86-173.14.39/src/nv-freebsd.h:463:46: error: unknown type name 'd_thread_t'
> int    nvidia_open_ctl       (struct cdev *, d_thread_t *);
>                                              ^
>
> /usr/ports/x11/nvidia-driver-173/work/NVIDIA-FreeBSD-x86-173.14.39/src/nv-freebsd.h:464:69: error: unknown type name 'd_thread_t'
> int    nvidia_open_dev       (struct nvidia_softc *, struct cdev *, d_thread_t *);
>                                                                     ^
>
> /usr/ports/x11/nvidia-driver-173/work/NVIDIA-FreeBSD-x86-173.14.39/src/nv-freebsd.h:465:46: error: unknown type name 'd_thread_t'
> int    nvidia_close_ctl      (struct cdev *, d_thread_t *);
>                                              ^
>
> /usr/ports/x11/nvidia-driver-173/work/NVIDIA-FreeBSD-x86-173.14.39/src/nv-freebsd.h:466:69: error: unknown type name 'd_thread_t'
> int    nvidia_close_dev      (struct nvidia_softc *, struct cdev *, d_thread_t *);
>
> Is there any way around this?
>
> Thanks for any, and all help with this!
>
> --Chris
>
> --
>
>
>


Re: sr-iov issues, reset_hw() failed with error -100

2016-02-24 Thread Ultima
 Yes, I'm still getting the -100 error. I do have sendmail disabled; I just
tested with sendmail up and running, then added the VFs, and it still shows
the error message.

On Wed, Feb 24, 2016 at 8:04 PM, Eric Joyner <ricer...@gmail.com> wrote:

> Are you still getting the -100 errors when trying to load the VF driver?
>
> I've tried SR-IOV on a system here, and I can confirm that traffic stops
> passing on the PF interface when you create a VF interface. That didn't
> used to happen, so I'm investigating why that is right now.
>
> On Wed, Feb 24, 2016 at 8:09 AM Ultima <ultima1...@gmail.com> wrote:
>
>>  I decided to do some more tests. I actually have a second board with
>> SR-IOV capabilities that I used for a while with VMware ESXi. I tested it
>> and unfortunately SR-IOV won't activate; it gives the "no space left on
>> device" error message. I double-checked the BIOS: all VT-d related options
>> are enabled, and I have hw.ix.num_queues="4" in /boot/loader.conf. Is
>> there anything else that may need to be set? (It did work on VMware.)
>>
>>  For my second test, I moved the X540-AT1 to the board with the X540-AT2.
>> It functioned with the same issues as the AT2, though.
>>
>>
>> I don't think I have listed the motherboards in question yet, so I'll list
>> them now.
>>
>> S1200BTLRM -
>> http://ark.intel.com/products/69633/Intel-Server-Board-S1200BTLRM?q=S1200BTLRM
>> MD80-TM0 - http://b2b.gigabyte.com/products/product-page.aspx?pid=5146#ov
>>
>> I'm not sure if it will be of any help tho.
>>
>> Ultima
>>
>


Re: sr-iov issues, reset_hw() failed with error -100

2016-02-24 Thread Ultima
 I decided to do some more tests. I actually have a second board with
SR-IOV capabilities that I used for a while with VMware ESXi. I tested it
and unfortunately SR-IOV won't activate; it gives the "no space left on
device" error message. I double-checked the BIOS: all VT-d related options
are enabled, and I have hw.ix.num_queues="4" in /boot/loader.conf. Is there
anything else that may need to be set? (It did work on VMware.)

 For my second test, I moved the X540-AT1 to the board with the X540-AT2.
It functioned with the same issues as the AT2, though.


I don't think I have listed the motherboards in question yet, so I'll list
them now.

S1200BTLRM -
http://ark.intel.com/products/69633/Intel-Server-Board-S1200BTLRM?q=S1200BTLRM
MD80-TM0 - http://b2b.gigabyte.com/products/product-page.aspx?pid=5146#ov

I'm not sure if it will be of any help tho.

Ultima


Re: sr-iov issues, reset_hw() failed with error -100

2016-02-23 Thread Ultima
 Upgraded to r295920 and used the GENERIC kernel, then rebooted and checked
the BIOS thoroughly. I found 3, yes 3, different areas in the BIOS related
to enabling SR-IOV (some screenshots below): 2 are for VT-d (I forgot to
take a screenshot of the 2nd; all were enabled) and one is for SR-IOV (it
was disabled). Unfortunately, with the GENERIC kernel and all these options
enabled, adding only 2 VFs resulted in the same behavior as above.

 A little bit of good news: after removing the VFs and running ifconfig ix1
down/up, the interface's functionality is restored. I also tested this on
MYKERNEL at r295920, so the SR-IOV option in the BIOS likely played a role
in this.

 If you have any ideas, I'm willing to test them.

Thanks! =]

VT-d screen: https://puu.sh/niyiq/4fee92e4a3.jpg
PCI advanced options: https://puu.sh/niyjg/88e71e48d9.jpg

Ultima

On Mon, Feb 22, 2016 at 9:46 PM, Ultima <ultima1...@gmail.com> wrote:

> I forgot to mention my kernel config. I'm not sure if it would cause this
> issue, but I'll test again with GENERIC.
>
> --- /usr/src/sys/amd64/conf/GENERIC 2016-02-22 21:05:37.152953000 -0500
> +++ /root/MYKERNEL-11-CURRENT-AMD64 2015-12-28 19:18:22.893391452 -0500
> @@ -91,6 +91,12 @@
>  options WITNESS_SKIPSPIN # Don't run witness on spinlocks for speed
>  options MALLOC_DEBUG_MAXZONES=8 # Separate malloc(9) zones
>
> +### VIMAGE ###
> +options VIMAGE
> +
> +### ROUTE TABLES ###
> +options ROUTETABLES=2
> +
>  # Make an SMP-capable kernel by default
>  options SMP # Symmetric MultiProcessor Kernel
>
>  The interface has been dead ever since creating the VFs earlier. After
> recreating them with only 2, it's still dead.
>
> Running out of ideas, decided to try a tcpdump...
>
> # dhclient ix1
> DHCPREQUEST on ix1 to 255.255.255.255 port 67
> DHCPREQUEST on ix1 to 255.255.255.255 port 67
> DHCPREQUEST on ix1 to 255.255.255.255 port 67
> DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 7
> DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 13
> DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 14
>
> # tcpdump -i ix1
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on ix1, link-type EN10MB (Ethernet), capture size 262144 bytes
> 21:41:13.234688 IP 192.168.1.145.bootpc > 255.255.255.255.bootps:
> BOOTP/DHCP, Request from -Hidden- (oui Unknown), length 300
> 21:41:16.236671 IP 192.168.1.145.bootpc > 255.255.255.255.bootps:
> BOOTP/DHCP, Request from -Hidden- (oui Unknown), length 300
> 21:41:23.243242 IP 192.168.1.145.bootpc > 255.255.255.255.bootps:
> BOOTP/DHCP, Request from -Hidden- (oui Unknown), length 300
> 21:41:38.261015 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
> Request from -Hidden- (oui Unknown), length 300
> 21:41:45.284752 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
> Request from -Hidden- (oui Unknown), length 300
> 21:41:58.292223 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
> Request from -Hidden- (oui Unknown), length 300
>
> I don't think this is that helpful, though. =/
> Running tcpdump on the other end shows the packets are never received.
>
> I've already double-checked VT-d, but I'll make it triple! =] I can't
> reset the system right now, but I'll send an update when possible with
> only 2 VFs and an unmodified GENERIC.
>
> Ultima
>
> On Mon, Feb 22, 2016 at 8:52 PM, Eric Joyner <e...@freebsd.org> wrote:
>
>> I don't really have any ideas on the error -100. Error -100 means there
>> was a mailbox error, so something failed in the initial communications
>> setup between the PF and VF, but I don't know what exactly went wrong.
>>
>> I'm grasping at straws, but try using a smaller number of VFs initially,
>> like 2? And check to see if VT-d is enabled in your BIOS? (Though I
>> would've expected iovctl to fail).
>>
>> - Eric
>>
>>
>> On Mon, Feb 22, 2016 at 12:01 PM Ultima <ultima1...@gmail.com> wrote:
>>
>>> After reboot...
>>>
>>> ifconfig ix1 up
>>>
>>> dhclient ix1
>>> DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 4
>>> DHCPOFFER from 192.168.1.1
>>> DHCPREQUEST on ix1 to 255.255.255.255 port 67
>>> DHCPACK from 192.168.1.1
>>> bound to 192.168.1.145 -- renewal in 21600 seconds.
>>>
>>> ix0 down
>>> ping 192.168.1.1
>>> PING 192.168.1.1 (192.168.1.1): 56 data bytes
>>> 64 bytes from 192.168.1.1: icmp_seq=0 ttl=64 time=0.149 ms
>>> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.171 ms
>>> 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.167 ms
>>>
>>> iovctl -Cf /etc/iovctl.conf
>>>
>>> ping 192.168.1.1
>>> PING 192.168.1.1 (192.168.1.1): 56 data bytes
>

Re: sr-iov issues, reset_hw() failed with error -100

2016-02-22 Thread Ultima
I forgot to mention my kernel config. I'm not sure if it would cause this
issue, but I'll test again with GENERIC.

--- /usr/src/sys/amd64/conf/GENERIC 2016-02-22 21:05:37.152953000 -0500
+++ /root/MYKERNEL-11-CURRENT-AMD64 2015-12-28 19:18:22.893391452 -0500
@@ -91,6 +91,12 @@
 options WITNESS_SKIPSPIN # Don't run witness on spinlocks for speed
 options MALLOC_DEBUG_MAXZONES=8 # Separate malloc(9) zones

+### VIMAGE ###
+options VIMAGE
+
+### ROUTE TABLES ###
+options ROUTETABLES=2
+
 # Make an SMP-capable kernel by default
 options SMP # Symmetric MultiProcessor Kernel

 The interface has been dead ever since creating the VFs earlier. After
recreating them with only 2, it's still dead.

Running out of ideas, decided to try a tcpdump...

# dhclient ix1
DHCPREQUEST on ix1 to 255.255.255.255 port 67
DHCPREQUEST on ix1 to 255.255.255.255 port 67
DHCPREQUEST on ix1 to 255.255.255.255 port 67
DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 7
DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 13
DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 14

# tcpdump -i ix1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ix1, link-type EN10MB (Ethernet), capture size 262144 bytes
21:41:13.234688 IP 192.168.1.145.bootpc > 255.255.255.255.bootps:
BOOTP/DHCP, Request from -Hidden- (oui Unknown), length 300
21:41:16.236671 IP 192.168.1.145.bootpc > 255.255.255.255.bootps:
BOOTP/DHCP, Request from -Hidden- (oui Unknown), length 300
21:41:23.243242 IP 192.168.1.145.bootpc > 255.255.255.255.bootps:
BOOTP/DHCP, Request from -Hidden- (oui Unknown), length 300
21:41:38.261015 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
Request from -Hidden- (oui Unknown), length 300
21:41:45.284752 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
Request from -Hidden- (oui Unknown), length 300
21:41:58.292223 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
Request from -Hidden- (oui Unknown), length 300

I don't think this is that helpful, though. =/
Running tcpdump on the other end shows the packets are never received.

I've already double-checked VT-d, but I'll make it triple! =] I can't reset
the system right now, but I'll send an update when possible with only 2 VFs
and an unmodified GENERIC.

Ultima

On Mon, Feb 22, 2016 at 8:52 PM, Eric Joyner <e...@freebsd.org> wrote:

> I don't really have any ideas on the error -100. Error -100 means there
> was a mailbox error, so something failed in the initial communications
> setup between the PF and VF, but I don't know what exactly went wrong.
>
> I'm grasping at straws, but try using a smaller number of VFs initially,
> like 2? And check to see if VT-d is enabled in your BIOS? (Though I
> would've expected iovctl to fail).
>
> - Eric
>
>
> On Mon, Feb 22, 2016 at 12:01 PM Ultima <ultima1...@gmail.com> wrote:
>
>> After reboot...
>>
>> ifconfig ix1 up
>>
>> dhclient ix1
>> DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 4
>> DHCPOFFER from 192.168.1.1
>> DHCPREQUEST on ix1 to 255.255.255.255 port 67
>> DHCPACK from 192.168.1.1
>> bound to 192.168.1.145 -- renewal in 21600 seconds.
>>
>> ix0 down
>> ping 192.168.1.1
>> PING 192.168.1.1 (192.168.1.1): 56 data bytes
>> 64 bytes from 192.168.1.1: icmp_seq=0 ttl=64 time=0.149 ms
>> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.171 ms
>> 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.167 ms
>>
>> iovctl -Cf /etc/iovctl.conf
>>
>> ping 192.168.1.1
>> PING 192.168.1.1 (192.168.1.1): 56 data bytes
>> ^C
>> --- 192.168.1.1 ping statistics ---
>> 29 packets transmitted, 0 packets received, 100.0% packet loss
>> ifconfig ix1 up
>> ping 192.168.1.1
>> PING 192.168.1.1 (192.168.1.1): 56 data bytes
>> ^C
>> --- 192.168.1.1 ping statistics ---
>> 12 packets transmitted, 0 packets received, 100.0% packet loss
>>
>> ix1 is no longer usable until a restart...
>>
>> iovctl -Dd ix1
>> ifconfig ix1 up
>> ping 192.168.1.1
>> PING 192.168.1.1 (192.168.1.1): 56 data bytes
>> ^C
>> --- 192.168.1.1 ping statistics ---
>> 9 packets transmitted, 0 packets received, 100.0% packet loss
>>
>>
>>
>> Is there anything else that may be useful?
>>
>> here is my ifconfig at the end (after ifconfig ix0 up)
>>
>>
>> ix0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0
>> mtu 1500
>>
>> options=e400b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
>> ether -Hidden-
>> inet 192.168.1.8 netmask 0xff00 broadcast 192.168.1.255
>> inet 192.168.1.9 netmask 0xff00 broadcast 192.168.1.255
>> nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKL

Re: sr-iov issues, reset_hw() failed with error -100

2016-02-22 Thread Ultima
Yeah, dmesg does show 48. That is 12 cores and 24 threads per CPU, so 48
threads total.
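
To spell out the arithmetic, assuming Hyper-Threading is enabled (2 threads
per core); logical_cpus is just a hypothetical helper name:

```sh
#!/bin/sh
# Logical CPUs = sockets x cores per socket x threads per core.
logical_cpus() {
    echo $(( $1 * $2 * $3 ))
}

logical_cpus 2 12 2   # two E5-2670 v3: 12 cores each, 2 threads per core
```

So the "24 cores" mentioned earlier is the physical core count across both
sockets, and 48 is the number of logical CPUs dmesg reports.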

On Mon, Feb 22, 2016 at 4:35 PM, Steven Hartland <kill...@multiplay.co.uk>
wrote:

> isn't that 48 cores (12 real 12 virtual) per CPU?
>
>
> On 22/02/2016 21:26, Ultima wrote:
>
>> This system has 24 cores (e5-2670v3)x2
>>
>> Ultima
>>
>> On Mon, Feb 22, 2016 at 3:53 PM, Pieper, Jeffrey E <
>> jeffrey.e.pie...@intel.com> wrote:
>>
>> Just out of curiosity, how many cores does your system have?
>>>
>>> Jeff
>>>
>>> -Original Message-
>>> From: owner-freebsd-curr...@freebsd.org [mailto:
>>> owner-freebsd-curr...@freebsd.org] On Behalf Of Ultima
>>> Sent: Monday, February 22, 2016 12:02 PM
>>> To: Eric Joyner <e...@freebsd.org>
>>> Cc: freebsd-current@freebsd.org; freebsd-virtualizat...@freebsd.org
>>> Subject: Re: sr-iov issues, reset_hw() failed with error -100
>>>
>>> After reboot...
>>>
>>> ifconfig ix1 up
>>>
>>> dhclient ix1
>>> DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 4
>>> DHCPOFFER from 192.168.1.1
>>> DHCPREQUEST on ix1 to 255.255.255.255 port 67
>>> DHCPACK from 192.168.1.1
>>> bound to 192.168.1.145 -- renewal in 21600 seconds.
>>>
>>> ix0 down
>>> ping 192.168.1.1
>>> PING 192.168.1.1 (192.168.1.1): 56 data bytes
>>> 64 bytes from 192.168.1.1: icmp_seq=0 ttl=64 time=0.149 ms
>>> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.171 ms
>>> 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.167 ms
>>>
>>> iovctl -Cf /etc/iovctl.conf
>>>
>>> ping 192.168.1.1
>>> PING 192.168.1.1 (192.168.1.1): 56 data bytes
>>> ^C
>>> --- 192.168.1.1 ping statistics ---
>>> 29 packets transmitted, 0 packets received, 100.0% packet loss
>>> ifconfig ix1 up
>>> ping 192.168.1.1
>>> PING 192.168.1.1 (192.168.1.1): 56 data bytes
>>> ^C
>>> --- 192.168.1.1 ping statistics ---
>>> 12 packets transmitted, 0 packets received, 100.0% packet loss
>>>
>>> ix1 is no longer usable until a restart...
>>>
>>> iovctl -Dd ix1
>>> ifconfig ix1 up
>>> ping 192.168.1.1
>>> PING 192.168.1.1 (192.168.1.1): 56 data bytes
>>> ^C
>>> --- 192.168.1.1 ping statistics ---
>>> 9 packets transmitted, 0 packets received, 100.0% packet loss
>>>
>>>
>>>
>>> Is there anything else that may be useful?
>>>
>>> here is my ifconfig at the end (after ifconfig ix0 up)
>>>
>>>
>>> ix0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0
>>> mtu 1500
>>>
>>>
>>> options=e400b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
>>> ether -Hidden-
>>> inet 192.168.1.8 netmask 0xff00 broadcast 192.168.1.255
>>> inet 192.168.1.9 netmask 0xff00 broadcast 192.168.1.255
>>> nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>> media: Ethernet autoselect (10Gbase-T <full-duplex,rxpause,txpause>)
>>> status: active
>>> ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>
>>>
>>> options=e407bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
>>> ether -Hidden-
>>> inet 192.168.1.145 netmask 0xff00 broadcast 192.168.1.255
>>> nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>> media: Ethernet autoselect (10Gbase-T <full-duplex,rxpause,txpause>)
>>> status: active
>>> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>>> options=63<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
>>> inet6 ::1 prefixlen 128
>>> inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
>>> inet 127.0.0.1 netmask 0xff00
>>> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>>> groups: lo
>>> bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu
>>> 1500
>>> ether -Hidden-
>>> nd6 options=9<PERFORMNUD,IFDISABLED>
>>> groups: bridge
>>> id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
>>> maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
>>> root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
>>> member: ix0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>>> ifmaxaddr 0 port 1 priorit

Re: sr-iov issues, reset_hw() failed with error -100

2016-02-22 Thread Ultima
This system has 24 cores (2x E5-2670 v3).

Ultima

On Mon, Feb 22, 2016 at 3:53 PM, Pieper, Jeffrey E <
jeffrey.e.pie...@intel.com> wrote:

> Just out of curiosity, how many cores does your system have?
>
> Jeff
>
> -Original Message-
> From: owner-freebsd-curr...@freebsd.org [mailto:
> owner-freebsd-curr...@freebsd.org] On Behalf Of Ultima
> Sent: Monday, February 22, 2016 12:02 PM
> To: Eric Joyner <e...@freebsd.org>
> Cc: freebsd-current@freebsd.org; freebsd-virtualizat...@freebsd.org
> Subject: Re: sr-iov issues, reset_hw() failed with error -100
>
> After reboot...
>
> ifconfig ix1 up
>
> dhclient ix1
> DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 4
> DHCPOFFER from 192.168.1.1
> DHCPREQUEST on ix1 to 255.255.255.255 port 67
> DHCPACK from 192.168.1.1
> bound to 192.168.1.145 -- renewal in 21600 seconds.
>
> ix0 down
> ping 192.168.1.1
> PING 192.168.1.1 (192.168.1.1): 56 data bytes
> 64 bytes from 192.168.1.1: icmp_seq=0 ttl=64 time=0.149 ms
> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.171 ms
> 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.167 ms
>
> iovctl -Cf /etc/iovctl.conf
>
> ping 192.168.1.1
> PING 192.168.1.1 (192.168.1.1): 56 data bytes
> ^C
> --- 192.168.1.1 ping statistics ---
> 29 packets transmitted, 0 packets received, 100.0% packet loss
> ifconfig ix1 up
> ping 192.168.1.1
> PING 192.168.1.1 (192.168.1.1): 56 data bytes
> ^C
> --- 192.168.1.1 ping statistics ---
> 12 packets transmitted, 0 packets received, 100.0% packet loss
>
> ix1 is no longer usable until a restart...
>
> iovctl -Dd ix1
> ifconfig ix1 up
> ping 192.168.1.1
> PING 192.168.1.1 (192.168.1.1): 56 data bytes
> ^C
> --- 192.168.1.1 ping statistics ---
> 9 packets transmitted, 0 packets received, 100.0% packet loss
>
>
>
> Is there anything else that may be useful?
>
> Here is my ifconfig at the end (after ifconfig ix0 up):
>
>
> ix0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0
> mtu 1500
>
> options=e400b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> ether -Hidden-
> inet 192.168.1.8 netmask 0xffffff00 broadcast 192.168.1.255
> inet 192.168.1.9 netmask 0xffffff00 broadcast 192.168.1.255
> nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> media: Ethernet autoselect (10Gbase-T <full-duplex,rxpause,txpause>)
> status: active
> ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>
> options=e407bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> ether -Hidden-
> inet 192.168.1.145 netmask 0xffffff00 broadcast 192.168.1.255
> nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> media: Ethernet autoselect (10Gbase-T <full-duplex,rxpause,txpause>)
> status: active
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
> options=63<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
> inet6 ::1 prefixlen 128
> inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
> inet 127.0.0.1 netmask 0xff000000
> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> groups: lo
> bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu
> 1500
> ether -Hidden-
> nd6 options=9<PERFORMNUD,IFDISABLED>
> groups: bridge
> id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
> maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
> root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
> member: ix0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>    ifmaxaddr 0 port 1 priority 128 path cost 2000
> member: epair0a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>    ifmaxaddr 0 port 5 priority 128 path cost 2000
> epair0a: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric
> 0 mtu 1500
> options=8
> ether -Hidden-
> inet6 fe80::ff:70ff:fe00:50a%epair0a prefixlen 64 scopeid 0x5
> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> media: Ethernet 10Gbase-T (10Gbase-T )
> status: active
> groups: epair
>
> On Mon, Feb 22, 2016 at 1:51 PM, Eric Joyner <e...@freebsd.org> wrote:
>
> > Did you do an ifconfig up on ix1 before loading the VF driver?
> >
> > On Sat, Feb 20, 2016 at 11:57 AM Ultima <ultima1...@gmail.com> wrote:
> >
> >>  Decided to do some testing with iovctl to see how sr-iov is coming along.
> >> Currently, when adding the VFs, there are a couple of errors, and the network
> >> no longer functions after iovctl is started. My guess is that the reset_hw()
> >> call is what is failing. Any ideas why this

Re: sr-iov issues, reset_hw() failed with error -100

2016-02-22 Thread Ultima
After reboot...

ifconfig ix1 up

dhclient ix1
DHCPDISCOVER on ix1 to 255.255.255.255 port 67 interval 4
DHCPOFFER from 192.168.1.1
DHCPREQUEST on ix1 to 255.255.255.255 port 67
DHCPACK from 192.168.1.1
bound to 192.168.1.145 -- renewal in 21600 seconds.

ix0 down
ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
64 bytes from 192.168.1.1: icmp_seq=0 ttl=64 time=0.149 ms
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.171 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.167 ms

iovctl -Cf /etc/iovctl.conf

ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
^C
--- 192.168.1.1 ping statistics ---
29 packets transmitted, 0 packets received, 100.0% packet loss
ifconfig ix1 up
ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
^C
--- 192.168.1.1 ping statistics ---
12 packets transmitted, 0 packets received, 100.0% packet loss

ix1 is no longer usable until a restart...

iovctl -Dd ix1
ifconfig ix1 up
ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
^C
--- 192.168.1.1 ping statistics ---
9 packets transmitted, 0 packets received, 100.0% packet loss
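
For anyone who wants to repeat the sequence above, it can be wrapped in a small script so that every step is followed by the same connectivity check. This is only a sketch: the interface (ix1), gateway (192.168.1.1) and config path are the values from this thread, while RUN, check_link and repro_steps are made-up helper names and not part of iovctl. By default the commands are only printed, so the sequence can be reviewed before running it as root.

```shell
#!/bin/sh
# Repro sketch for the connectivity loss after VF creation described above.
# IFACE, GATEWAY and CONF mirror the values in this thread.
# RUN is a hypothetical dry-run switch: unset means "echo" (print only),
# an empty RUN means the commands are really executed.
IFACE="${IFACE:-ix1}"
GATEWAY="${GATEWAY:-192.168.1.1}"
CONF="${CONF:-/etc/iovctl.conf}"
RUN="${RUN-echo}"

check_link() {
    # Three probes are enough to tell "works" from "100% packet loss".
    $RUN ping -c 3 "$GATEWAY"
}

repro_steps() {
    $RUN ifconfig "$IFACE" up
    check_link                      # baseline: worked in the report
    $RUN iovctl -C -f "$CONF"       # create the VFs
    check_link                      # reported to fail from here on
    $RUN iovctl -D -d "$IFACE"      # destroy the VFs again
    $RUN ifconfig "$IFACE" up
    check_link                      # reported to keep failing until reboot
}

repro_steps
```

Saved as, say, repro.sh (hypothetical name), `sh repro.sh` prints the seven commands, and `RUN= sh repro.sh` executes them.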



Is there anything else that may be useful?

Here is my ifconfig at the end (after ifconfig ix0 up):


ix0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0
mtu 1500
options=e400b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether -Hidden-
inet 192.168.1.8 netmask 0xffffff00 broadcast 192.168.1.255
inet 192.168.1.9 netmask 0xffffff00 broadcast 192.168.1.255
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
media: Ethernet autoselect (10Gbase-T <full-duplex,rxpause,txpause>)
status: active
ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=e407bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether -Hidden-
inet 192.168.1.145 netmask 0xffffff00 broadcast 192.168.1.255
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
media: Ethernet autoselect (10Gbase-T <full-duplex,rxpause,txpause>)
status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=63<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
inet 127.0.0.1 netmask 0xff000000
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
groups: lo
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu
1500
ether -Hidden-
nd6 options=9<PERFORMNUD,IFDISABLED>
groups: bridge
id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
member: ix0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
   ifmaxaddr 0 port 1 priority 128 path cost 2000
member: epair0a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
   ifmaxaddr 0 port 5 priority 128 path cost 2000
epair0a: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric
0 mtu 1500
options=8
ether -Hidden-
inet6 fe80::ff:70ff:fe00:50a%epair0a prefixlen 64 scopeid 0x5
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet 10Gbase-T (10Gbase-T )
status: active
groups: epair

On Mon, Feb 22, 2016 at 1:51 PM, Eric Joyner <e...@freebsd.org> wrote:

> Did you do an ifconfig up on ix1 before loading the VF driver?
>
> On Sat, Feb 20, 2016 at 11:57 AM Ultima <ultima1...@gmail.com> wrote:
>
>>  Decided to do some testing with iovctl to see how sr-iov is coming along.
>> Currently, when adding the VFs, there are a couple of errors, and the network
>> no longer functions after iovctl is started. My guess is that the reset_hw()
>> call is what is failing. Any ideas why this call would fail? I tested this on
>> both ports; ix1 is detached and unused for this test, but inserting a cable
>> results in an unusable port. iovctl -Dd ix1 removes the VFs, but
>> functionality is still not restored without a system restart.
>>
>> FreeBSD S1 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r295736: Wed Feb 17
>> 21:17:28 EST 2016 root@S1:/usr/obj/usr/src/sys/MYKERNEL  amd64
>>
>> /boot/loader.conf
>> hw.ix.num_queues="4"
>>
>> /etc/iovctl.conf
>> PF {
>> device : ix1;
>> num_vfs : 31;
>> }
>>
>> DEFAULT {
>> passthrough : true;
>> }
>> VF-0 {
>> passthrough : false;
>> }
>> VF-1 {
>> passthrough : false;
>> }
>>
>> # iovctl -C -f /etc/iovctl.conf
>>
>> dmesg
>> ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver, Version -
>> 1.4.6-k> at device 0.129 on pci12
>> ixv0: Using MSIX interrupts with 2 vectors
>> ixv0: ixgbe_reset_hw() failed with error -100

sr-iov issues, reset_hw() failed with error -100

2016-02-20 Thread Ultima
device = 'X540 Ethernet Controller Virtual Function'
class  = network
subclass   = ethernet
ppt25@pci0:129:0:183:   class=0x020000 card=0x00001458 chip=0x15158086
rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = 'X540 Ethernet Controller Virtual Function'
class  = network
subclass   = ethernet
ppt26@pci0:129:0:185:   class=0x020000 card=0x00001458 chip=0x15158086
rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = 'X540 Ethernet Controller Virtual Function'
class  = network
subclass   = ethernet
ppt27@pci0:129:0:187:   class=0x020000 card=0x00001458 chip=0x15158086
rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = 'X540 Ethernet Controller Virtual Function'
class  = network
subclass   = ethernet
ppt28@pci0:129:0:189:   class=0x020000 card=0x00001458 chip=0x15158086
rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = 'X540 Ethernet Controller Virtual Function'
class  = network
subclass   = ethernet

Ultima
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: r294195: Kernel panic during installworld

2016-02-11 Thread Ultima
This is actually triggered when using large blocks (recordsize=1M).

After going back to 128k, I no longer get the panic.
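
To find which datasets are affected, the recordsize property can be listed for a whole pool and filtered. A rough sketch, assuming `zfs get -Hp` output (tab-separated name and value, with the value in bytes); the helper name large_recordsize and the pool name tank are placeholders and do not come from this thread.

```shell
#!/bin/sh
# Print datasets whose recordsize exceeds a limit (default 128K = 131072 bytes).
# Reads "name<TAB>value" lines on stdin, as produced by:
#   zfs get -Hp -r -t filesystem -o name,value recordsize tank
# "tank" is a placeholder pool name; large_recordsize is a hypothetical helper.
large_recordsize() {
    limit="${1:-131072}"
    # "+ 0" forces a numeric comparison in awk.
    awk -F '\t' -v limit="$limit" '$2 + 0 > limit + 0 { print $1 }'
}
```

Typical use would be `zfs get -Hp -r -t filesystem -o name,value recordsize tank | large_recordsize`, followed by `zfs set recordsize=128K <dataset>` for each hit. Note that changing recordsize only affects data written afterwards; existing files keep their block size.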

On Thu, Feb 11, 2016 at 9:10 AM, Daniel Nebdal  wrote:

> On Thu, Feb 11, 2016 at 3:04 PM, Hans Petter Selasky 
> wrote:
> > On 02/11/16 15:00, Daniel Nebdal wrote:
> >>
> >> plugging in a
> >> USB keyboard post-panic didn't do much
> >
> >
> > Hi,
> >
> > USB enumeration is disabled in the debugger. You need to plug it in
> > pre-crash :-) Same with any USB crash dump device(s).
> >
> > --HPS
>
> Myeah, I didn't really expect it to work - at least I left it plugged in
> now. :)
>
> --
> Daniel Nebdal


Re: Kernel memory leak with x11/nvidia-driver

2016-02-04 Thread Ultima
Wow, that is insane. I'm going to start looking for the
revision where this behavior started. If you have already found
it, or find it before I report back, please let me know so I
don't waste my time =]

On Thu, Feb 4, 2016 at 6:37 PM, Eric van Gyzen  wrote:

> On 02/ 3/16 10:54 AM, Eric van Gyzen wrote:
>
>> I just set up a new desktop running head with x11/nvidia-driver.  I've
>> discovered a memory leak where pages disappear from the queues, never to
>> return.  Specifically, the total of
>>  v_active_count
>>  v_inactive_count
>>  v_wire_count
>>  v_cache_count
>>  v_free_count
>> drops, eventually becoming /much/ less than v_page_count.  After leaving
>> xscreensaver running overnight, cycling the saver every 10 minutes, the
>> system was unusable, because it only had a few MB of memory.  (It has 8
>> GB physical.)
>>
>
> In case anyone is curious, /usr/local/bin/xscreensaver-hacks/glmatrix
> triggers a fairly fast leak--around 600 pages per second.
>
> Eric
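
For a sense of scale: at roughly 600 pages per second, and assuming the usual 4096-byte page on amd64 (check with `sysctl -n hw.pagesize`), the leak comes to about 2.3 MiB/s. A quick sketch of the arithmetic follows; seconds_to_exhaust is a made-up helper name, and the 8 GB figure is treated as 8 GiB.

```shell
#!/bin/sh
# Seconds until a leak of PAGES_PER_SEC pages/s has consumed MEM_BYTES,
# for a given PAGE_SIZE in bytes (integer arithmetic, rounded down).
# seconds_to_exhaust is a hypothetical helper name.
seconds_to_exhaust() {
    pages_per_sec=$1
    page_size=$2
    mem_bytes=$3
    echo $(( mem_bytes / (pages_per_sec * page_size) ))
}

# 600 pages/s, 4096-byte pages (assumed), 8 GiB of RAM
seconds_to_exhaust 600 4096 $((8 * 1024 * 1024 * 1024))
```

That is a little under an hour of glmatrix running continuously, and it is an upper bound, since not all of the 8 GB is free to begin with.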


Re: Kernel memory leak with x11/nvidia-driver

2016-02-03 Thread Ultima
Just tested your script; there is definitely a memory leak.

I also ran into some really weird behavior. While running your
script in tmux, after starting and stopping an xorg session a few
times, tmux completely froze in the session. A new window created
in the session was frozen as well, though only visually: commands
still worked, but the screen stayed blank and black.

Also, unloading the nvidia and nvidia-modeset kernel modules
(nvidia-modeset is new as of 358.16ish) did not free the memory.

On Wed, Feb 3, 2016 at 8:24 PM, Ultima <ultima1...@gmail.com> wrote:

>  Apologies, this should have been in my initial reply.
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201340
> or here for attachment
> https://bz-attachments.freebsd.org/attachment.cgi?id=165694
>
> I haven't actually had a chance to do anything after upgrading
> from stable other than see the corrupted console for myself.
> Lack of time =/
>
> On Wed, Feb 3, 2016 at 2:41 PM, Eric van Gyzen <vangy...@freebsd.org>
> wrote:
>
>> On 02/03/2016 10:54, Eric van Gyzen wrote:
>> > I just set up a new desktop running head with x11/nvidia-driver.  I've
>> > discovered a memory leak where pages disappear from the queues, never to
>> > return.  Specifically, the total of
>> > v_active_count
>> > v_inactive_count
>> > v_wire_count
>> > v_cache_count
>> > v_free_count
>> > drops, eventually becoming /much/ less than v_page_count.
>>
>> Here is a script to log the data:
>>
>> #!/bin/sh
>>
>> readonly QUEUES="active inactive wire cache free total"
>> readonly FORMAT="%s\t%s\t%s\t%s\t%s\t%s\n"
>>
>> vm_page_counts() {
>>     for queue in $QUEUES; do
>>         if [ "$queue" != "total" ]; then
>>             sysctl -n vm.stats.vm.v_${queue}_count
>>         fi
>>     done
>> }
>>
>> sum() {
>>     s=0
>>     while [ $# -gt 0 ]; do
>>         s=$((s + $1))
>>         shift
>>     done
>>     echo $s
>> }
>>
>> print_counts() {
>>     counts="`vm_page_counts`"
>>     printf "$FORMAT" $counts `sum $counts`
>> }
>>
>> printf "$FORMAT" $QUEUES
>> print_counts
>> while sleep 60; do
>>     print_counts
>> done
>>


Re: Kernel memory leak with x11/nvidia-driver

2016-02-03 Thread Ultima
 Apologies, this should have been in my initial reply.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201340
or here for attachment
https://bz-attachments.freebsd.org/attachment.cgi?id=165694

I haven't actually had a chance to do anything after upgrading
from stable other than see the corrupted console for myself.
Lack of time =/

On Wed, Feb 3, 2016 at 2:41 PM, Eric van Gyzen  wrote:

> On 02/03/2016 10:54, Eric van Gyzen wrote:
> > I just set up a new desktop running head with x11/nvidia-driver.  I've
> > discovered a memory leak where pages disappear from the queues, never to
> > return.  Specifically, the total of
> > v_active_count
> > v_inactive_count
> > v_wire_count
> > v_cache_count
> > v_free_count
> > drops, eventually becoming /much/ less than v_page_count.
>
> Here is a script to log the data:
>
> #!/bin/sh
>
> readonly QUEUES="active inactive wire cache free total"
> readonly FORMAT="%s\t%s\t%s\t%s\t%s\t%s\n"
>
> vm_page_counts() {
>     for queue in $QUEUES; do
>         if [ "$queue" != "total" ]; then
>             sysctl -n vm.stats.vm.v_${queue}_count
>         fi
>     done
> }
>
> sum() {
>     s=0
>     while [ $# -gt 0 ]; do
>         s=$((s + $1))
>         shift
>     done
>     echo $s
> }
>
> print_counts() {
>     counts="`vm_page_counts`"
>     printf "$FORMAT" $counts `sum $counts`
> }
>
> printf "$FORMAT" $QUEUES
> print_counts
> while sleep 60; do
>     print_counts
> done
>
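
The script logs the sum of the five queues; the leak shows up when that sum is compared with v_page_count. A minimal sketch of that comparison, assuming the same vm.stats.vm sysctl names the script uses; missing_pages is a hypothetical helper and not part of Eric's script.

```shell
#!/bin/sh
# Pages unaccounted for: v_page_count minus the sum of the per-queue counts.
# Usage: missing_pages PAGE_COUNT QUEUE_COUNT...
# missing_pages is a hypothetical helper, not part of the original script.
missing_pages() {
    total=$1
    shift
    for n in "$@"; do
        total=$((total - n))
    done
    echo "$total"
}

# On the affected FreeBSD system this could be fed from sysctl, e.g.:
#   missing_pages "$(sysctl -n vm.stats.vm.v_page_count)" \
#       $(sysctl -n vm.stats.vm.v_active_count vm.stats.vm.v_inactive_count \
#                   vm.stats.vm.v_wire_count vm.stats.vm.v_cache_count \
#                   vm.stats.vm.v_free_count)
```

A result that keeps growing between samples would match the behavior described above.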


Re: r294195: Kernel panic during installworld

2016-01-29 Thread Ultima
Just upgraded from 10-STABLE to head r295051 and I still have kernel
panics. I then tested with bsdinstall and found that the panics no longer
occur. I believe the issue is that some of my datasets are set to
recordsize=1M; this is most likely the root cause of these panics.

On Sun, Jan 17, 2016 at 2:59 PM, Ultima <ultima1...@gmail.com> wrote:

> https://puu.sh/mzmpL/b209da9263.jpgn  I hope this is better, sorry I
> didn't know it would get stripped.
>
> Ultima
>
> On Sun, Jan 17, 2016 at 2:58 PM, Ultima <ultima1...@gmail.com> wrote:
>
>> https://puu.sh/mzmpL/b209da9263.jpgn  I hope this is better, sorry I
>> didn't know it would get stripped.
>>
>> Ultima
>>
>> On Sun, Jan 17, 2016 at 12:59 PM, Allan Jude <allanj...@freebsd.org>
>> wrote:
>>
>>> On 2016-01-17 09:14, Ultima wrote:
>>> > make installkernel KERNCONF=MYKERNEL
>>> > reboot
>>> > mergemaster -p
>>> > make installworld -> attached
>>> >
>>> > MYKERNEL=GENERIC+
>>> > options VIMAGE
>>> > options ROUTETABLES=2
>>> >
>>> >  Anything else that maybe helpful?
>>> >
>>> > Ultima
>>> >
>>>
>>> Your attachment was stripped. Can you post it somewhere and include the
>>> url?
>>>
>>> --
>>> Allan Jude
>>>
>>>
>>
>


Re: r294195: Kernel panic during installworld

2016-01-17 Thread Ultima
https://puu.sh/mzmpL/b209da9263.jpg
I hope this is better; sorry, I didn't know it would get stripped.

Ultima

On Sun, Jan 17, 2016 at 2:58 PM, Ultima <ultima1...@gmail.com> wrote:

> https://puu.sh/mzmpL/b209da9263.jpgn  I hope this is better, sorry I
> didn't know it would get stripped.
>
> Ultima
>
> On Sun, Jan 17, 2016 at 12:59 PM, Allan Jude <allanj...@freebsd.org>
> wrote:
>
>> On 2016-01-17 09:14, Ultima wrote:
>> > make installkernel KERNCONF=MYKERNEL
>> > reboot
>> > mergemaster -p
>> > make installworld -> attached
>> >
>> > MYKERNEL=GENERIC+
>> > options VIMAGE
>> > options ROUTETABLES=2
>> >
>> >  Anything else that maybe helpful?
>> >
>> > Ultima
>> >
>>
>> Your attachment was stripped. Can you post it somewhere and include the
>> url?
>>
>> --
>> Allan Jude
>>
>>
>


r294195: Kernel panic during installworld

2016-01-17 Thread Ultima
make installkernel KERNCONF=MYKERNEL
reboot
mergemaster -p
make installworld -> attached

MYKERNEL=GENERIC+
options VIMAGE
options ROUTETABLES=2

Anything else that may be helpful?

Ultima


Re: r294195: Kernel panic during installworld

2016-01-17 Thread Ultima
It builds fine; the issue is installing it. I think the panic is
ZFS-related rather than caused by installworld itself.

On Sun, Jan 17, 2016 at 9:45 AM, Kurt Jaeger  wrote:

> Hi!
>
> > make installkernel KERNCONF=MYKERNEL
> > reboot
> > mergemaster -p
> > make installworld -> attached
> >
> > MYKERNEL=GENERIC+
> > options VIMAGE
> > options ROUTETABLES=2
> >
> >  Anything else that maybe helpful?
>
> I build a generic world at r294095, no problems.
>
> --
> p...@opsec.eu   +49 171 3101372   4 years to go !
>