Re: kernel: MCA: CPU 0 COR (1) internal parity error

2015-01-17 Thread Jeremy Chadwick
 be new MCEs or changes to the MCA that Intel
implemented in some newer models of Core iX that aren't being handled
correctly by the kernel (i.e. misreporting or mis-decoding).

Good luck!

-- 
| Jeremy Chadwick   j...@koitsu.org |
| UNIX Systems Administrator   http://jdc.koitsu.org/ |
| Making life hard for others since 1977. PGP 4BD6C0CB |

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: Any objections/comments on axing out old ATA stack?

2013-04-21 Thread Jeremy Chadwick
On Sun, Apr 21, 2013 at 02:11:04PM +0300, Alexander Motin wrote:
 On 21.04.2013 00:29, Jeremy Chadwick wrote:
 - The ATA commands which lead up to the error also vary.  Many are for
write requests, and from some entries I can see that the OS was doing
NCQ writes (WRITE FPDMA QUEUED) and then suddenly decided to do a
classic 28-bit LBA write (WRITE DMA).  I'm not sure why an OS would do
this (there's nothing optimal about it) unless there were conditions
occurring where the OS/ATA driver said "this NCQ write isn't working
(timeout, etc.), let me retry with a classic 28-bit LBA write".
 
 The ATA disk driver in CAM inserts a non-queued command every several
 seconds of continuous load to limit possible command starvation
 inside the disk. The SCSI driver does a similar thing, but sets the
 ordered command flag, which does not exist in SATA, instead of issuing
 a different command.

Thanks for the insights Alexander, greatly appreciated.

I'm a little confused by your description, because if I'm reading it
right, it sounds like it conflicts with what the ACS-2 spec states.
Quoting T13/2015-D rev 3 (I'm aware it's a working draft), section
4.16.1:

If the device receives a command that is not an NCQ command while NCQ
commands are in the queue, then the device shall return command aborted
for the new command and for all of the NCQ commands that are in the
queue.

I assume this means ABRT status is returned to the host controller; if
so (and by design of course), how do we differentiate between that
condition and any other I/O condition that induces ABRT?
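
To make the quoted rule concrete, here is a toy Python model of the device-side behaviour as that paragraph reads. It is only a sketch: the class and method names are invented for illustration, the set of NCQ commands is limited to the two FPDMA commands, and it says nothing about how a host or CAM actually schedules commands around this rule.

```python
# Toy model of the ACS-2 rule quoted above. ToyDevice and receive() are
# hypothetical names, not part of any real driver or library.
NCQ_COMMANDS = {"READ FPDMA QUEUED", "WRITE FPDMA QUEUED"}


class ToyDevice:
    def __init__(self):
        self.queue = []  # names of NCQ commands still outstanding

    def receive(self, name):
        """Accept a command; return the list of commands that get aborted."""
        if name in NCQ_COMMANDS:
            self.queue.append(name)
            return []
        if self.queue:
            # Non-NCQ command while NCQ commands are queued:
            # abort everything in the queue plus the new command.
            aborted = self.queue + [name]
            self.queue = []
            return aborted
        return []  # non-NCQ command with an empty queue is accepted


dev = ToyDevice()
first = dev.receive("WRITE FPDMA QUEUED")
second = dev.receive("WRITE FPDMA QUEUED")
aborted = dev.receive("WRITE DMA")
print(aborted)
```

In this model the host only sees a list of aborted commands, which is exactly why the question of telling this case apart from other ABRT causes comes up.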

Possibly the answer is in this admission: I should probably get
around to reading ATA8-AST sometime.  :-)

-- 
| Jeremy Chadwick   j...@koitsu.org |
| UNIX Systems Administrator   http://jdc.koitsu.org/ |
| Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |


Re: Any objections/comments on axing out old ATA stack?

2013-04-20 Thread Jeremy Chadwick
On Thu, Apr 04, 2013 at 10:00:18AM +0200, Matthias Andree wrote:
 Am 04.04.2013 03:05, schrieb Jeremy Chadwick:
 
 { snipping stuff I have no comment on.  reference thread: }
 {  http://lists.freebsd.org/pipermail/freebsd-stable/2013-April/073036.html }
 
  One piece of evidence that refutes my theory is that if the Windows and/or
  Linux partitions are something you boot into and use often, I would
  imagine NCQ would be used in both of those environments and would suffer
  from the same issue.  Although Windows tends to hide all sorts of
  transient errors from the user (sigh), Linux tends to be like FreeBSD
  with regards to such issues (on the console anyway; you wouldn't see
  such messages normally inside of X).
 
 Now, the FreeBSD slice is the only partition on that disk that would
 likely see concurrent write accesses (think make -j8 on a quadcore
 computer) which is more prone to ferret out such alignment contention.
 
 The NTFS partition is aligned on a multi-MB boundary, so wouldn't hit
 the problem anyways.
 
 The Linux partition is in ext4 format for mostly sequential access to
 files usually in excess of 10 MB each.
 
 Linux's ext4 jumps through several hoops to end up with bulk writes,
 like extents, delayed allocations (to avoid fragmentation), reordering
 of data and metadata writes, serialized log writes and all that stuff,
 and it would appear I am permitting it to cache writes -- Linux uses
 write barriers to enforce proper ordering of journal/meta-data writes.
 
 It would be rather hard to hit ATA taskfile timeouts, the expected rate
 with which the drive needs to do a partial write is orders of magnitude
 lower.
 
 Any good concurrent write exercise tools for Unix that I could run on
 the Linux ext4 partition that you would propose?

The only tool I'm familiar with is bonnie++.

But I don't think this (partition alignment) is what matters now.  Your
smartctl output has shed some light on your situation.

  - I am running with kern.cam.ada.default_timeout=5 which makes the
  computer recover faster
  
  I can definitely imagine cases where a drive using NCQ but doing writes
  to a non-aligned partition could take longer than 5 seconds to respond
  to an ATA CDB (this is different than a SATA or AHCI layer timeout).  I am
  not telling you to change this back to 30, but it might not be helping
  your situation at all given my above theory.
 
 My feeling is that the stalls are mostly from the error handler and the
 overall time the drive is frozen gets shorter. If it had not _felt_
 faster, I'd not have left that in sysctl.conf in the first place.

Your understanding of what that sysctl does is wrong, or I'm
misunderstanding what you're saying (very possible!).

How I interpret what you're saying: that the sysctl somehow decreases
stall times during I/O operations that fail.  This is incorrect.

What that sysctl does is define the number of seconds that transpire
***before*** the CAM layer says "Okay, I didn't get a response to the
ATA CDB I sent the disk", and then re-submits the same CDB to the disk.

Rephrased: in the case of a disk stalling on an I/O request, you will
experience the effects of that stall no matter what that sysctl is set
to.  A lower value in that sysctl will result in CAM spitting out
nasties on the console + hitting the CDB retry submission scenario
sooner, which if the drive is awake/responsive by that time will go
smoothly.

That's all it does.

Thus a value of 5 indicates a device/drive did not respond to a CDB
within 5 seconds, and a value of 30 indicates a device/drive did not
respond to a CDB within 30 seconds.  Regardless, those lengths of time
are VERY long for an I/O operation on a mechanical HDD.
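
A toy model in Python may make the point clearer. It assumes the drive simply does not answer for stall_s seconds and that the same CDB is resubmitted each time timeout_s expires; it deliberately ignores whatever time the error recovery itself takes. The function name and its parameters are invented for illustration and are not part of CAM.

```python
def simulate(stall_s, timeout_s):
    """Return (timeouts_logged, seconds_until_io_completes) for one request.

    Hypothetical model: the drive is unresponsive for stall_s seconds and
    every attempt that is still unanswered after timeout_s is resubmitted.
    """
    t = 0
    timeouts = 0
    while stall_s > t + timeout_s:
        # The current attempt expires before the drive wakes up.
        t += timeout_s
        timeouts += 1
    # The attempt in flight when the drive wakes up is the one that completes.
    return timeouts, stall_s


for timeout in (5, 30):
    logged, done = simulate(stall_s=12, timeout_s=timeout)
    print(f"timeout={timeout}s: {logged} timeout message(s), I/O done after {done}s")
```

Under these assumptions the request completes after the same 12 seconds with either setting; only the number of timeout messages differs.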

When you get to the bottom of my Email, you'll understand why I screamed
at you about adjusting that sysctl.

  Finally: could you please provide output from smartctl -x /dev/ada1?
  I would like to rule out any possibility of your drive having some other
  kind of issue that might cause it to go catatonic.  Thanks.
 
 I have fetched the data with Linux this time (should not make a
 difference as it's all drive internal data, not host OS stuff).
 
 Looks sane to me, http://people.freebsd.org/~mandree/smartctl.log.
 I'll be happy to refetch this data with a more current smartctl version
 under FreeBSD if required.

Oh look, it's the Samsung SpinPoint series, specifically the EcoGreen
(EG) series.  No joke: ~60% of the problem reports I deal with when
it comes to weird wonky problems stem from this drive series.  I have
no idea why, but they're a common pain point for me.

First, about the shown sector size: smartmontools 5.41 was the first
release to show the sector sizes per ATA IDENTIFY.  I assume they got
this right from the get-go.  So as of this moment I'm going to assume
that this drive really is a 512-byte sector drive.

Politely, your analysis of the drive ("looks sane to me") is an
indicator of why SMART output needs to be interpreted by a person who is
familiar

Re: Kernel output interleaved on boot

2013-04-08 Thread Jeremy Chadwick
I have discussed this problem for years now -- over 5 years, to be
exact.  As if I haven't sounded like a broken record before, I surely do
now.  Start here, under section "Kernel", item "Scrambled or garbled
kernel output":

https://wiki.freebsd.org/BugBusting/Commonly_reported_issues

The problem has not gone away.  It has not been solved.  It has not been
worked around.  PRINTF_BUFR_SIZE does not solve the problem, and rarely
helps relieve it.

I have discussed this issue more recently (2010) with John Baldwin as
well:

http://lists.freebsd.org/pipermail/freebsd-questions/2010-March/214412.html
http://lists.freebsd.org/pipermail/freebsd-questions/2010-March/214423.html

And in December 2011 too -- a particularly important read if you think
increasing the number is a wise idea:

http://lists.freebsd.org/pipermail/freebsd-stable/2011-December/065158.html

Bottom line: there is no solution other than to switch OSes.

And yes, I am aware of how GSoC works, but this really should have
become a GSoC project by now, otherwise the Foundation should have
funded someone to fix this.  It makes kernel debugging basically
worthless.

-- 
| Jeremy Chadwick   j...@koitsu.org |
| UNIX Systems Administrator   http://jdc.koitsu.org/ |
| Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |



Re: Any objections/comments on axing out old ATA stack?

2013-04-03 Thread Jeremy Chadwick
On Thu, Apr 04, 2013 at 12:15:32AM +0200, Matthias Andree wrote:
 I have just sent more information to the PR at
 http://www.freebsd.org/cgi/query-pr.cgi?pr=157397
 
 The short summary (more info in the PR) is:
 
 - limiting tags to 31 does not help
 
 - disabling NCQ appears to help in initial testing, but warrants more
 testing
 
 - error happens during WRITE_FPDMA_QUEUED,

This is an NCQ-based write LBA request.  There are many non-NCQ
equivalents of this, ATA-protocol-wise (too many to list here), but the
most likely non-NCQ ATA command you'd see is WRITE_DMA48.

 - File system in question is SU+J UFS2 mounted on /usr, and I can for
 instance rm -rf /usr/obj or just log into GNOME and try to open a
 gnome-terminal to trigger stalls;
 
 - Linux uses 31 tags (for different reason) and has no drive quirks, but
 a controller quirk;
 
 for Jeremy's topic #6, regarding the ATI/AMD SB7x0 that I am using, it
 might be worthwhile investigating the AHCI_HFLAG_IGN_SERR_INTERNAL flag
 - it gets set by Linux on the SB700 that my computer is using, see
 ahci_error_intr() in libahci.c - I am not going to interpret that for
 lack of expertise, but it does affect error handling and appears to
 ignore a certain condition.

Alexander could expand on this, but the name of the flag implies that
there are certain conditions where the SATA-level SERR condition gets
ignored (IGN).

While skimming Linux libata code and commits in the past, the only
glaringly obvious bug/issue I see is with SB600/SB700 chipsets (the
hardware revision apparently matters) and port multiplier (PMP) support
and soft resets.

Are you using a port multiplier?  I doubt it, but I have to ask.

 Why only my Samsung HDD drive triggers this but not the WD drive, I do
 not know yet.

Please provide gpart show -p ada1 output, both here and in the PR,
if you could.

I have a gut feeling I know what the issue is (and if it is what I think
it is, it's actually happening all the time, just that NCQ exacerbates
it given how command queueing works), but I won't know for sure until I
see the output.

Thanks.

-- 
| Jeremy Chadwick   j...@koitsu.org |
| UNIX Systems Administrator   http://jdc.koitsu.org/ |
| Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |


Re: Any objections/comments on axing out old ATA stack?

2013-04-03 Thread Jeremy Chadwick
On Thu, Apr 04, 2013 at 02:19:16AM +0200, Matthias Andree wrote:
 Am 04.04.2013 01:38, schrieb Jeremy Chadwick:
 
 ...
 
  While skimming Linux libata code and commits in the past, the only
  glaringly obvious bug/issue I see is with SB600/SB700 chipsets (the
  hardware revision apparently matters) and port multiplier (PMP) support
  and soft resets.
  
  Are you using a port multiplier?  I doubt it, but I have to ask.
 
 I am not using a PMP as far as I know (unless one is buried on my Asus
 M4A78T-E main board). It would seem the drives are directly attached to
 the south bridge's SATA ports.

Then the answer is nope, you're not using a PM.  Details:

http://www.serialata.org/technology/port_multipliers.asp
http://en.wikipedia.org/wiki/Port_multiplier

  Why only my Samsung HDD drive triggers this but not the WD drive, I do
  not know yet.
  
  Please provide gpart show -p ada1 output, both here and in the PR,
  if you could.
 
 =>        63  1953525105  ada1  MBR  (931G)
           63   209714337  ada1s1  freebsd  [active]  (100G)
    209714400         800  - free -  (400k)
    209715200    71680000  ada1s2  ntfs  (34G)
    281395200       15405  - free -  (7.5M)
    281410605   488263545  ada1s3  linux-data  (232G)
    769674150  1183851018  - free -  (564G)

This is what I was worried about.  Referring to your camcontrol
identify output:

 device model SAMSUNG HD103SI
 sector size logical 512, physical 512, offset 0

Hear me out entirely on this one.

My theory is that your hard disk actually uses 4096-byte sectors but is
too old to provide ATA IDENTIFY semantics to delineate between logical
vs. physical sector size.  In other words, only logical is provided,
thus logical=physical in the eyes of all software; smartctl will show
you the exact same thing too.

There are drives like this in the wild, both SSDs as well as MHDDs.
For example, the Intel 320-series SSD behaves this way too (providing
only logical size).

Do not let the capacity/size of the drive be the deciding factor; your
drive is 1TB, but I also have many 1TB MHDDs that use 4096-byte sectors.

Seagate/Samsung's specification** for the HD103SI states, and I quote:
"Byte per Sensor: 512 bytes".  Yes, it says "Sensor".  Whether or not
this documentation is correct/accurate is unknown, and when vendors have
typos in their own specification docs, I cannot help but to honour the
possibility of the information being wrong.  So I'm unsure if this drive
uses 512-byte sectors or 4096-byte sectors.

That said: in your gpart show ada1 output, none of your partitions
(FreeBSD, NTFS, nor Linux) appear to be aligned to 4096-byte boundaries.
Ideally you'd want to have these aligned to 1MB or 2MByte boundaries in
the case you ever move to an SSD.  You're also using the MBR scheme,
which does not tend to play well with alignment.
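
The alignment question is simple arithmetic on the start offsets, sketched here in Python. The start LBAs are taken from the gpart output above (the NTFS start is inferred as 209714400 + 800 = 209715200 from the free-space row before it), and the 512-byte logical sector comes from the camcontrol output; the helper name is made up for illustration. By this arithmetic the FreeBSD and Linux slices are off a 4096-byte boundary, while the NTFS slice start comes out aligned, which agrees with Matthias's remark in his follow-up that the NTFS partition sits on a multi-MB boundary.

```python
LOGICAL_SECTOR = 512   # bytes, as reported by camcontrol identify
BOUNDARY = 4096        # bytes, the suspected physical sector size

# Start LBAs as read from the gpart show output above.
STARTS = {
    "ada1s1 (freebsd)": 63,
    "ada1s2 (ntfs)": 209715200,
    "ada1s3 (linux-data)": 281410605,
}


def is_aligned(start_lba, boundary=BOUNDARY, sector=LOGICAL_SECTOR):
    """True if the partition's first byte falls on a multiple of boundary."""
    return (start_lba * sector) % boundary == 0


report = {name: is_aligned(lba) for name, lba in STARTS.items()}
for name, ok in report.items():
    print(f"{name}: {'aligned' if ok else 'NOT aligned'} to {BOUNDARY} bytes")
```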

Comparatively, your WD5002ABYS drive **does** use 512-byte sectors (I
know this for a fact).

The problem here is that I cannot guarantee you that alignment is
the problem.  The performance impact of writes to partitions which are
non-aligned is quite high, and NCQ just exacerbates this problem.  I
would love to tell you "switch to GPT and follow Warren Block's
document***", but if your NTFS partition holds Windows, and it is a
version older than Windows 7, GPT is not supported.

One piece of evidence that refutes my theory is that if the Windows and/or
Linux partitions are something you boot into and use often, I would
imagine NCQ would be used in both of those environments and would suffer
from the same issue.  Although Windows tends to hide all sorts of
transient errors from the user (sigh), Linux tends to be like FreeBSD
with regards to such issues (on the console anyway; you wouldn't see
such messages normally inside of X).

If you have the time and want to put forth the effort, I would recommend
backing up all your data on ada1, zeroing the first and last 1MByte of the
drive, and then following Warren Block's guide.  I'd just recommend
doing this:

gpart create -s gpt ada1
gpart add -t freebsd-ufs -b 2m ada1
newfs -U -j /dev/ada1p1   (or remove -j if you don't want to use SUJ)

I picked an alignment value of 2MBytes since it's both 4K-aligned and is
generally safe for things like newer SSDs that have larger NAND erase
block size (I am not going to get into a discussion about that here, so
please stay focused.  :-) )
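
For reference, the arithmetic behind that choice can be checked with a few lines of Python. This assumes gpart reads "2m" as 2 MiB (2 * 1024 * 1024 bytes) and that the drive exposes 512-byte logical sectors; the erase block sizes listed are hypothetical examples, not figures for any particular SSD.

```python
SECTOR_BYTES = 512
START_BYTES = 2 * 1024 * 1024  # assumes gpart interprets "2m" as 2 MiB

# Boundaries to test against; the erase block sizes are hypothetical.
BOUNDARIES = {
    "4 KiB physical sector": 4096,
    "128 KiB erase block (hypothetical)": 128 * 1024,
    "1 MiB erase block (hypothetical)": 1024 * 1024,
    "2 MiB erase block (hypothetical)": 2 * 1024 * 1024,
}

start_lba = START_BYTES // SECTOR_BYTES
aligned = {name: START_BYTES % size == 0 for name, size in BOUNDARIES.items()}

print("start LBA:", start_lba)
for name, ok in aligned.items():
    print(f"{name}: {'aligned' if ok else 'NOT aligned'}")
```

Any power-of-two boundary up to 2 MiB divides the start offset evenly, which is the whole point of picking a large power of two.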

If the problem is gone after that (it should be easy to induce by
writing tons and tons of data to the drive), then we can safely say that
the drive uses 4096-byte sectors and needs to be added to the quirks
list in ata_da.c.

If the problem remains after that, then further investigation is needed,
and we can safely rule out alignment.  Welcome to all the pain/effort
one has to go through when troubleshooting things like this.  :-)

Another thing: in your PR you state:

 - I am running with kern.cam.ada.default_timeout=5 which makes the
 computer recover faster

I can definitely imagine cases where

Re: Any objections/comments on axing out old ATA stack?

2013-03-31 Thread Jeremy Chadwick
On Sun, Mar 31, 2013 at 03:02:09PM -0600, Scott Long wrote:
 On Mar 31, 2013, at 7:04 AM, Victor Balada Diaz vic...@bsdes.net wrote:
  On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:
  Hi.
  
  Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA 
  stack, using only some controller drivers of old ata(4) by having 
  `options ATA_CAM` enabled in all kernels by default. I have a wish to 
  drop non-ATA_CAM ata(4) code, unused since that time from the head 
  branch to allow further ATA code cleanup.
  
  Does anyone here still use the legacy ATA stack (kernel explicitly built
  without `options ATA_CAM`) for some reason, for example as a workaround
  for some regression? Does anybody have good ideas why we should not drop
  it now?
  
  Hello,
  
  At my previous job we had troubles with NCQ on some controllers. It
  caused failures and silent data corruption. As the old ata code didn't
  use NCQ, we just used it.
  
  I reported some of the problems on 8.2[1] but the problem existed with 8.3.
  
  I no longer have access to those systems, so I don't know if the problem
  still exists or has been fixed in newer versions.
 
 So what I hear you and Matthias saying, I believe, is that it should be
 easier to force disks to fall back to non-NCQ mode, and/or have a more
 responsive black-list for problematic controllers.  Would this help the
 situation?  It's hard to justify holding back overall forward progress
 because of some bad controllers; we do several Tbps off of AHCI
 controllers with NCQ enabled on FreeBSD 9.x, enough to make up a sizable
 percentage of the internet's traffic, and we see no problems.  How can
 we move forward but also take care of you guys with problematic
 hardware?

I've read the referenced PR (157397), but there really isn't enough
technical troubleshooting/detail to determine what the root cause is.

That isn't the fault of the reporter either -- the reporter needs to be
told what information they need to provide / how to troubleshoot it.
Meaning: kernel folks who are in-the-know need to step up and help.

That PR is soon-to-be 2 years old and is missing tons of information
that even *I*, as a non-kernel guy, would find useful:

1. Output from:
   - camcontrol tags ada1 -v
   - camcontrol identify ada1
   - What sorts of filesystems are on ada1; if UFS, tunefs -p output
 would be greatly appreciated
   - If the timeouts happen during heavy I/O load, and if so, during
 what kinds of I/O load (reads or writes).

2. Does camcontrol tags ada1 -N 31 help?  I mention this because of
what is stated here:

http://lists.freebsd.org/pipermail/freebsd-stable/2013-March/072985.html

...there are statements which imply decreasing queue length may solve
the issue.  What confuses me, however, is that the queue length on my
own systems (with different models of disks, as well as an SSD) all have
a limit of 32.  I dug through the kernel source for a while but could
not easily find where this number comes from.  (I have very little
familiarity with command queuing at the protocol level)

3. Why not find out why Linux (probably libata) has a 32 (or 31?) queue
limit?  They have commit logs, and there is the LKML where you could
ask.  While I understand reluctance to add something just because Linux
does it, it doesn't appear anyone's stepped up to the plate to ask them
why; I pray this is not caused by anti-Linux sentiment.

4. The ada1 device in the PR is a Samsung Spinpoint EcoGreen F2 hard
drive (1TB, 5400rpm, 32MB cache).  Possibly the drive has firmware bugs
relating to its NCQ implementation, or possibly it's going into some
power-saving mode (it is an EcoGreen model).  I've always been wary of
the EcoGreen disks since reading about the F4 EcoGreen firmware fiasco
(even though the same page says the F1 and F3 EcoGreen had no issue):

http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks

5. We really need to have some way to print active quirks for devices,
even if it's only at boot-up, e.g.:

ada3: quirks=0x0003<4K,NO_NCQ>

I'd be happy to write the code for this (basing it on how we do CPU
flags), but as I've said in the past, kernel-land is scary to me.
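
The idea is the same as the way CPU feature flags are printed: a hex value followed by the names of the bits that are set. A minimal userland sketch in Python, just to pin down the format; the bit values assigned to 4K and NO_NCQ here are assumptions for illustration and do not come from ata_da.c, and format_quirks is a made-up helper name.

```python
# Hypothetical bit assignments, for illustration only.
QUIRK_NAMES = {
    0x0001: "4K",
    0x0002: "NO_NCQ",
}


def format_quirks(device, quirks):
    """Render a quirks bitmask in the style used for CPU feature flags."""
    names = [name for bit, name in sorted(QUIRK_NAMES.items()) if quirks & bit]
    return f"{device}: quirks=0x{quirks:04x}<{','.join(names)}>"


line = format_quirks("ada3", 0x0003)
print(line)
```

A real implementation would live in the kernel and be written in C; this only shows the string that would end up in the boot messages.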

6. The controller referenced is an ATI IXP700.  I cannot tell you how
many times on the mailing lists I've seen weird issues reported by
people using that controller.  I am in no way/shape/form saying the
issue is with the controller or with AHCI compatibility (FreeBSD vs.
ATI), because I have no proof.  I just find it very unnerving that so
many issues have been reported where that controller is involved, and
often across all sorts of different device/disk models.

All that said:

I agree a loader tunable to inhibit command queueing would be nice.
sysctl would be even more convenient (easier for real-time testing) but
I don't know the implications of turning CQ off in the middle of any
pending I/O requests.

-- 
| Jeremy Chadwick   j...@koitsu.org

Re: [HEADS UP] pkgng binary packages regression in 1.0.9. Fixed in 1.0.9_1

2013-03-20 Thread Jeremy Chadwick
On Wed, Mar 20, 2013 at 04:20:02PM +0100, Matthias Gamsjager wrote:
   Due to the security incident, there are still no official FreeBSD
  packages.
 
 Do you know what the status is on that issue?

I'd also like to find out what the status of this is.

The packages at:

ftp://ftp.freebsd.org/pub/FreeBSD/ports/amd64/packages-9-stable/

are still circa October 2012 -- that's 4-5 months ago.

While I truly and deeply understand that proper engineering design and
infrastructure changes take time, there has been absolutely no
communication presented to the community as to what has (or hasn't)
transpired, if there is (or isn't) a plan, or if people are simply
waiting until future in-person BSD* events to work things out.
freebsd-ops-announce has been silent on this matter as well:

http://lists.freebsd.org/mailman/listinfo/freebsd-ops-announce

At this point users and administrators do not know if newer packages
will be made available or if they should stick to building purely from
source.

Deep down I'm worried that this will solicit a response of "switch to
ports-mgmt/pkg and ports-mgmt/poudriere".  While I'm not opposed to the
tools themselves, I'm strongly opposed to that kind of response as I'm
tired of seeing the security incident being used as an opportunistic
crutch (as it was for the sudden cvsup/csup deprecation).

-- 
| Jeremy Chadwick   j...@koitsu.org |
| UNIX Systems Administrator   http://jdc.koitsu.org/ |
| Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |


Re: ACPI broke going from 8 to 9

2011-12-31 Thread Jeremy Chadwick
On Sat, Dec 31, 2011 at 04:17:16PM -0700, Dan Allen wrote:
 On 31 Dec 2011, at 12:34 PM, Garrett Cooper wrote:
 
  Not yet. Add 'nooptions NEW_PCIB' to your KERNCONF, recompile, and
  try booting the new kernel. See if this works.
 
 It worked!  No hang, power button works.  Nice.  I hope this experimental 
 option stays in.
 
 Thank you everyone for your help.  Happy New Years!

This option isn't documented **anywhere** in the entire src tree.  It's
purely #ifdef all over.

The code in question was committed 7 months ago.  It was MFC'd to
RELENG_8 6 months ago.  Here's the HEAD commit message:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/pci/pci.c#rev1.420

The RELENG_8 MFC is revision 1.386.2.15.

The committer is jhb@, with mav@ being the individual who tested it, so
I imagine either of these folks will have some excellent insights as to
what's causing Dan's problem.  I'm CC'ing them both directly on this
thread.

In the meantime: Dan, when you say in your original mail, "I just
upgraded my Dell OptiPlex GX270 from RELENG_8 to RELENG_9", can you
please provide uname -a output from the system when it was running
RELENG_8?  I'm looking specifically for the exact time when the kernel
was built, because there may have been fixes (that broke things for you)
between the above commit and present-day RELENG_8 (I have not examined
all commits).

-- 
| Jeremy Chadwick   jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: Benchmark (Phoronix): FreeBSD 9.0-RC2 vs. Oracle Linux 6.1 Server

2011-12-23 Thread Jeremy Chadwick
On Fri, Dec 23, 2011 at 10:00:05AM -0500, John Baldwin wrote:
 On Thursday, December 22, 2011 6:58:46 pm Jeremy Chadwick wrote:
  On Fri, Dec 23, 2011 at 12:44:14AM +0100, O. Hartmann wrote:
   On 12/21/11 19:41, Alexander Leidinger wrote:
    Hi,
    
    while the discussion continued here, some work started at some other
    place. Now... in case someone here is willing to help instead of
    talking, feel free to go to http://wiki.freebsd.org/BenchmarkAdvice
    and have a look what can be improved. The page is far from perfect
    and needs some additional people which are willing to improve it.
    
    This is only part of the problem. A tuning page in the wiki - which
    could be referenced from the benchmark page - would be great too. Any
    volunteers? A first step would be to take the tuning-man-page and
    wikify it. Other tuning sources are welcome too.
    
    Every FreeBSD dev with a wiki account can hand out write access to
    the wiki. The benchmark page gives contributor-access. If someone
    wants write access create a FirstnameLastname account and ask here
    for contributor-access.
    
    Don't worry if you think your english is not good enough, even some
    one-word notes can help (and _my_ english got already corrected by
    other people on the benchmark page).
    
    Bye,
    Alexander.
    
   
   Nice to see movement ;-)
   
   But there seems something unclear:
   
   man make.conf(5) says, that  MALLOC_PRODUCTION is a knob set in
   /etc/make.conf.
   The wiki says, MALLOC_PRODUCTION is to be set in /etc/src.conf.
   
   What's right and what's wrong now?
  
  I can say with certainty that this value belongs in /etc/make.conf
  (on RELENG_8 and earlier at least).
  
  src/share/mk/bsd.own.mk has no framework for MK_MALLOC_PRODUCTION,
  so, this is definitely a make.conf variable.
 
 Eh, normal make variables can go in src.conf as well.  They do not have
 to be listed in bsd.own.mk.  World builds include /etc/src.conf whereas
 every make invocation includes /etc/make.conf via sys.mk.  The only reason
 to use /etc/src.conf is to have a place to put variables that only affect
 make buildworld / buildkernel but do not affect other make invocations.

I was always under the impression src.conf(5) variables had to be
manually added to bsd.own.mk and similar bits (e.g.
src/tools/build/options/WITH_xxx which is what's used to create the
src.conf(5) man page), but upon your comment and manual investigation on
my part, I found you're indeed right.  Taken from bsd.own.mk:

.if !defined(_WITHOUT_SRCCONF)
SRCCONF?=   /etc/src.conf
.if exists(${SRCCONF})
.include "${SRCCONF}"
.endif
.endif

As long as third-party software doesn't depend on MALLOC_PRODUCTION for
something (I don't know why something would, but who knows; maybe
there's a third-party malloc implementation which might?), then putting
it in src.conf would be fine (src/lib/libc/stdlib files reference it).

-- 
| Jeremy Chadwick   jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: Benchmark (Phoronix): FreeBSD 9.0-RC2 vs. Oracle Linux 6.1 Server

2011-12-22 Thread Jeremy Chadwick
On Fri, Dec 23, 2011 at 12:44:14AM +0100, O. Hartmann wrote:
 On 12/21/11 19:41, Alexander Leidinger wrote:
  Hi,
  
  while the discussion continued here, some work started at some other place. 
  Now... in case someone here is willing to help instead of talking, feel 
  free to go to http://wiki.freebsd.org/BenchmarkAdvice and have a look what 
  can be improved. The page is far from perfect and needs some additional 
  people which are willing to improve it.
  
  This is only part of the problem. A tuning page in the wiki - which could 
  be referenced from the benchmark page - would be great too. Any volunteers? 
  A first step would be to take the tuning-man-page and wikify it. Other 
  tuning sources are welcome too.
  
  Every FreeBSD dev with a wiki account can hand out write access to the 
  wiki. The benchmark page gives contributor-access. If someone wants write 
  access create a FirstnameLastname account and ask here for 
  contributor-access.
  
  Don't worry if you think your english is not good enough, even some 
  one-word notes can help (and _my_ english got already corrected by other 
  people on the benchmark page).
  
  Bye,
  Alexander.
  
  
  
  
 
 Nice to see movement ;-)
 
 But there seems something unclear:
 
 man make.conf(5) says, that  MALLOC_PRODUCTION is a knob set in
 /etc/make.conf.
 The wiki says, MALLOC_PRODUCTION is to be set in /etc/src.conf.
 
 What's right and what's wrong now?

I can say with certainty that this value belongs in /etc/make.conf
(on RELENG_8 and earlier at least).

src/share/mk/bsd.own.mk has no framework for MK_MALLOC_PRODUCTION,
so, this is definitely a make.conf variable.

-- 
| Jeremy Chadwick   jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: Benchmark (Phoronix): FreeBSD 9.0-RC2 vs. Oracle Linux 6.1 Server

2011-12-20 Thread Jeremy Chadwick
 tests, reboots, etc. -- hours of work -- and if I get
that wrong, it's wasted effort (thus wasted developer time).  I want to
get it right.  :-)

-- 
| Jeremy Chadwick   jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: can a wrong alignment cause a decrease in a hdd's life expectancy?

2011-12-19 Thread Jeremy Chadwick
On Mon, Dec 19, 2011 at 03:20:10PM -0800, Jeremy Chadwick wrote:
 On Mon, Dec 19, 2011 at 10:56:33PM +, Alexander Best wrote:
  On Mon Dec 19 11, Poul-Henning Kamp wrote:
   In message 20111219224700.ga75...@freebsd.org, Alexander Best writes:
   On Mon Dec 19 11, Poul-Henning Kamp wrote:
In message 20111219221617.ga70...@freebsd.org, Alexander Best writes:

ps: the hdd only gets mounted read-only!

There is no known wear-effects in flash storage as long as you
only read.

You may need to do refresh-writes every 5-10 years to avoid
tunnel-leakage bit errors, but most flash controllers use semi-long
ECC syndromes and will do so on first bit that gives an read error.
   
   this is a regular hdd i believe -- no ssd. at least when i plug it into my
   usb drive i hear the hdd spinning up and causing vibrations. i don't think
   that would be the case with an ssd.
   
   Ahh, sorry, I don't know why I thought it was flash.
  
  no problem. so will the improper alignment also not cause a life expectancy
  shortage in case of a hdd (non-flash-based)?
 
 The improper alignment will result in sub-par write performance, and a
 slight decrease in read performance writes -- but will not impact life
 expectancy or harm the drive in any way.

This should have read "...slight decrease in read performance", not
"read performance writes".  Editing mistake on my part.  :-)



Re: can a wrong alignment cause a decrease in a hdd's life expectancy?

2011-12-19 Thread Jeremy Chadwick
On Mon, Dec 19, 2011 at 10:56:33PM +, Alexander Best wrote:
 On Mon Dec 19 11, Poul-Henning Kamp wrote:
  In message 20111219224700.ga75...@freebsd.org, Alexander Best writes:
  On Mon Dec 19 11, Poul-Henning Kamp wrote:
   In message 20111219221617.ga70...@freebsd.org, Alexander Best writes:
   
   ps: the hdd only gets mounted read-only!
   
   There is no known wear-effects in flash storage as long as you
   only read.
   
   You may need to do refresh-writes every 5-10 years to avoid
   tunnel-leakage bit errors, but most flash controllers use semi-long
   ECC syndromes and will do so on first bit that gives an read error.
  
  this is a regular hdd i believe -- no ssd. at least when i plug it into my
  usb drive i hear the hdd spinning up and causing vibrations. i don't think
  that would be the case with an ssd.
  
  Ahh, sorry, I don't know why I thought it was flash.
 
 no problem. so will the improper alignment also not cause a life expectancy
 shortage in case of a hdd (non-flash-based)?

The improper alignment will result in sub-par write performance, and a
slight decrease in read performance writes -- but will not impact life
expectancy or harm the drive in any way.

I strongly recommend that you rectify the situation before you get too
carried away with software installations, etc.

And yes, I am aware that what you have is a mechanical HDD, not an SSD (I
say this in advance of what I'm about to write).

If you need a safe alignment value, most software on Windows
(including Windows 7) picks a value of 2MBytes as the alignment offset,
which I believe is LBA 4096, since everything software-wise uses
512-byte sectors.  That's calculated via: 2097152 / 512 = 4096.

This number is also evenly divisible by 4096 bytes (which is what you're
trying to ensure for performance).
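
For anyone who wants to check the arithmetic, here is a minimal Python
sketch.  The function names (offset_to_lba, is_aligned) are made up for
this example, and the 512-byte logical / 4096-byte physical sector sizes
are simply the assumptions used in this thread, not values read from any
drive.

```python
LOGICAL_SECTOR = 512     # bytes per LBA, as assumed in the text
PHYSICAL_SECTOR = 4096   # bytes per physical sector on a 4K drive


def offset_to_lba(offset_bytes, sector=LOGICAL_SECTOR):
    """Convert a byte offset into a starting LBA (must be a whole sector)."""
    if offset_bytes % sector != 0:
        raise ValueError("offset is not a whole number of sectors")
    return offset_bytes // sector


def is_aligned(lba, boundary=PHYSICAL_SECTOR, sector=LOGICAL_SECTOR):
    """True if the byte offset of this LBA is a multiple of the boundary."""
    return (lba * sector) % boundary == 0


if __name__ == "__main__":
    two_mib = 2 * 1024 * 1024          # 2097152 bytes
    start = offset_to_lba(two_mib)     # 2097152 / 512
    print(start, is_aligned(start))    # prints: 4096 True
```

The same is_aligned() check can be pointed at any other candidate start
sector, or at a larger boundary, by changing the arguments.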

Readers, as well as you, may wonder where the magical 2MByte value
comes from, and whether you can pick something smaller.  Yes, you can pick
something smaller, but the value itself stems from the added complexity
of SSDs and NAND erase page size vs. NAND page size.  A value of 2MBytes
works well on all brands of SSDs on the market (as of this writing).

Which reminds me -- I need to go back and redo most of our systems that
use Intel SSDs, since at the time I picked the default offset in
sysinstall (LBA 63, thus 64 * 512 = 32KBytes), which though divisible by
4096, is not optimal for NAND erase page size.

I would love to advocate FreeBSD change sysinstall/bsdinstall to use a
default offset of 2MBytes, but I imagine that would upset a lot of
people who install FreeBSD on limited space devices (CF, etc.).
Honestly though, with the size of media these days

 and one other question: the hdd also supports usb 3. will the improper
 alignment have any effect (speed wise) when connected via usb 3, or is even
 usb 3 too slow to notice the performance drop due to the improper alignment?

USB 3.0 vs. 2.0 vs. eSATA vs. native SATA has no bearing on the
situation.  Those are transport protocols that define maximum
bandwidth.

By the way, the hard disk itself does not support USB 3.0 -- your
drive is in an enclosure that contains a SATA-USB3.0 conversion
chipset inside.  If you open the enclosure, you will find the hard disk
is SATA, and probably supports SATA600.



Re: Benchmark (Phoronix): FreeBSD 9.0-RC2 vs. Oracle Linux 6.1 Server

2011-12-18 Thread Jeremy Chadwick
/universities; the first book is
basically a beginner's guide to CPU architecture.  The book is also a
bit old at that.  Individual proceeded to look up where the article
author went to school, and noted that said school's CPU architecture
course **ends** with that book.

The user/viewer demographic of overclockers.com is going to be
significantly different from that of phoronix.com -- you know that I'm
sure.  The point is that you should be aware that there is going to
be significant discussions that come from publishing such benchmark
comparisons with such a demographic.  Things that indicate severe
performance differential (e.g. 10x to 100x worse) are going to be
focused on and criticised -- and hopefully in a socially-agreeable
manner[1] -- and in a much different way than, say, a 3D video card
review site (lol ur pc sux if u spend onl $4000 on it lol).

The first step is to try and figure out what exactly you're seeing and
why it's so significantly different when compared to other OSes.

[1]: I'm sure by now you know that the BSDs in general tend to harbour a
community of folks who are more argumentative/aggressive than, say,
Linux (generally speaking).  In this thread though, I think all of us
really want to assist in some way to figure out what exactly is going on
here, scheduler-wise, and see if we can put something together to hand
developers who are responsible for said code and see what comes of it.
Remember, we're all here to try and make things better... I hope.  :-)

Footnote: It's nice meeting you (indirectly), I was always curious who
did the phoronix.com reviews/stuff when it came to FreeBSD.
Greetings!



Re: Benchmark (Phoronix): FreeBSD 9.0-RC2 vs. Oracle Linux 6.1 Server

2011-12-18 Thread Jeremy Chadwick
On Thu, Dec 15, 2011 at 05:32:47AM -0700, Samuel J. Greear wrote:
  Well, the only way it's going to get fixed is if someone sits down,
  replicates it, and starts to document exactly what it is that these
  benchmarks are/aren't doing.
 
 
 I think you will find that investigation is largely a waste of time,
 because not only are some of these benchmarks just downright silly,
 there are huge differences in the environments (compiler versions),
 etc., etc., leading to a largely apples/oranges comparison. But also
 the analysis and reporting of the results by Phoronix is simply
 moronic to the point of being worse than useless; they are spreading
 misinformation.
 
 Take the first test as an example, Blogbench read. This doesn't raise
 any red flags, right? At least not until you realize that Blogbench
 isn't a read test, it's a read/write test. So what they have done here
 is run a read/write test and then thrown away the write results for
 both platforms and reported only the read results. If you dig down
 into the actual results,
 http://openbenchmarking.org/result/1112113-AR-ORACLELIN37 -- you will
 see two Blogbench numbers, one for read and another for write. These
 were both taken from the same Blogbench run, so FreeBSD optimizes
 writes over reads, that's probably a good thing for your data but a
 bad thing when someone totally misrepresents benchmark results.
 
 Other benchmarks in the Phoronix suite and their representations are
 similarly flawed, _ALL_ of these results should be ignored and no time
 should be wasted by any FreeBSD committer further evaluating this
 garbage. (Yes, I have been down this rabbit hole).

For sake of argument, let's say we throw out the Phoronix benchmarks as
a data source (I don't think the benchmark specifically implied or
stated this is all because of SCHED_ULE though; remember, that's what
we're supposed to be focusing on.  There may not be a direct correlation
between the Phoronix benchmarks and the ULE issue reported here...).
That said: thrown out, data ignored, done.

Now what?  Where are we?  We're right back where we were a day or two
ago; meaning no closer to solving the dilemma reported by users and
SCHED_ULE.  Heck, we're not even sure if there is an issue, other than
some folks confirming that SCHED_4BSD performs better for them (that's
what started this whole thread), and there are at least a couple which
have stated this.

So given the above semi-devil's-advocate response -- Sam, do you have
something positive or progressive to offer so we can move forward on the
ULE vs. 4BSD debacle?  :-)  The smiley is meant to be sincere, not
sarcastic.

I'm getting to the point where I'm considering formulating a private
mail to Jeff Roberson, requesting that he be aware of the discussion
that's happening (not that he necessarily follow or read it), and that
based on what I can tell we're at a roadblock -- nobody so far is
absolutely certain how to benchmark and compare ULE vs. 4BSD in
multiple ways, so that those of us involved here can run such utilities
and provide the data somewhere central for devs to review.  I only
mention this because so far I haven't seen anyone really say "okay, this
is what we should be using for these kinds of tests".  Yay nature of the
beast.



Re: SCHED_ULE should not be the default

2011-12-18 Thread Jeremy Chadwick
On Thu, Dec 15, 2011 at 05:26:27PM +0100, Attilio Rao wrote:
 2011/12/13 Jeremy Chadwick free...@jdc.parodius.com:
  On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote:
   Not fully right, boinc defaults to run on idprio 31 so this isn't an
   issue. And yes, there are cases where SCHED_ULE shows much better
   performance than SCHED_4BSD. [...]
 
  Do we have any proof at hand for such cases where SCHED_ULE performs
  much better than SCHED_4BSD? Whenever the subject comes up, it is
  mentioned that SCHED_ULE has better performance on boxes with ncpu > 2.
  But in the end I see contradictory statements here. People
  complain about poor performance (especially in scientific environments),
  and others counter that this is not the case.
 
  Within our department, we developed highly scalable code for planetary
  science purposes on imagery. It utilizes GPUs via OpenCL if
  present. Otherwise it grabs as many cores as it can.
  By the end of this year I'll get a new desktop box based on Intel's new
  Sandy Bridge-E architecture with plenty of memory. If the colleague who
  developed the code is willing to perform some benchmarks on the same
  hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most
  recent Suse. For FreeBSD I also intend to look at performance with both
  available schedulers.
 
  This is in no way shape or form the same kind of benchmark as what
  you're planning to do, but I thought I'd throw it out there for folks to
  take in as they see fit.
 
  I know folks were focused mainly on buildworld.
 
  I personally would find it interesting if someone with a higher-end
  system (e.g. 2 physical CPUs, with 6 or 8 cores per CPU) was to do the
  same test (changing -jX to -j{numofcores} of course).
 
 
 
  sched_ule
  ===
  - time make -j2 buildworld
    1689.831u 229.328s 18:46.20 170.4% 6566+2051k 432+4264io 4565pf+0w
  - time make -j2 buildkernel
    640.542u 87.737s 9:01.38 134.5% 6490+1920k 134+5968io 0pf+0w
 
 
  sched_4bsd
  
  - time make -j2 buildworld
    1662.793u 206.908s 17:12.02 181.1% 6578+2054k 23750+4271io 6451pf+0w
  - time make -j2 buildkernel
    638.717u 76.146s 8:34.90 138.8% 6530+1927k 6415+5903io 0pf+0w
 
 
  software
  ==
  * sched_ule test:  FreeBSD 8.2-STABLE, Thu Dec  1 04:37:29 PST 2011
  * sched_4bsd test: FreeBSD 8.2-STABLE, Mon Dec 12 22:42:54 PST 2011
 
 Hi Jeremy,
 thanks for the time you spent on this.
 
 However, I wanted to ask/let you note 3 things:
 1) Did you use 2 different code base for the test? (one updated on
 December 1 and another one on December 12)

No; src-all (/usr/src on this system) was not updated between December
1st and December 12th PST.  I do believe I updated it today (15th PST).
I can/will obviously hold off so that we have a consistent code base for
comparing numbers between schedulers during buildworld and/or
buildkernel.

 2) Please note that you should have repeated this test several times
 (basically until you don't get a standard deviation which is
 acceptable with ministat) and report the ministat output

This is the first time I have heard of ministat(1).  I'm pretty sure I
see what it's for and how it applies to this situation, but boy that man
page could use some clarification (I have 3 people looking at this thing
right now trying to figure out what means what in the graph :-) ).
Anyway, graph or not, I see the point.

Regarding multiple tests: yup, you're absolutely right, the only way to
do it would be to run a sequence of tests repeatedly (probably 10 per
scheduler).  Reboots and rm -fr /usr/obj/* would be required after each
test too, to guarantee empty kernel caches (of all types) consistently
every time.

What I posted was supposed to give people just a general idea if there
was any gigantic difference between the two, and there really isn't.
But, as others have stated (and you below), buildworld may not be an
effective way to benchmark what we're trying to test.

Hence me wondering exactly what would make for a good test.  Example:

1. Run + background some program that beats on things (I really don't
know what; creation/deletion of threads?  CPU benchmark?  bonnie++?),
with output going to /dev/null.
2. Run + background "time make -j2 buildworld" with output going to /dev/null
3. Record/save output from time.
4. rm -fr /usr/obj && shutdown -r now
5. Repeat all steps ~10 times
6. Adjust kernel configuration file to use other scheduler
7. Repeat steps 1-5.

What I'm trying to figure out is what #1 and #2 should be in the above
example.
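
Whatever ends up being used for #1 and #2, the recorded elapsed times
from the ~10 runs per scheduler still need to be boiled down somehow.
Below is a rough Python sketch of that step, under the assumption that
the wall-clock times have already been collected as plain seconds.  The
function names are invented for this example, and the final check is a
deliberately crude stand-in, not the Student's t analysis that
ministat(1) performs.

```python
import statistics


def summarise(elapsed_seconds):
    """Return (mean, sample standard deviation) for one scheduler's runs."""
    if len(elapsed_seconds) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return statistics.mean(elapsed_seconds), statistics.stdev(elapsed_seconds)


def relative_difference(baseline_mean, other_mean):
    """Difference of other vs. baseline, as a percentage of the baseline."""
    return (other_mean - baseline_mean) / baseline_mean * 100.0


def looks_significant(runs_a, runs_b):
    """Crude check: do the means differ by more than the combined spread?

    This is only a rough first look; use ministat(1) for a real answer.
    """
    mean_a, sd_a = summarise(runs_a)
    mean_b, sd_b = summarise(runs_b)
    return abs(mean_a - mean_b) > (sd_a + sd_b)
```

Feeding it one list of elapsed times per scheduler gives a mean, a
spread, and a percentage difference, which is at least enough to see
whether a difference is larger than the run-to-run noise.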

 3) The difference is less than 2% which I suspect is really

Re: SCHED_ULE should not be the default

2011-12-13 Thread Jeremy Chadwick
On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote:
  Not fully right, boinc defaults to run on idprio 31 so this isn't an
  issue. And yes, there are cases where SCHED_ULE shows much better
  performance than SCHED_4BSD.  [...]
 
 Do we have any proof at hand for such cases where SCHED_ULE performs
 much better than SCHED_4BSD? Whenever the subject comes up, it is
 mentioned that SCHED_ULE has better performance on boxes with ncpu > 2.
 But in the end I see contradictory statements here. People
 complain about poor performance (especially in scientific environments),
 and others counter that this is not the case.
 
 Within our department, we developed highly scalable code for planetary
 science purposes on imagery. It utilizes GPUs via OpenCL if
 present. Otherwise it grabs as many cores as it can.
 By the end of this year I'll get a new desktop box based on Intel's new
 Sandy Bridge-E architecture with plenty of memory. If the colleague who
 developed the code is willing to perform some benchmarks on the same
 hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most
 recent Suse. For FreeBSD I also intend to look at performance with both
 available schedulers.

This is in no way shape or form the same kind of benchmark as what
you're planning to do, but I thought I'd throw it out there for folks to
take in as they see fit.

I know folks were focused mainly on buildworld.

I personally would find it interesting if someone with a higher-end
system (e.g. 2 physical CPUs, with 6 or 8 cores per CPU) was to do the
same test (changing -jX to -j{numofcores} of course).



sched_ule
===
- time make -j2 buildworld
  1689.831u 229.328s 18:46.20 170.4% 6566+2051k 432+4264io 4565pf+0w
- time make -j2 buildkernel
  640.542u 87.737s 9:01.38 134.5% 6490+1920k 134+5968io 0pf+0w


sched_4bsd

- time make -j2 buildworld
  1662.793u 206.908s 17:12.02 181.1% 6578+2054k 23750+4271io 6451pf+0w
- time make -j2 buildkernel
  638.717u 76.146s 8:34.90 138.8% 6530+1927k 6415+5903io 0pf+0w
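
To compare the wall-clock figures without doing the minutes-to-seconds
conversion by hand, the csh time lines above can be parsed with a few
lines of Python.  This is only a sketch: the regular expression covers
just the leading user/system/elapsed/CPU fields of the format shown
here, and the function names are made up for the example.

```python
import re

# Matches the leading fields of csh's built-in `time` output:
#   <user>u <system>s <elapsed> <cpu>% ...
TIME_RE = re.compile(
    r"(?P<user>[\d.]+)u\s+(?P<system>[\d.]+)s\s+"
    r"(?P<elapsed>[\d:.]+)\s+(?P<cpu>[\d.]+)%"
)


def elapsed_to_seconds(text):
    """Convert 'm:ss.ss' or 'h:mm:ss' into seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_time_line(line):
    """Extract user, system, elapsed (seconds) and CPU% from one line."""
    match = TIME_RE.search(line)
    if match is None:
        raise ValueError("not a csh time line: %r" % line)
    return {
        "user": float(match.group("user")),
        "system": float(match.group("system")),
        "elapsed": elapsed_to_seconds(match.group("elapsed")),
        "cpu_percent": float(match.group("cpu")),
    }


if __name__ == "__main__":
    ule = parse_time_line(
        "1689.831u 229.328s 18:46.20 170.4% 6566+2051k 432+4264io 4565pf+0w")
    bsd = parse_time_line(
        "1662.793u 206.908s 17:12.02 181.1% 6578+2054k 23750+4271io 6451pf+0w")
    slower = (ule["elapsed"] - bsd["elapsed"]) / bsd["elapsed"] * 100
    print("buildworld: ULE %.2fs, 4BSD %.2fs, difference %.1f%%"
          % (ule["elapsed"], bsd["elapsed"], slower))
```

For the two buildworld lines this works out to roughly 1126 seconds
versus 1032 seconds of elapsed time.  Keep in mind these are single
runs, so the numbers say nothing about run-to-run variation.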


software
==
* sched_ule test:  FreeBSD 8.2-STABLE, Thu Dec  1 04:37:29 PST 2011
* sched_4bsd test: FreeBSD 8.2-STABLE, Mon Dec 12 22:42:54 PST 2011


hardware
==
* Intel Core 2 Duo E8400, 3GHz
* Supermicro X7SBA
* 8GB ECC RAM (4x2GB), DDR2-800
* Intel 320-series SSD, 80GB: /, swap, /var, /tmp, /usr


tuning adjustments / etc.
===
* Before each scheduler test, system was rebooted to ensure I/O cache
  and other whatnots were empty
* All filesystems stock UFS2 + SU (root is non-SU)
* All filesystems had tunefs -t enable applied to them
* powerd(8) in use, with two rc.conf variables (per CPU spec):

performance_cx_lowest=C2
economy_cx_lowest=C2

* loader.conf

kern.maxdsiz=2560M
kern.dfldsiz=2560M
kern.maxssiz=256M
ahci_load=yes
hint.p4tcc.0.disabled=1
hint.acpi_throttle.0.disabled=1
vfs.zfs.arc_max=5120M

* make.conf

CPUTYPE?=core2

* src.conf

WITHOUT_INET6=true
WITHOUT_IPFILTER=true
WITHOUT_LIB32=true
WITHOUT_KERBEROS=true
WITHOUT_PAM_SUPPORT=true
WITHOUT_PROFILE=true
WITHOUT_SENDMAIL=true

* kernel configuration
  - note: between kernel builds, config was changed to either use
SCHED_4BSD or SCHED_ULE respectively.

cpu             HAMMER
ident           GENERIC

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols

options         SCHED_4BSD              # Classic BSD scheduler
#options        SCHED_ULE               # ULE scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES             # Enable FFS soft updates support
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         UFS_GJOURNAL            # Enable gjournal-based UFS journaling
options         MD_ROOT                 # MD is a potential root device
options         NFSCLIENT               # Network Filesystem Client
options         NFSSERVER               # Network Filesystem Server
options         NFSLOCKD                # Network Lock Manager
options         NFS_ROOT                # NFS usable as /, requires NFSCLIENT
options         MSDOSFS                 # MSDOS Filesystem
options         CD9660                  # ISO 9660 Filesystem
options         PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         GEOM_PART_GPT           # GUID Partition Tables.
options

Re: SCHED_ULE should not be the default

2011-12-13 Thread Jeremy Chadwick
On Tue, Dec 13, 2011 at 12:13:42PM +0100, O. Hartmann wrote:
 On 12/12/11 16:13, Vincent Hoffman wrote:
  
  On 12/12/2011 13:47, O. Hartmann wrote:
  
  Not fully right, boinc defaults to run on idprio 31 so this isn't an
  issue. And yes, there are cases where SCHED_ULE shows much better
  performance than SCHED_4BSD. [...]
  
  Do we have any proof at hand for such cases where SCHED_ULE performs
  much better than SCHED_4BSD? Whenever the subject comes up, it is
  mentioned that SCHED_ULE has better performance on boxes with ncpu > 2.
  But in the end I see contradictory statements here. People
  complain about poor performance (especially in scientific environments),
  and others counter that this is not the case.
  It's all a little old now but some of the stuff in
  http://people.freebsd.org/~kris/scaling/
  covers improvements that were seen.
  
  http://jeffr-tech.livejournal.com/5705.html
  shows a little too; reading through Jeff's blog is worth it as it has some
  interesting stuff on SCHED_ULE.
  
  I thought there were some more benchmarks floating round but can't find
  any with a quick google.
  
  
  Vince
  
  
 
 Interesting, there seems to be a much more performant scheduler in 7.0,
 called SCHED_SMP. I have some faint recalls on that ... where is this
 beast gone?

Boy I sure hope I remember this right.  I strongly urge others to
correct me where I'm wrong; thanks in advance!

The classic scheduler, SCHED_4BSD, was implemented back before there was
oxygen.  sched_4bsd(4) mentions this.  No need to discuss it.

Jeff Roberson began working on the first-generation ULE scheduler
during the days of FreeBSD 5.x (I believe 5.1), and a paper on it was
presented at USENIX circa 2003:
http://www.usenix.org/event/bsdcon03/tech/full_papers/roberson/roberson.pdf

Over the following years, Jeff (and others I assume -- maybe folks like
George Neville-Neil and/or Kirk McKusick?) adjusted and tinkered with
some of the semantics and models/methods.  If I remember right, some of
these quirks/fixes were committed.  All of this was happening under the
scheduler that was then called SCHED_ULE, but it was "ULE 1.0" for lack
of better terminology.

This scheduler did not perform well, if I remember right, and Jeff was
quite honest about that.  From this point forward, Jeff began idealising
and working on a scheduler which he called SCHED_SMP -- think of it as
ULE 2.0, again, for lack of better terminology.  It was different than
the existing SCHED_ULE scheduler, hence a different name.  Jeff blogged
about this in early 2007, using exactly that term (ULE 2.0):
http://jeffr-tech.livejournal.com/3729.html

In mid-2007, prior to FreeBSD 7.0-RELEASE, Jeff announced that
effectively he wanted to make SCHED_ULE do what SCHED_SMP did, and
provided a patch to SCHED_ULE to accomplish just that:
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2007-07/msg00755.html

Full thread is here (beware -- many replies):
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2007-07/threads.html#00755

The patch mentioned above was merged into HEAD on 2007/07/19.
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/sched_ule.c#rev1.202

So in effect, as of 2007/07/19, SCHED_ULE became SCHED_SMP.

FreeBSD 7.0-RELEASE was released on 2008/02/27, and the above
commit/changes were available at that time as well (meaning: RELENG_7
and RELENG_7_0 at that moment in time should have included the patch
from the above paragraph).

The document released by Kris Kennaway hinted at those changes and
performance improvements:
http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf

Keep in mind, however, that at that time kernel configuration files
(GENERIC, etc.) still defaulted to SCHED_4BSD.

The default scheduler in kernel config files (GENERIC, etc.) for i386
and amd64 (not sure about others) was changed on 2007/10/19:
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/conf/GENERIC#rev1.475
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/amd64/conf/GENERIC#rev1.485

This was done *prior* to FreeBSD 7.1-RELEASE.  So, it first became
available as the default scheduler for the masses when 7.1-RELEASE
came out on 2009/01/05.

All of the answers, in a roundabout and non-user-friendly way, are
available by examining the commit history for src/sys/kern/sched_ule.c.
It's hard to follow especially given that you have to consider all
the releases/branchpoints that took place over time, but:
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/sched_ule.c

Are we having fun yet?  :-)


Re: zfs i/o hangs on 9-PRERELEASE

2011-11-26 Thread Jeremy Chadwick
On Sat, Nov 26, 2011 at 04:47:35PM -0600, Mark Felder wrote:
 It appears that I'm mistaken about those messages then. However this does 
 both happen on my AMD x6 and Intel Atom machines with different hard drives, 
 controllers, etc. I feel it would be unlikely to be hardware. 
 
 Unfortunately the procstat command is probably of no use because I can't 
 interact with the console or ssh for the periods of time when it is hanging 
 (sometimes in excess of a minute). Zpool scrubs come up clean and I never see 
 any errors reported. I've been running this hardware for 2 years and v28 for 
 quite some time. It doesn't seem like it started happening until I upgraded 
 to a build past RC1. I don't know where to find RC1 media and I don't know 
 the svn revision of RC1 so I haven't tried.

The kernel backtrace you provided indicates a problem in pf(4), not ZFS.
What piece am I missing?



Re: SIOCGIFADDR broken on 9.0-RC1?

2011-11-15 Thread Jeremy Chadwick
On Tue, Nov 15, 2011 at 11:35:37PM +0100, GR wrote:
 From Kristof Provost kris...@sigsegv.be:
 [..]
  The 'ia' pointer is later used to return the IP address.
  
  In other words: it returns the first address on the interface
  of type IF_INET (which isn't assigned to a jail).
  
  I think the order of the addresses is not fixed, or rather it depends
  on the order in which you assign addresses. In the handling of
  SIOCSIFADDR new addresses are just appended:
  
  TAILQ_INSERT_TAIL(&ifp->if_addrhead, ifa, ifa_link);
  
  I don't believe this has changed since 8.0. Is it possible something
  changed in the network initialisation, leading to the addresses being
  assigned in a different order?
  
  Eagerly awaiting to be told I'm wrong,
  Kristof
 
 Thanks Kristof. It appears you are right, the order of assignment is
 important.
 I configured my interface using DHCP, and added aliases (all in /etc/rc.conf).
 But on the 8.2-RELEASE, I used static configuration.
 
 So, I switched to static assignment and it changes the behaviour (and
 fixes the bug).
 My guess is that during the time waiting for the DHCP offer, all aliases are 
 already configured on the network interface, and the IP address given by DHCP 
 is added at the end of the tail.
 
 Is that a wanted behaviour? I find it dangerous (i.e. not exactly what a user 
 is expecting).
 
 Note: my aliases are attributed to jails.

I would recommend adding synchronous_dhclient="yes" to /etc/rc.conf.
This will cause dhclient (the DHCP client) to wait until it gets an
answer + IP back from the DHCP server before continuing with the rc.d
scripts.  The default is "no".
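
To illustrate why the ordering matters, here is a toy Python model of
the behaviour described in this thread: addresses are appended to the
tail of a per-interface list, and the lookup returns the first IPv4
entry it finds.  This is not kernel code.  It ignores the jail check
entirely, the addresses come from the RFC 5737 documentation range, and
it assumes the aliases are added after the lease when dhclient runs
synchronously.

```python
def configure(events):
    """Append addresses in the order they are assigned (tail insertion)."""
    addr_list = []
    for family, address in events:
        addr_list.append((family, address))
    return addr_list


def first_inet_address(addr_list):
    """Return the first IPv4 entry, mimicking a 'first match wins' lookup."""
    for family, address in addr_list:
        if family == "inet":
            return address
    return None


# Purely illustrative addresses (RFC 5737 documentation range).
DHCP_LEASE = ("inet", "192.0.2.10")
ALIASES = [("inet", "192.0.2.21"), ("inet", "192.0.2.22")]

if __name__ == "__main__":
    # Asynchronous dhclient: the aliases land before the lease arrives.
    async_order = configure(ALIASES + [DHCP_LEASE])
    # Synchronous dhclient: the lease is assigned before the aliases.
    sync_order = configure([DHCP_LEASE] + ALIASES)
    print(first_inet_address(async_order))  # prints: 192.0.2.21
    print(first_inet_address(sync_order))   # prints: 192.0.2.10
```

In the asynchronous case the first alias "wins", which matches what was
observed with DHCP plus aliases in rc.conf.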



Re: FreeBSD 10.0-CURRENT/amd64: Weirdness with LOCALE settings: ghostswitching in csh?

2011-11-04 Thread Jeremy Chadwick
On Fri, Nov 04, 2011 at 07:49:52AM +0100, O. Hartmann wrote:
 Am 11/03/11 23:48, schrieb Jeremy Chadwick:
  On Thu, Nov 03, 2011 at 11:17:08PM +0100, O. Hartmann wrote:
  Hello.
  I noticed something weird in FreeBSD 10.0-CURRENT/amd64 (CLANG
  compiled), built today (buildworld).
 
  After working the whole day coding some Python scripts and committing the code
  to my subversion server (most recent subversion from the ports
  collection, the server is a FreeBSD 9.0-RC1/amd64 box, also system
  compiled with CLANG, most recent as compiled world of today), suddenly,
  out of the blue, trying again to commit I get this error:
 
  svn: warning: cannot set LC_CTYPE locale
  svn: warning: environment variable LC_CTYPE is de_DE.ISO-8859-1
  svn: warning: please check that your locale name is correct
 
 
  Checking csh shell settings with 'locale':
  LANG=
  LC_CTYPE=C
  LC_COLLATE=C
  LC_TIME=C
  LC_NUMERIC=C
  LC_MONETARY=C
  LC_MESSAGES=C
  LC_ALL=
 
 
  Checking my settings from /etc/csh.cshrc and ./.cshrc or .login reveals
  localised settings for some of the locales as I need those:
 
  (set in $HOME/.cshrc)
  setenv  LC_CTYPE        de_DE.ISO-8859-1
  setenv  LC_TIME         de_DE.ISO-8859-1
  setenv  LC_MONETARY     de_DE.ISO-8859-1
 
  What is going on?
 
  I realised this behaviour now several times, first time I thought I did
  something and I couldn't remember, but this time, only two terminal
  windows were opened and the whole day committing data to the repository
  wasn't an issue.
 
  Is there an explanation for this?
  
  It sounds like a problem specific to the client end, meaning your
  -CURRENT box.  If that's the case: shouldn't this mail have gone to
  freebsd-current@ instead of freebsd-stable@ ?  What am I missing?
 
 Mea culpa, mea culpa, mea maxima culpa!
 
 It was intended to send the mail to CURRENT. Sorry, missed the list entry
 by one row ... Can you please be so kind and show mercy?

No worries.  I wasn't sure if there was a reason -stable was
involved; I saw it and thought "Hmm, he mentions a 9.0-RC1/amd64 box,
maybe that's where the problem is?  I must be missing something", so I
thought I'd ask.  Mistakes happen, especially ones from me!  :-)

  As for your problem: your locale looks incorrect.  It's
  de_DE.ISO8859-1.  Note that yours has an extra hyphen, which probably
  explains the error (sort of).
  
  $ ls -ld /usr/share/locale/de_DE*
  drwxr-xr-x  2 root  wheel  512 Sep 28 14:36 /usr/share/locale/de_DE.ISO8859-1/
  drwxr-xr-x  2 root  wheel  512 Sep 28 14:36 /usr/share/locale/de_DE.ISO8859-15/
  drwxr-xr-x  2 root  wheel  512 Sep 28 14:36 /usr/share/locale/de_DE.UTF-8/
  
  As for the fact that it's random: I cannot explain why a sub-shell
  might get spawned in some cases but not others.
 
 I corrected this. Sorry. I feel a bit confused, since sometimes it is
 ISO-8859-1 and sometimes ISO8859-1. I got confused again.
 
 After correcting that, the locale variables have been set correctly.
 
 I will now check whether this also influences this weird random behaviour.

I find the randomness of the situation more perplexing than fixing
your locale (I have a feeling you do too).  I imagine this will probably
fix the errors you were seeing, but I'm still surprised that the errors
would happen seemingly intermittently.

I wonder how one could go about debugging such a thing.  Hmm.  Are there
any environment complexities you might have, such as using GNU screen or
tmux?  I'm cycling through ideas as they come to me.
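
On the locale name itself: a rough way to catch a misspelled name before
putting it in .cshrc is to check that a matching directory exists under
/usr/share/locale, the same place the ls output quoted above looked.  This is
only a sketch in plain sh; the function names locale_dir_exists and
check_locale_vars are made up for illustration, and the optional directory
argument is there only so the check can be pointed somewhere else.

```shell
#!/bin/sh
# locale_dir_exists NAME [BASEDIR]
# Succeeds if BASEDIR/NAME is a directory.  BASEDIR defaults to
# /usr/share/locale (assumption: locale data lives there, as in the
# ls output quoted above).  Hypothetical helper, not a system tool.
locale_dir_exists() {
    name=$1
    base=${2:-/usr/share/locale}
    [ -n "$name" ] && [ -d "$base/$name" ]
}

# check_locale_vars [BASEDIR]
# Prints one line for every locale variable whose value has no
# matching directory.  Empty values, "C" and "POSIX" are skipped.
check_locale_vars() {
    base=${1:-/usr/share/locale}
    for var in LANG LC_CTYPE LC_COLLATE LC_TIME LC_NUMERIC \
               LC_MONETARY LC_MESSAGES LC_ALL; do
        eval "val=\${$var:-}"
        case $val in
            ''|C|POSIX) continue ;;
        esac
        locale_dir_exists "$val" "$base" ||
            echo "$var=$val: no such locale under $base"
    done
}
```

Running check_locale_vars with no arguments in the affected shell would flag
a value like de_DE.ISO-8859-1 if only de_DE.ISO8859-1 exists on disk.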

-- 
| Jeremy Chadwick   jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: can audio CDs be played with ATA_CAM ?

2011-10-25 Thread Jeremy Chadwick
On Tue, Oct 25, 2011 at 01:18:47PM +0200, Claude Buisson wrote:
 On 10/25/2011 12:52, Daniel O'Connor wrote:
 
 On 25/10/2011, at 20:45, Claude Buisson wrote:
 When upgrading a system to 8.2-STABLE, I switched my kernel from atapicam to
 ATA_CAM, and found that vlc could not play audio CDs anymore. Reverting to
 atapicam (and reverting from cdN to acdN of course), vlc was OK again.
 
 It seems that I am not the only one having this kind of problem, as I found
 (for example) this message on questions@ (for releng9):
 
 http://lists.freebsd.org/pipermail/freebsd-questions/2011-October/234737.html
 
 Is this a known problem ? Is somebody working on it ?
 
 Have you tried pointing VLC at /dev/cd0 when using ATA_CAM?
 
 
 Of course yes ! (I even configured WITH_CDROM_DEVICE=/dev/cd1 when building
 VLC)
 
 
 It may be trying old style ATA ioctls based on the device name.
 
 
 VLC recognizes the tracks and jumps quickly from one to the next, without
 playing them, and with a stream of messages:
 
 [0x2caf2a3c] cdda access error: Could not set block size
 [0x2caf2a3c] cdda access error: cannot read sector n
 
 where the sector number is incremented, and then emits (2 times if I remember):
 
 [0x2af28bc] es demux error: cannot peek
 
 Sorry for having omitted these messages in the previous mail.
 
 I found a PR 161760 about cdparanoia needing to be patched for 9.0 with CAM, a
 proposal by avg@ related to libxine:
 
 http://lists.freebsd.org/pipermail/freebsd-multimedia/2010-December/011414.html
 
 These may not be the same problem, but I think they are related (a not so well
 documented change in the kernel interface).

You want atapicam(4).  This is not the same thing as options ATA_CAM.
See /sys/conf/NOTES.

Whether or not it works with audio CDs is unknown to me.

-- 
| Jeremy Chadwick   jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: avl_find() panic

2011-07-05 Thread Jeremy Chadwick
On Wed, Jul 06, 2011 at 12:21:55AM +, John wrote:
I have a system that panic'd this morning, 4 day old current
 (2011-07-01_11.45pm). Message typed in from the console immediately
 after reboot. OS on ufs, data volumes on zfs.
 
 ZFS filesystem version 5
 ZFS storage pool version 28
 panic: avl_find() succedded inside avl_find()
 
Unfortunately, I don't have a traceback for this.
 
The comment in avl.c makes it seem like the avl code is enforcing
 uniqueness in calling code, esp. where it talks about kernel vs
 userland.
 
I'll followup with more info if this replicates.

Cross-posting is generally shunned, but since this is a current thing,
adding freebsd-current to the CC list.

-- 
| Jeremy Chadwick   jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: snd_hda : sometimes sound sometimes not

2011-05-28 Thread Jeremy Chadwick
On Sat, May 28, 2011 at 03:30:26PM +0200, David Demelier wrote:
 On 12/05/2011 08:47, David Demelier wrote:
 Hello,
 
 I don't know if there have been a lot of changes in the snd_hda driver in the
 -STABLE branch, but since I upgraded to it sometimes I have sound and
 sometimes not.
 
 The mixer settings are exactly the same when this occurs. This happened
 this morning. After booting I did not have any sound. I rebooted and
 suddenly I had sound again...
 
 I only tweak snd_hda(4) for a pin sense on the front panel (it has no
 sound either)
 
 So I added in /boot/device.hints :
 hint.hdac.1.cad0.nid27.config="as=1 seq=15"
 
 And here are both dmesg outputs, ok.txt when there is sound and nok.txt when
 there isn't; as you can see, there is no difference related to the hda
 driver.
 
 http://markand.malikania.fr/ok.txt
 http://markand.malikania.fr/nok.txt
 
 I'm guessing at something. My laptop has a mute shortcut; if I press it at
 the BIOS stage I will not have sound either, so is it possible that my
 chipset is being muted by something?
 
 Cheers,
 
 
 Sorry to cross-post again, but I just wanted to tell you that the
 problem disappeared in -CURRENT, so now I just hope the unknown fix for the
 bogus code will be MFC'd before 8.3-RELEASE

Unless someone can chime in with details of the commits which changed
this, assuming the magic change will be MFC'd is a bad idea.  It's safe to
say that if this problem haunts you again when 8.3-RELEASE comes out,
you will be mailing the list about it, and this cycle will continue
until 9.0-RELEASE comes out.

Does any developer/committer have familiarity with this issue and have
some ideas as to what may have changed in CURRENT that addresses David's
issue?  And if so, can that code be MFC'd safely or patches provided to
David for RELENG_8 that he can try out?

I'm CC'ing mav@ here (snd_hda(4) says he's one of the authors), although
he may not have any knowledge of the code which may need to be MFC'd.
He may be able to point us to who has a better idea though.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!)

2011-03-06 Thread Jeremy Chadwick
On Sun, Mar 06, 2011 at 09:43:34AM -0500, Steve Wills wrote:
 
 On 03/06/11 08:35, Steve Wills wrote:
  On 03/06/11 04:22, Edward Tomasz Napierała wrote:
  Message written by Steve Wills on 2011-03-06, at 05:11:
  
  [..]
  
  Thanks for your work on this, I'm very happy to have ZFS v28. I just
  updated my -CURRENT system from a snapshot from about a month ago to
  code from today. I have 3 pools and one of them is for ports tinderbox.
  I only upgraded that pool. When I try to build something using
  tinderbox, I get this error:
 
  cp: failed to set acl entries for
  /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD/buildscript: Operation not
  supported
  
  What does mount show?
  
  /dev/md4 12186190 332724 11853466 3%
  /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD
  
  Sorry, I forgot about the mdmfs hacks I had in my local tinderd. Without
  them, it works fine. So the problem seems to be in mfs rather than zfs.
 
 I should have said mdmfs, but all that's doing is running mdconfig and
 newfs for me. I've reproduced the issue without mdmfs:
 
 % mdconfig -a -t swap -s 12G -u 4
 % newfs -m 0 -o time /dev/md4
 [...]
 % mount /dev/md4 /tmp/foobar
 % cp -p /usr/local/tinderbox/scripts/lib/buildscript /tmp/foobar
 cp: failed to set acl entries for /tmp/foobar/buildscript: Operation not
 supported
 
 Without -p it works fine. FWIW:
 
 % getfacl /usr/local/tinderbox/scripts/lib/buildscript
 # file: /usr/local/tinderbox/scripts/lib/buildscript
 # owner: root
 # group: wheel
 owner@:--:--:deny
 owner@:rwxp---A-W-Co-:--:allow
 group@:-w-p--:--:deny
 group@:r-x---:--:allow
  everyone@:-w-p---A-W-Co-:--:deny
  everyone@:r-x---a-R-c--s:--:allow
 
 Any suggestions on where the problem could be?

At first glance it looks like acl_set_fd_np(3) isn't working on an
md-backed filesystem; specifically, it's returning EOPNOTSUPP.  You
should be able to reproduce the problem by doing a setfacl on something
in /tmp/foobar.

Looking through src/bin/cp/utils.c, this is the code:

    if (acl_set_fd_np(dest_fd, acl, acl_type) < 0) {
        warn("failed to set acl entries for %s", to.p_path);
        acl_free(acl);
        return (1);
    }

EOPNOTSUPP for acl_set_fd_np(3) is defined as:

 [EOPNOTSUPP]   The file system does not support ACL retrieval.

This would be referring to the destination filesystem.

Looking through the md(4) source for references to EOPNOTSUPP, we do
find some references:

$ egrep -n -r 'EOPNOTSUPP|ENOTSUP' /usr/src/sys/dev/md
/usr/src/sys/dev/md/md.c:423:   return (EOPNOTSUPP);
/usr/src/sys/dev/md/md.c:475:   error = EOPNOTSUPP;
/usr/src/sys/dev/md/md.c:523:   return (EOPNOTSUPP);
/usr/src/sys/dev/md/md.c:601:   return (EOPNOTSUPP);
/usr/src/sys/dev/md/md.c:731:   error = EOPNOTSUPP;

Line 423 is within mdstart_malloc(), and it returns EOPNOTSUPP on any
BIO operation other than READ/WRITE/DELETE.  Line 475 is a continuation
of that.

Line 508 is within mdstart_vnode(), behaving effectively the same as
line 423.  Line 601 is within mdstart_swap(), behaving effectively the
same as line 423.

Line 731 is within md_kthread(), and indicates only BIO operation
BIO_GETATTR is supported.  This would not be an ACL attribute thing,
but rather getting attributes of the backing device itself.  The code
hints at that:

 722 if (bp->bio_cmd == BIO_GETATTR) {
 723     if ((sc->fwsectors && sc->fwheads &&
 724         (g_handleattr_int(bp, "GEOM::fwsectors",
 725         sc->fwsectors) ||
 726         g_handleattr_int(bp, "GEOM::fwheads",
 727         sc->fwheads))) ||
 728         g_handleattr_int(bp, "GEOM::candelete", 1))
 729         error = -1;
 730     else
 731         error = EOPNOTSUPP;
 732 } else {

This leaves me with some ideas; just tossing them out here...

1. Maybe/somehow this is caused by swap being used as the backing
   type/store for md(4)?  Try using mdconfig -t malloc -o reserve
   instead, temporarily anyway.

2. Are you absolutely 100% sure the kernel you're using was built
   with options UFS_ACL defined in it?  Doing a strings -a
   /boot/kernel/kernel | grep UFS_ACL should suffice.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP 4BD6C0CB |


Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!)

2011-03-06 Thread Jeremy Chadwick
On Sun, Mar 06, 2011 at 11:06:09AM -0500, Steve Wills wrote:
 On 03/06/11 10:37, Jeremy Chadwick wrote:
  
  At first glance it looks like acl_set_fd_np(3) isn't working on an
  md-backed filesystem; specifically, it's returning EOPNOTSUPP.  You
  should be able to reproduce the problem by doing a setfacl on something
  in /tmp/foobar.
  
  Looking through src/bin/cp/utils.c, this is the code:
  
      if (acl_set_fd_np(dest_fd, acl, acl_type) < 0) {
          warn("failed to set acl entries for %s", to.p_path);
          acl_free(acl);
          return (1);
      }
  
  EOPNOTSUPP for acl_set_fd_np(3) is defined as:
  
   [EOPNOTSUPP]   The file system does not support ACL retrieval.
  
  This would be referring to the destination filesystem.
  
  Looking through the md(4) source for references to EOPNOTSUPP, we do
  find some references:
  
  $ egrep -n -r 'EOPNOTSUPP|ENOTSUP' /usr/src/sys/dev/md
  /usr/src/sys/dev/md/md.c:423:   return (EOPNOTSUPP);
  /usr/src/sys/dev/md/md.c:475:   error = EOPNOTSUPP;
  /usr/src/sys/dev/md/md.c:523:   return (EOPNOTSUPP);
  /usr/src/sys/dev/md/md.c:601:   return (EOPNOTSUPP);
  /usr/src/sys/dev/md/md.c:731:   error = EOPNOTSUPP;
  
  Line 423 is within mdstart_malloc(), and it returns EOPNOTSUPP on any
  BIO operation other than READ/WRITE/DELETE.  Line 475 is a continuation
  of that.
  
  Line 508 is within mdstart_vnode(), behaving effectively the same as
  line 423.  Line 601 is within mdstart_swap(), behaving effectively the
  same as line 423.
  
  Line 731 is within md_kthread(), and indicates only BIO operation
  BIO_GETATTR is supported.  This would not be an ACL attribute thing,
  but rather getting attributes of the backing device itself.  The code
  hints at that:
  
   722 if (bp->bio_cmd == BIO_GETATTR) {
   723     if ((sc->fwsectors && sc->fwheads &&
   724         (g_handleattr_int(bp, "GEOM::fwsectors",
   725         sc->fwsectors) ||
   726         g_handleattr_int(bp, "GEOM::fwheads",
   727         sc->fwheads))) ||
   728         g_handleattr_int(bp, "GEOM::candelete", 1))
   729         error = -1;
   730     else
   731         error = EOPNOTSUPP;
   732 } else {
 
 Thanks for the investigation! So this seems to be a bug in md? That's
 too bad, I was enjoying using it to make my tinderbox builds faster.

Sorry, I should have been more clear -- my investigation wasn't to
determine if the issue you're reporting was a bug or not, but more along
the lines of "hmm, where is userland getting EOPNOTSUPP from in the
kernel in this situation?"  It could be that some piece hasn't been
implemented somewhere yet (more an "incomplete" than a bug :-) ).

I tend to trace source the way I did above in hopes that someone (kernel
dev, etc.) will chime in and go "Oh, yes, THAT... let me tell you about
that!"  It's also for educational purposes; I figure sharing the innards
along with some simple descriptions might help people feel more
comfortable (vs. thinking everything is a black box; don't let the magic
smoke out!).  Sometimes digging through the code helps.

  This leaves me with some ideas; just tossing them out here...
  
  1. Maybe/somehow this is caused by swap being used as the backing
 type/store for md(4)?  Try using mdconfig -t malloc -o reserve
 instead, temporarily anyway.
 
 Seems to be the same.

I'm not too surprised, but at least that rules out swap vs.
non-block-device stuff being somehow responsible.

I'm not a user of ACLs myself, but Robert Watson might know what's up
with this, or where to go looking.  I've CC'd him here.

  2. Are you absolutely 100% sure the kernel you're using was built
 with options UFS_ACL defined in it?  Doing a strings -a
 /boot/kernel/kernel | grep UFS_ACL should suffice.
  
 
 Yep, it does:
 
 % strings -a /boot/kernel/kernel | grep UFS_ACL
 options UFS_ACL
 
 (My kernel config is just include GENERIC then a bunch of nooptions
 for KDB, DDB, GDB, INVARIANTS, WITNESS, etc.)

Cool, good to rule out the obvious.  Thanks.

The only other thing I can think of off the top of my head would be to
ktrace -t+ -i the cp -p, then provide output of kdump -s -t+ after.
I wouldn't say go about this quite yet (it may not even help determine
what's going on); maybe wait for Robert to take a look first.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP 4BD6C0CB |


Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!)

2011-03-06 Thread Jeremy Chadwick
On Sun, Mar 06, 2011 at 08:23:42AM -0800, Jeremy Chadwick wrote:
 On Sun, Mar 06, 2011 at 11:06:09AM -0500, Steve Wills wrote:
  On 03/06/11 10:37, Jeremy Chadwick wrote:
   
   At first glance it looks like acl_set_fd_np(3) isn't working on an
   md-backed filesystem; specifically, it's returning EOPNOTSUPP.  You
   should be able to reproduce the problem by doing a setfacl on something
   in /tmp/foobar.
   
   Looking through src/bin/cp/utils.c, this is the code:
   
       if (acl_set_fd_np(dest_fd, acl, acl_type) < 0) {
           warn("failed to set acl entries for %s", to.p_path);
           acl_free(acl);
           return (1);
       }
   
   EOPNOTSUPP for acl_set_fd_np(3) is defined as:
   
[EOPNOTSUPP]   The file system does not support ACL retrieval.
   
   This would be referring to the destination filesystem.
   
   Looking through the md(4) source for references to EOPNOTSUPP, we do
   find some references:
   
   $ egrep -n -r 'EOPNOTSUPP|ENOTSUP' /usr/src/sys/dev/md
   /usr/src/sys/dev/md/md.c:423:   return (EOPNOTSUPP);
   /usr/src/sys/dev/md/md.c:475:   error = EOPNOTSUPP;
   /usr/src/sys/dev/md/md.c:523:   return (EOPNOTSUPP);
   /usr/src/sys/dev/md/md.c:601:   return (EOPNOTSUPP);
   /usr/src/sys/dev/md/md.c:731:   error = EOPNOTSUPP;
   
   Line 423 is within mdstart_malloc(), and it returns EOPNOTSUPP on any
   BIO operation other than READ/WRITE/DELETE.  Line 475 is a continuation
   of that.
   
   Line 508 is within mdstart_vnode(), behaving effectively the same as
   line 423.  Line 601 is within mdstart_swap(), behaving effectively the
   same as line 423.
   
   Line 731 is within md_kthread(), and indicates only BIO operation
   BIO_GETATTR is supported.  This would not be an ACL attribute thing,
   but rather getting attributes of the backing device itself.  The code
   hints at that:
   
    722 if (bp->bio_cmd == BIO_GETATTR) {
    723     if ((sc->fwsectors && sc->fwheads &&
    724         (g_handleattr_int(bp, "GEOM::fwsectors",
    725         sc->fwsectors) ||
    726         g_handleattr_int(bp, "GEOM::fwheads",
    727         sc->fwheads))) ||
    728         g_handleattr_int(bp, "GEOM::candelete", 1))
    729         error = -1;
    730     else
    731         error = EOPNOTSUPP;
    732 } else {
  
  Thanks for the investigation! So this seems to be a bug in md? That's
  too bad, I was enjoying using it to make my tinderbox builds faster.
 
 Sorry, I should have been more clear -- my investigation wasn't to
 determine if the issue you're reporting was a bug or not, but more along
 the lines of "hmm, where is userland getting EOPNOTSUPP from in the
 kernel in this situation?"  It could be that some piece hasn't been
 implemented somewhere yet (more an "incomplete" than a bug :-) ).
 
 I tend to trace source the way I did above in hopes that someone (kernel
 dev, etc.) will chime in and go "Oh, yes, THAT... let me tell you about
 that!"  It's also for educational purposes; I figure sharing the innards
 along with some simple descriptions might help people feel more
 comfortable (vs. thinking everything is a black box; don't let the magic
 smoke out!).  Sometimes digging through the code helps.
 
   This leaves me with some ideas; just tossing them out here...
   
   1. Maybe/somehow this is caused by swap being used as the backing
  type/store for md(4)?  Try using mdconfig -t malloc -o reserve
  instead, temporarily anyway.
  
  Seems to be the same.
 
 I'm not too surprised, but at least that rules out swap vs.
 non-block-device stuff being somehow responsible.
 
 I'm not a user of ACLs myself, but Robert Watson might know what's up
 with this, or where to go looking.  I've CC'd him here.
 
   2. Are you absolutely 100% sure the kernel you're using was built
  with options UFS_ACL defined in it?  Doing a strings -a
  /boot/kernel/kernel | grep UFS_ACL should suffice.
   
  
  Yep, it does:
  
  % strings -a /boot/kernel/kernel | grep UFS_ACL
  options UFS_ACL
  
  (My kernel config is just include GENERIC then a bunch of nooptions
  for KDB, DDB, GDB, INVARIANTS, WITNESS, etc.)
 
 Cool, good to rule out the obvious.  Thanks.
 
 The only other thing I can think of off the top of my head would be to
 ktrace -t+ -i the cp -p, then provide output of kdump -s -t+ after.
 I wouldn't say go about this quite yet (it may not even help determine
 what's going on); maybe wait for Robert to take a look first.

It would help if I actually added Robert to the CC list, wouldn't it?
:-)

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com

Re: HEADS UP: ZFSv28 is in!

2011-02-28 Thread Jeremy Chadwick
On Sun, Feb 27, 2011 at 09:29:57PM +0100, Pawel Jakub Dawidek wrote:
 I just committed ZFSv28 to HEAD.

Thank you so much for this effort!  I look forward to trying this once
it's MFC'd to RELENG_8 in the near future.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: About panic: bufwrite: buffer is not busy???

2011-02-20 Thread Jeremy Chadwick
On Sun, Feb 20, 2011 at 10:30:52AM -0500, Mike Tancsa wrote:
 On 2/20/2011 9:33 AM, Andrey Smagin wrote:
  On a week-old -CURRENT I have the same problem; my box panicked every 2-15
  minutes. I resolved the problem by unplugging the network cables from the
  two Intel em (82574L) cards. At first I thought the panic was mpd5-related,
  but mpd5 works with another interface, the re NIC integrated on the
  motherboard. I think it may be an em-related panic, or em+mpd5.
 
 The latest panic I saw didn't have anything to do with em.  Are you sure
 your crashes are because of the NIC driver?

Not to mention, the error string the OP provided (see Subject) is only
contained in one file: sys/ufs/ffs/ffs_vfsops.c, function
ffs_bufwrite().  So, that would be some kind of weird filesystem-related
issue, not NIC-specific.  I have no idea how to debug said problem.
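
For anyone who wants to repeat that kind of lookup, a minimal sketch follows.
It assumes the kernel sources are installed under /usr/src/sys; the function
name find_string_in_tree is made up for this example.

```shell
#!/bin/sh
# find_string_in_tree STRING [DIR]
# Lists the files under DIR that contain STRING.  -F makes grep treat
# STRING literally, so ':' or '(' in a panic message need no escaping.
# DIR defaults to /usr/src/sys (assumption: sources are installed there).
find_string_in_tree() {
    grep -rlF -- "$1" "${2:-/usr/src/sys}"
}
```

Usage would be something like: find_string_in_tree 'buffer is not busy'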

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: TTY task group scheduling

2010-11-19 Thread Jeremy Chadwick
On Fri, Nov 19, 2010 at 02:18:52PM +, Vincent Hoffman wrote:
 On 19/11/2010 12:42, Eric Masson wrote:
  Bruce Cran br...@cran.org.uk writes:
 
  Hello,
 
  Google suggests that the work was a GSoC project in 2005 on a pluggable
  disk scheduler.
  It seems that something similar has found its way in DFlyBSD, dsched.
 And indeed to FreeBSD, man gsched. Added sometime round April
 http://svn.freebsd.org/viewvc/base/head/sys/geom/sched/README?view=log

It's been pointed out on the list a couple times, and I've sent mail to
the authors about this, that gsched breaks (very, very badly) things
like sysinstall, and does other strange things like leaving trailing
periods at the end of its ".sched." labels.  This appears to be by
design, but I'm still left thinking "?!"  It's hard to discern the technical
innards/workings of GEOM since the documentation is so poor (and reading
the code doesn't help, especially with regards to libgeom).

IMHO, the gsched stuff, as a layer, should probably be moved into
the I/O framework by default, with the functionality *disabled* by
default and tunables to adjust it.  That's just how I feel about it.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: patch for topology detection of Intel CPUs

2010-09-06 Thread Jeremy Chadwick
On Mon, Sep 06, 2010 at 03:17:42PM +0300, Andriy Gapon wrote:
 on 29/08/2010 12:25 Andriy Gapon said the following:
  The patch below is against sources in the FreeBSD tree; it should be applied
  to either sys/amd64/amd64/mp_machdep.c or sys/i386/i386/mp_machdep.c,
  depending on the desired architecture:
  http://people.freebsd.org/~avg/intel-cpu-topo.diff
 
 I see that I am not getting as many testers as I expected, so I am going to
 commit the patch.
 
 You still have a short while to either objectively object to the patch or to
 voluntarily test it :-)

I would gladly assist in testing this, except there doesn't appear to be
an authoritative statement that it will apply to RELENG_8; when I see
WIP, I assume -CURRENT/HEAD only.

Let me know, since all the systems I have are Intel multi-core.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: patch for topology detection of Intel CPUs

2010-09-06 Thread Jeremy Chadwick
On Mon, Sep 06, 2010 at 03:56:01PM +0300, Andriy Gapon wrote:
 on 06/09/2010 15:23 Jeremy Chadwick said the following:
  On Mon, Sep 06, 2010 at 03:17:42PM +0300, Andriy Gapon wrote:
  on 29/08/2010 12:25 Andriy Gapon said the following:
  The patch below is against sources in the FreeBSD tree; it should be applied
  to either sys/amd64/amd64/mp_machdep.c or sys/i386/i386/mp_machdep.c,
  depending on the desired architecture:
  http://people.freebsd.org/~avg/intel-cpu-topo.diff
 
  I see that I am not getting as many testers as I expected, so I am going
  to commit the patch.
 
  You still have a short while to either objectively object to the patch or
  to voluntarily test it :-)
  
  I would gladly assist in testing this, except there doesn't appear to be
  an authoritative statement that it will apply to RELENG_8; when I see
  WIP, I assume -CURRENT/HEAD only.
 
 patch -C is much better than any statement :)
 
  Let me know, since all the systems I have are Intel multi-core.
 
 Yes, the patch should be applicable to stable/8 without any issues.

Great, thanks!  I'll be testing this out on two separate systems, both
RELENG_8:

- Supermicro X7SBA + Intel C2D E8400 (stepping 10)
- Supermicro X7SBL-LN2 + Intel C2D E6600 (stepping 6)

I'll make sure to provide what the topology looks like before and after.
Is CPU-relevant dmesg output sufficient?

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: patch for topology detection of Intel CPUs

2010-09-06 Thread Jeremy Chadwick
On Mon, Sep 06, 2010 at 04:28:02PM +0300, Andriy Gapon wrote:
 on 06/09/2010 16:12 Jeremy Chadwick said the following:
  Great, thanks!  I'll be testing this out on two separate systems, both
  RELENG_8:
  
  - Supermicro X7SBA + Intel C2D E8400 (stepping 10)
  - Supermicro X7SBL-LN2 + Intel C2D E6600 (stepping 6)
  
  I'll make sure to provide what the topology looks like before and after.
  Is CPU-relevant dmesg output sufficient?
 
 If you mean something like the below, then yes.  Thanks!
 [...]

All done.  Good news (I think): there's no difference in the CPU-related
topology on either system with your patch, aside from kernel build date.
The topologies are still detected correctly.  In case you want them:

Supermicro X7SBA
Intel C2D E8400 (stepping 10)
===
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.1-STABLE #0: Mon Sep  6 09:06:52 PDT 2010
r...@icarus.home.lan:/usr/obj/usr/src/sys/X7SBA_RELENG_8_amd64 amd64
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Duo CPU E8400  @ 3.00GHz (2992.52-MHz K8-class CPU)
  Origin = GenuineIntel  Id = 0x1067a  Family = 6  Model = 17  Stepping = 10
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x408e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 4294967296 (4096 MB)
avail memory = 4112097280 (3921 MB)
ACPI APIC Table: PTLTD  APIC  
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0 Version 2.0 irqs 0-23 on motherboard
ioapic1 Version 2.0 irqs 24-47 on motherboard
kbd1 at kbdmux0
ichwd module loaded
acpi0: PTLTDXSDT on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x1008-0x100b on acpi0
cpu0: ACPI CPU on acpi0
cpu1: ACPI CPU on acpi0



Supermicro X7SBL-LN2
Intel C2D E6600 (stepping 6)
==
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.1-STABLE #1: Mon Sep  6 07:59:49 PDT 2010
r...@gujoja.home.lan:/usr/obj/usr/src/sys/X7SBL_RELENG_8_amd64 amd64
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 CPU  6600  @ 2.40GHz (2394.01-MHz K8-class CPU)
  Origin = GenuineIntel  Id = 0x6f6  Family = 6  Model = f  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 8589934592 (8192 MB)
avail memory = 8261648384 (7878 MB)
ACPI APIC Table: PTLTD  APIC  
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0 Version 2.0 irqs 0-23 on motherboard
kbd1 at kbdmux0
ichwd module loaded
acpi0: PTLTDXSDT on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x1008-0x100b on acpi0
cpu0: ACPI CPU on acpi0
cpu1: ACPI CPU on acpi0


All other systems I have are C2D and C2Q-based, but I can't easily test
on those given their production roles.  If there's a particular Intel
processor family/model you're interested in, let me know and I can dig
around to see if I have access to one.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: Watchdog resets on 82575

2010-08-10 Thread Jeremy Chadwick
On Tue, Aug 10, 2010 at 10:30:21AM +0100, Steven Hartland wrote:
 Is there an easy way to check which chip is present, as the startup doesn't
 seem to mention it?

Not during start-up, but once the machine is running (including in
single-user), you can do:

pciconf -lvc

And look for device igb0.
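
If the full listing is too noisy, something along these lines can cut it down
to one device.  It is only a sketch: it assumes each device entry starts in
the first column as name@selector and that its detail lines are indented, and
the function name show_pci_device is made up for this example.

```shell
#!/bin/sh
# show_pci_device NAME
# Reads `pciconf -lvc` style output on stdin and prints only the entry
# for NAME: the header line starting with "NAME@" plus the indented
# detail lines that follow it.
show_pci_device() {
    awk -v dev="$1" '
        { c = substr($0, 1, 1) }
        # A line that does not start with a space or tab begins a new entry.
        c != " " && c != "\t" { show = (index($0, dev "@") == 1) }
        show
    '
}
```

Usage would be: pciconf -lvc | show_pci_device igb0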



Re: Watchdog resets on 82575

2010-08-10 Thread Jeremy Chadwick
On Tue, Aug 10, 2010 at 11:23:26AM +0100, Steven Hartland wrote:
 Thanks Jeremy, from that we get:-
 
 i...@pci0:1:0:0: class=0x020000 card=0x060015d9 chip=0x10c98086
 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
class  = network
subclass   = ethernet
cap 01[40] = powerspec 3  supports D0 D3  current D0
cap 05[50] = MSI supports 1 message, 64 bit, vector masks
cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled
cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x4(x4)
 i...@pci0:1:0:1: class=0x020000 card=0x060015d9 chip=0x10c98086
 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
class  = network
subclass   = ethernet
cap 01[40] = powerspec 3  supports D0 D3  current D0
cap 05[50] = MSI supports 1 message, 64 bit, vector masks
cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled
cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x4(x4)
 
 I assume there is a way to convert from the hex values to the human value
 but not sure what it is?

The card and chip identifiers are part of the PCI ID specification.
You can find the human-readable name by examining the source code for
the driver.  Sometimes it's easy to figure out, other times there's a
series of #defines which you have to reverse engineer.

In this case, there are two places with relevant information:

src/sys/dev/e1000/if_igb.c
src/sys/dev/e1000/e1000_hw.h

You have to split the Chip ID into two separate 16-bit portions, so
0x10c9 and 0x8086.

0x8086 is Intel's vendor code.  0x10c9 is the device ID of the
individual NIC/model type.  So:

$ grep -i 0x10c9 *
e1000_hw.h:#define E1000_DEV_ID_82576 0x10C9
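
The split described above is just a shift and a mask.  A minimal sketch
in Python, assuming the chip= value from pciconf is a 32-bit number with
the device ID in the upper 16 bits and the vendor ID in the lower 16
bits, as in this thread.  The function name split_chip_id and the two
lookup tables are illustrative only; the tables hold nothing beyond the
two values already mentioned here.

```python
def split_chip_id(chip):
    """Split a 32-bit pciconf chip= value into (device_id, vendor_id)."""
    device_id = (chip >> 16) & 0xFFFF  # upper 16 bits
    vendor_id = chip & 0xFFFF          # lower 16 bits
    return device_id, vendor_id


# Only the entries confirmed in this thread; not a complete table.
VENDORS = {0x8086: "Intel"}
DEVICES = {0x10C9: "E1000_DEV_ID_82576"}

if __name__ == "__main__":
    device_id, vendor_id = split_chip_id(0x10C98086)
    print(f"vendor=0x{vendor_id:04x} ({VENDORS.get(vendor_id, 'unknown')})")
    print(f"device=0x{device_id:04x} ({DEVICES.get(device_id, 'unknown')})")
```

Anything not in the tables comes back as "unknown", which is the cue to
go grep the driver headers as shown above.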

For Jack: igb_vendor_info_array should really be extended to include
actual ASCII strings for the individual chips/models/codenames.  I'm
sure that's on your todo list somewhere.  I'd be willing to write this
but would need a list of the models (or maybe the Linux driver has them
in comments, etc. and I could go off of that).



Re: Results of BIND RFC

2010-04-02 Thread Jeremy Chadwick
On Fri, Apr 02, 2010 at 09:24:51AM +0000, Poul-Henning Kamp wrote:
 In message 20100402021715.669838e0.s...@freebsd.org, Stanislav Sedov writes:
 On Fri, 02 Apr 2010 08:55:07 +0000
 Poul-Henning Kamp p...@phk.freebsd.dk mentioned:
 
 Sorry, I think I was not clear enough.
 
 Sorry for misunderstanding.
 
 Yes, the case can certainly be made that DNS query tool belongs in the
 base system.

I disagree (so what else is new?)  It should be kept out of the base
system.  KISS:

Doug pulling BIND out of the base system / going ports-only = excellent.

Doug making a separate port for BIND-esque DNS query/maintenance tools =
excellent.

Both of the above can be made into packages.  Vendors who use FreeBSD
can incorporate said package(s) into their build infrastructure.  Folks
who do not have Internet connections (yet for some reason want said DNS
tools) can install the package(s) from CD/DVD/USB.

I want the bikeshed to be black.  :-)


[1]: FreeBSD really needs to move away from the base system as a
concept, as I've ranted about in the past.  Or if it cannot, the base
system needs to start using pkg_* (somehow) to manage its components,
with the src.conf WITHOUT_xxx knobs (where xxx = some software)
removed.  The idea being: I don't need Kerberos; pkg_delete base-krb5.
I also don't need lib32; pkg_delete base-lib32.  Beautiful concept, but
hard to implement because libraries would get yanked out from
underneath binaries that are linked to them.  But you get the idea.
